
Shawn X. Wang (Ed.)
Current Trends in Computer Science and Mechanical Automation
Selected Papers from CSMA2016 - Volume 1


ISBN: 978-3-11-058496-7
e-ISBN: 978-3-11-058497-4

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
For details go to http://creativecommons.org/licenses/by-nc-nd/3.0/.

Library of Congress Cataloging-in-Publication Data


A CIP catalog record for this book has been applied for at the Library of Congress.

© 2017 Shawn X. Wang (Ed.) and chapter contributors


Published by De Gruyter Open Ltd, Warsaw/Berlin
Part of Walter de Gruyter GmbH, Berlin/Boston
The book is published with open access at www.degruyter.com.

www.degruyteropen.com
Cover illustration: © cosmin4000 / iStock.com
Contents

Preface XIII
Introduction of keynote speakers XIV

Part I: Computer Science and Information Technology I

Yan-zhao Chen, Yu-wei Zhang


Research and Development of Upper Limb Rehabilitation Training Robot 1

Xiao-fei CUI, Ya-dong WANG, Guang-ri QUAN, Yong-dong XU


K-mer Similarity: a Rapid Similarity Search Algorithm for Probe Design 9

Hong ZHANG, Ling-ling ZHANG


Research and Prospects of Large-scale Online Education Pattern 21

Lebi Jean Marc Dali, Zhi-guang QIN


New LebiD2 for Cold-Start 33

Li-jun ZHANG, Fei YU, Qing-bing JI


An Efficient Recovery Method of Encrypted Word Document 40

Gao-yang LI, Kai WANG, Yu-kun ZENG, Guang-ri QUAN


A Short Reads Alignment Algorithm Oriented to Massive Data 49

Yan-nan SONG, Shi LIU, Chun-yan ZHANG, Wei JI, Ji QI


Based on the Mobile Terminal and Clearing System for Real-time
Monitoring of the AD Exposure 58

Li-ming LIN, Guang-cao LIU, Yan WANG, Wei LU


Star-shaped SPARQL Query Optimization on Column-family Overlapping
Storage 67

Zhen-yu LV, Xu ZHANG, Wei MING, Peng LI


Assembly Variation Analysis based on Deviation Matrix 74
Yan XU, Rui CHANG, Ya-fei WANG
Design and Realization of Undergraduate Teaching Workload Calculation System
Based on LabVIEW 84

Yan SU, Yi-Shu ZHONG


Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive
Model 94

Bi-kuan YANG, Guang-ming LIU


Research on the Privacy-Preserved Mechanism of Supercomputer Systems 102

Jiu-chuan LIN, Yong-jian WANG, Rong-rong XI, Lei CUI, Zhi-yu HAO
Reliability Evaluation Model for China’s Geolocation Databases 114

Jie TAN, Jian-min PANG, Shuai-bing LU


Using Local Library Function in Binary Translation 123

Kai CHENG, Fei SONG, Shiyin QIN


Type Recognition of Small Size Aircrafts in Remote Sensing Images based on Weight
Optimization of Feature Fusion and Voting Decision of Multiple Classifiers 133

Bo HU, Yu-kun JIN, Wan-jiang GU, Jun LIU, Hua-qin QIN, Chong CHEN, Ying-yu WANG
Research of User Credit Rating Analysis Technology based on CART Algorithm 149

Cong-ping CHEN, Yan-hua RAN, Jie-guang HUANG, Qiong HU, Xiao-yun WANG
Research of the Influence of 3D Printing Speed on Printing Dimension 157

Fan ZHANG, Fan YANG


Research on Docker-based Message Platform in IoT 164

Yan-xia YANG
Research on the Recommendation of Micro-blog Network Advertisement based
on Hybrid Recommendation Algorithm 171

Zhong GUO, Nan LI


Fuzz Testing based on Sulley Framework 181
Jun ZHANG, Shuang ZHANG, Jian LIANG, Bei TIAN, Zan HOU, Bao-zhu LIU
A Risk Assessment Strategy of Distribution Network Based on Random
Set Theory 188

Rui LIU, Bo-wen SUN, Bin TIAN, Qi LI


A Software Homology Detection based on BP Neural Network 199

Ming LIU, Hao-yuan DU, Yue-jin ZHAO, Li-quan DONG, Mei HUI
Image Small Target Detection based on Deep Learning with SNR Controlled
Sample Generation 211

Yi-xin ZHANG, Wen-sheng SUN


Agricultural Product Price Forecast based on Short-term Time Series Analysis
Techniques 221

Xiao-lin ZHAO, Jing-feng XUE, Qi ZHANG, Zi-yang WANG


The Correction of Software Network Model based on Node Copy 234

Wei-ping WANG, Fang LIU


A Linguistic Multi-criteria Decision Making Method based on the Attribute
Discrimination Ability 254

Part II: Computer Science and Information Technology II

Gui-fu YANG, Xiao-yu XU, Wei-shuo LIU, Cheng-lin PU, Lu YAO, Jing-bo ZHANG,
Zhen-bang LIU
The Reverse Position Fingerprint Recognition Algorithm 263

Xiao-Lin XU
Global Mean Square Exponential Stability of Memristor-Based Stochastic
Neural Networks with Time-Varying Delays 270

Zhen-hong XIE
Research on Classified Check of Metadata of Digital Image based on Fuzzy String
Matching 280
Shao-nan DUAN, Yan-jie NIU, Yao SHI, Xiao-dong MU
Quantitative Analysis of C2 Organization Collaborative Performance based
on System Dynamics 286

Kuai-kuai ZHOU, Zheng CHEN


Comprehensive Evaluation and Countermeasures of Rural Information Service
System Construction in Hengyang 297

Wen-qing HUANG, Qiang WU


Image Retrieval Algorithm based on Convolutional Neural Network 304

Bing ZHOU, Juan DENG


A Cross-domain Optimal Path Computation 315

Feng LIU, Huan LI, Zhu-juan MA, Er-zhou ZHU


Collaborative Filtering Recommendation Algorithm based on Item Similarity
Learning 322

Huo-wen JIANG, Hai-ying MA, Xin-ai XU


The Graph Merge-Clustering Method based on Link Density 336

Qing-yun QIU, Jun-yong LUO, Mei-juan YIN


Person Name Disambiguation by Distinguishing the Importance of Features based
on Topological Distance 342

Bo HU, Yu-kun JIN, Jun LIU, Ai-jun FAN, Hong-bo MA, Chong CHEN
A Security Technology Solution for Power Interactive Software Based
on WeChat 352

Miao FAN, Jia-min MAO, Jao-gui DING, Wei-feng LI


Two-microphones Speech Separation Using Generalized Gaussian
Mixture Model 362

Zhi-qiang LI, Sai CHEN, Wei ZHU, Han-wu CHEN


A Common Algorithm of Construction a New Quantum Logic Gate for Exact
Minimization of Quantum Circuits 371
Qian WANG, Xiao-guang REN, Li-yang XU, Wen-jing YANG
A Post-Processing Software Tool for the Hybrid Atomistic-Continuum Coupling
Simulation 379

Jun XU, Xiao-yong LI


Achieve High Availability about Failover in Virtual Machine Cluster 392

Kai-peng MAO, Shi-peng XIE, Wen-ze SHAO


Automatic Segmentation of Thorax CT Images with Fully Convolutional
Networks 402

Yong-jie WANG, Yi-bo WANG, Dun-wei DU, Yan-ping BAI


Application of CS-SVM Algorithm based on Principal Component Analysis
in Music Classification 413

Si-wen GUO, Yu ZUO, Tao YAN, Zuo-cai WANG


A New Particle Swarm Optimization Algorithm Using Short-Time Fourier
Transform Filtering 422

Zhen WANG, Hao-peng CHEN, Fei HU


RHOBBS: An Enhanced Hybrid Storage Providing Block Storage for Virtual
Machines 435

Part III: Sensors, Instrument and Measurement I

Shang-yue Zhang, Yu-ming Wang, Zheng-guo Yu


AIS Characteristic Information Preprocessing & Differential Encoding
based on BeiDou Transmission 451

You LI, Xing-shu WANG, Hao XIONG


Modeling of Ship Deformation Measurement based on Single-axis
Rotation INS 460

Mei-ling WANG, Hua SONG, Chun-ling WEI


Research on Fault Diagnosis of Satellite Attitude Control System based
on the Dedicated Observers 470
Ming-hui YAN, Yao-he LIU, Ning GUO, Hua-cheng TANG
Data Advance Based on Industrial 4.0 Manufacturing System 482

Hai-tao ZHAI, Wen-shen MAO, Wen-song LIU, Ya-Di LU, Lu TANG


High Performance PLL based on Nonlinear Phase Frequency Detector and Optimized
Charge Pump 492

Fang-yan LUO
Accelerated ICP based on linear extrapolation 500

Li-ran PEI, Ping-ping JIANG, Guo-zheng YAN


Studies of falls detection algorithm based on support vector machine 507

Ting-ting GUO, Feng QIAO, Ming-zhe LIU, Ai-dong XU, Jun-nan SUN
Research and Development of Indoor Positioning Geographic Information System
based on Web 517

Chun FANG, Man-feng DOU, Bo TAN, Quan-wu LI


Harmonic Distribution Optimization of Surface Rotor Parameters for High-Speed
Brushless DC Motor 528

Sheng-yang GAO, Xian-yang JIANG, Xiang-hong TANG


Vehicle Motion Detection Algorithm based on Novel Convolution
Neural Networks 544

Yong LI, En-de WANG, Zhi-gang DUAN, Hui CAO, Xun-qian LIU


The Bank Line Detection in Field Environment Based on Wavelet
Decomposition 557

Xiao-ming LI, Jia-yue YIN, Hao-jun XU, Chengqiong BI, Li ZHU,


Xiao-dong DENG, Lei-nan MA
Research on Measurement Method of Harmonics Bilateral Energy 566

Long-fei WANG, Wei ZHANG, Xiang-dong CHEN


Traffic Supervision System Using Unmanned Aerial Vehicle based on Image
Recognition Algorithm 573
Yu-fei LI, Ya-yong LIU, Lu-ning XU, Li HAN, Rong SHEN, Kun-quan LU
Improving the Durability of Micro ER Valves for Braille Displays Using an
Elongational Flow Field 585

Jinxia WU, Fei SONG, Shiyin QIN


Aircraft Target Detection in Remote Sensing Images towards Air-to-Ground
Dynamic Imaging 592

Xiao-mei LIU, Shuai ZHU


A New Third-order Explicit Symplectic Scheme for Hamiltonian Systems 609

Qi YANG, Yu-liang QIN, Bin DENG, Hong-qiang WANG


Research on Terahertz Scattering Characteristics of the Precession Cone 620

Chun-yu CHENG, Meng-lin SHENG, Zong-min YU, Wen-xuan ZHANG, An-qi LI,
Kai-yu WANG
Application Development of 3D Gesture Recognition and Tracking Based on the Intel
Real Sense Technology Combining with Unity3D and WPF 630
Preface
The 2nd International Conference on Computer Science and Mechanical Automation carried on the success of last year and received overwhelming support from the research community, as evidenced by the number of high-quality submissions. The conference accepted articles through a rigorous peer review process. We are grateful for the contributions of all the authors. To those whose papers appear in this collection, we thank you for the great effort that makes this conference a success and this volume of the proceedings worth reading. To those whose papers were not accepted, we assure you that your support is very much appreciated. The papers in these proceedings represent a broad spectrum of research topics and reveal some cutting-edge developments.
Chapters 1 and 2 contain articles in the areas of computer science and information
technology. The articles in Chapter 1 focus on algorithm and system development
in big data, data mining, machine learning, cloud computing, security, robotics,
Internet of Things, and computer science education. The articles in Chapter 2 cover
image processing, speech recognition, sound event recognition, music classification,
collaborative learning, e-government, as well as a variety of emerging new areas of
applications. Some of these papers are especially eye-opening and worth reading.
Chapters 3 and 4 contain papers in the areas of sensors, instruments and measurement.
The articles in Chapter 3 cover mostly navigation systems, unmanned air vehicles,
satellites, geographic information systems, and all kinds of sensors that are related
to location, position, and other geographic information. The articles in Chapter 4 are
about sensors and instruments that are used in areas like temperature and humidity
monitoring, medical instruments, biometric sensors, and other sensors for security
applications. Some of these papers are concerned with highly critical systems such as
nuclear environmental monitoring and object tracking for satellite videos.
Chapters 5 and 6 contain papers in the areas of mechatronics and electrical
engineering. The articles in Chapter 5 cover mostly mechanical design for a variety
of equipment, such as space release devices, box girder, shovel loading machines,
suspension cables, grinding and polishing machines, gantry milling machines, clip type
passive manipulator, hot runner systems, water hydraulic pump/motor, and turbofan
engines. The articles in Chapter 6 focus on mechanical and automation devices in
power systems as well as automobiles and motorcycles.
This collection of research papers showcases the incredible accomplishments of
the authors. At the same time, they once again demonstrate that the International Conference
on Computer Science and Mechanical Automation is a highly valuable platform for the
research community to share ideas and knowledge. Organization of an international
conference is a huge endeavor that demands teamwork. We very much appreciate
everyone who is involved in the organization, especially the reviewers. We are looking
forward to another successful conference next year.
Shawn X. Wang
CSMA2016 Conference Chair
Introduction of keynote speakers
Professor Lazim Abdullah
School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu,
Malaysia
Lazim Abdullah is a professor of computational mathematics at the School of
Informatics and Applied Mathematics, Universiti Malaysia Terengganu. He holds a
B.Sc (Hons) in Mathematics from the University of Malaya, Kuala Lumpur (June 1984) and an M.Ed in Mathematics Education from Universiti Sains Malaysia, Penang (1999). He received his Ph.D. in Information Technology Development from Universiti Malaysia Terengganu in 2004. His research focuses on the mathematical theory
of fuzzy sets and its applications to social ecology, environmental sciences, health
sciences, and manufacturing engineering. His research findings have been published
in over two hundred and fifty publications, including refereed journals, conference
proceedings, chapters in books, and research books. Currently, he is the Director of Academic Planning, Development and Quality at his university and a member of
editorial boards of several international journals related to computing and applied
mathematics. He is also a regular reviewer for a number of local and international
impact factor journals, member of scientific committees of several symposia and
conferences at national and international levels. Dr Abdullah is an associate member,
IEEE Computational Intelligence Society, a member of the Malaysian Mathematical
Society and a member of the International Society on Multiple Criteria Decision
Making. 

Professor Jun-hui Hu
State Key Lab of Mechanics and Control of Mechanical Structures, Nanjing University
of Aeronautics and Astronautics, China
Dr. Junhui Hu is a Chang-Jiang Distinguished Professor, China, the director of
Precision Driving Lab at Nanjing University of Aeronautics and Astronautics, and
the deputy director of State Key Laboratory of Mechanics and Control of Mechanical
Structures, China. He received his Ph.D. Degree from Tokyo Institute of Technology,
Tokyo, Japan, in 1997, and B.E. and M.E. degrees in electrical engineering from
Zhejiang University, Hangzhou, China, in 1986 and 1989, respectively. He was an
assistant and associate professor at Nanyang Technological University, Singapore,
from 2001 to 2010. His research interest is in piezoelectric/ultrasonic actuating
technology. He is the author and co-author of about 250 papers and disclosed
patents, including more than 80 full SCI journal papers and one editorial review
for an international journal, and the sole author of monograph book “Ultrasonic
Micro/Nano Manipulations” (2014, World Scientific, Singapore). He is the editorial
board member of two international journals. Dr. Hu won the Paper Prize from the
Institute of Electronics, Information and Communication Engineers (Japan) as the
first author in 1998, and was once awarded the title of valued reviewer by Sensors
and Actuators A: Physical and Ultrasonics. His research work has been highlighted
by 7 international scientific media.

Professor James Daniel Turner


College of Engineering, Aerospace Engineering, Texas A&M University (TAMU), America
Dr. James Daniel Turner has been a research professor in the College of Engineering, Texas A&M University (TAMU) since 2006. In 1974, he received his B.S. degree in Engineering Physics from George Mason University. In 1976, he received his M.E. degree in Engineering Physics from the University of Virginia. He received his Ph.D. in Engineering Science and Mechanics from Virginia Tech in 1980.
He has broad experience in public, private, and academic settings working with advanced engineering and scientific concepts that are developed from first principles; modeled and simulated to understand the limits of performance; developed as hardware prototypes; tested in operationally relevant environments; and transitioned through partnering with industry and government to missions of critical national interest. Dr. James Daniel Turner is engaged in exploratory research whose goal is to transition aerospace analysis tools to bioinformatics. This research consists of applying multibody dynamics to drug design problems in computational chemistry and, most recently, working with the immunology group at the Mayo Clinic to explore the development of generalized predator-prey models for analyzing melanoma in human cell behaviors.

Professor Rong-jong Wai


Department of Electronic and Computer Engineering, National Taiwan University of
Science and Technology, Taiwan
Rong-jong Wai was born in Tainan, Taiwan, in 1974. He received the B.S. degree in
electrical engineering and the Ph.D. degree in electronic engineering from Chung
Yuan Christian University, Chung Li, Taiwan, in 1996 and 1999, respectively. From
August 1998 to July 2015, he was with Yuan Ze University, Chung Li, Taiwan, where
he was the Dean of the General Affairs Office from August 2008 to July 2013, and
the Chairman of the Department of Electrical Engineering from August 2014 to July
2015. Since August 2015, he has been with National Taiwan University of Science and
Technology, Taipei, Taiwan, where he is currently a full Professor, and the Director of
the Energy Technology and Mechatronics Laboratory. He has authored more than 150
conference papers, nearly 170 international journal papers, and 51 inventive patents.
He is a fellow of the Institution of Engineering and Technology (U.K.) and a senior
member of the Institute of Electrical and Electronics Engineers (U.S.A.).

Professor Zhen-guo Gao


Dalian University of Technology, China
Zhen-guo Gao has been a visiting professor at the University of Michigan, Dearborn, with full financial support from the China Scholarship Council, and worked as a visiting professor at the University of Illinois at Urbana-Champaign in 2010. He received
his Ph.D. degree in Computer Architecture from Harbin Institute of Technology,
Harbin, China, in 2006 and then joined Harbin Engineering University, Harbin, China.
His research interests include wireless ad hoc network, cognitive radio network,
network coding, game theory applications in communication networks, etc.
He is a senior member of China Computer Federation. He is serving as a reviewer
for project proposals to National Science Foundation of China, Ministry of Education
of China, Science Foundation of HeiLongJiang Province, China, etc. He is also
serving as a reviewer for some important journals including IEEE Transactions on
Mobile Computing, Wireless Networks and Mobile Computing, Journal of Parallel
and Distributed Computing, Journal of Electronics (Chinese), Journal of Computer
(Chinese), etc.

Professor Steven Guan (Sheng-Uei Guan)


Director, Research Institute of Big Data Analytics, Xi’an Jiaotong-Liverpool University,
China
Steven Guan received his M.Sc. & Ph.D. from the University of North Carolina at
Chapel Hill. Prof. Guan has worked in a prestigious R&D organization for several
years, serving as a design engineer, project leader, and department manager. After
leaving the industry, he joined Yuan-Ze University in Taiwan for three and a half
years. He served as deputy director for the Computing Center and the chairman for
the Department of Information & Communication Technology. Later he joined the
Electrical & Computer Engineering Department at National University of Singapore
as an associate professor. He is currently a professor in the computer science and
software engineering department at Xi’an Jiaotong-Liverpool University (XJTLU).
He created the department from scratch and served as the head for 4.5 years.
Before joining XJTLU, he was a tenured professor and chair in intelligent systems
at Brunel University, UK. Prof. Guan’s research interests include: machine learning,
computational intelligence, e-commerce, modeling, security, networking, and
random number generation. He has published extensively in these areas, with 130
journal papers and 170+ book chapters or conference papers. He has chaired and
delivered keynote speeches for 20+ international conferences and served on 130+
international conference committees and 20+ editorial boards.

General Chair

Prof. Wen-Tsai Sung, National Chin-Yi University of Technology, Taichung, Taiwan


Prof. Shawn X. Wang, Department of Computer Science, California State University,
Fullerton, United States

Editors

Prof. Wen-Tsai Sung, National Chin-Yi University of Technology, Taichung, Taiwan


Prof. Hong-zhi Wang, Department of Computer Science and Technology, Harbin
Institute of Technology, China
Prof. Shawn X. Wang, Department of Computer Science, California State University,
Fullerton, United States

Co-editor

Professor Cheng-Yuan Tang, Huafan University, New Taipei City, Taiwan

Prof. Wen-Tsai Sung, Department of Electrical Engineering, National Chin-Yi University


of Technology, Taichung, Taiwan, songchen@ncut.edu.tw
His main research areas are Electrical Engineering and Wireless Sensor Networks.

Prof. Shawn X. Wang, Department of Computer Science, California State University,


Fullerton, United States, xwang@fullerton.edu
His main research areas are Mathematics, Computer and Information Science.

Prof. Hong-zhi Wang, Department of Computer Science and Technology, Harbin


Institute of Technology, China, whz_hit@qq.com
His main research area is Big Data.

Professor Cheng-Yuan Tang, Department of Information Management, Huafan


University, New Taipei City, Taiwan, cytang@cc.hfu.edu.tw, chengyuantang@
outlook.com. His main research areas are Computer Science and Information
Engineering.

Technical Program Committee 

Prof. Zhihong Qian, Jilin University, Changchun, Jilin, China


Prof. Jibin Zhao, Shenyang Institute of Automation, Chinese Academy of Science, China 
Prof. Lixin Gao, Wenzhou University, China
Prof. Hungchun Chien, Jinwen University of Science and Technology, New Taipei City,
Taiwan
Prof. Huimi Hsu, National Ilan University, Yilan, Taiwan
Prof. Jiannshu Lee, Department of Computer Science and Information Engineering,
National University of Tainan, Tainan, Taiwan
Prof. Chengyuan Tang, Huafan University, New Taipei City, Taiwan

Prof. Mingchun Tang, Chongqing University, Microwave Electromagnetic and


Automation, China
Dr. Jing Chen, Computer School of Wuhan University, China
Dr. Yinghua Zhang, Institute of Automation, Chinese Academy of Sciences, China
Dr. Lingtao Zhang, College of Computer Science and Information Technology, Central
South University of Forestry and Technology, China
Dr. Kaiming Bi, School of Civil and Mechanical Engineering, Curtin University, Perth,
Australia
Dr. Jingyu Yang, Faculty of Aerospace Engineering, Shenyang Aerospace University
Shenyang, China
Dr. Dong Wang, School of Information and Communication Engineering, Dalian
University of Technology, Dalian, China
Dr. Kang An, College of Communications Engineering, PLA University of Science and
Technology, Nanjing, China
Dr. Kaifeng Han, Department of Electrical and Electronic Engineering (EEE), The
University of Hong Kong, Hong Kong
Dr. Sri Yulis Binti M Amin, Universiti Tun Hussein Onn Malaysia, Batu Pahat,
Malaysia
Dr. Longsheng Fu, Northwest A&F University, Yangling, China
Dr. Hui Yang, Beijing University of Posts and Telecommunications, Beijing, China
Dr. T. Bhuvaneswari, Faculty of Engineering and Technology, Multimedia University,
Melaka, Malaysia
Dr. Xiangjie Kong, School of Software, Dalian University of Technology, Dalian, China
Dr. Kai Tao, Nanyang Technological University, Singapore
Dr. Lainchyr Hwang, Dept. of Electrical Engineering, I-Shou University, Kaohsiung,
Taiwan
Dr. Yilun Shang, Department of Mathematics, Shanghai, China
Dr. Thang Trung Nguyen, Ton Duc Thang University, Ho chi Minh, Vietnam
Dr. Chichang Chen, Department of Information Engineering, I-Shou University,
Kaohsiung, Taiwan
Dr. Tomasz Andrysiak, Technology and Science University, Bydgoszcz, Poland
Dr. Rayhwa Wong, Department of Mechanical Eng., Hwa-Hsia University of
Technology, New Taipei City, Taiwan
Dr. Muhammad Naufal Bin Mansor, Kampus Uniciti Alam, Universiti Malaysia Perlis
(UniMAP), Sungai Chuchuh, Malaysia
Dr. Michal Kuciej, Faculty of Mechanical Engineering, Bialystok University of
Technology, Bialystok, Poland
Dr. Imran Memon, Zhejiang university, Hangzhou, China
Dr. Yosheng Lin, National Chi Nan University, Nantou, Taiwan
Dr. Zhiyu Jiang, University of Chinese Academy of Sciences, Beijing, China

Dr. Wanan Xiong, School of Electronic Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, China
Dr. Dandan Ma, University of Chinese Academy of Sciences, Beijing, China
Dr. Chienhung Yeh, Department of Photonics, Feng Chia University, Taichung, Taiwan
Dr. Adam Głowacz, AGH University of Science and Technology, Cracow, Poland
Dr. Osama Ahmed Khan, Lahore University of Management Sciences, Lahore, Pakistan
Dr. Xia Peng, Microsoft, Boston, United States
Dr. Andrzej Glowacz, AGH University of Science and Technology, Cracow, Poland
Dr. Zhuo Liu, Computer Science and Software Engineering Department, Auburn
University Auburn, United States
Dr. Zuraidi Saad, Universiti Teknologi MARA, Shah Alam, Malaysia
Dr. Gopa Sen, Institute for Infocomm Research, Agency for Science Technology and
Research (A*STAR), Singapore
Dr. Minhthai Tran, Ho Chi Minh City University of Foreign Languages and Information
Technology, Ho Chi Minh City, Vietnam
Dr. Fatih Emre Boran, Department of Industrial Engineering, Faculty of Engineering,
Gazi University, Ankara, Turkey
Prof. Serdar Ethem Hamamci, Electrical-Electronics Engineering Department, Inonu
University, Malatya, Turkey
Dr. Fuchien Kao, Da-Yeh University, Zhanghua County, Taiwan
Dr. Noran Azizan Bin Cholan, Faculty of Electrical and Electronics Engineering, Universiti Tun Hussein Onn Malaysia, Batu Pahat, Malaysia
Dr. Krzysztof Gdawiec, Institute of Computer Science, University of Silesia, Sosnowiec,
Poland
Dr. Jianzhou Zhao, Cadence Design System, San Jose, United States
Dr. Malka N. Halgamuge, Department of Electrical & Electronic Engineering,
Melbourne School of Engineering, The University of Melbourne, Melbourne, Australia
Dr. Muhammed Enes Bayrakdar, Department of Computer Engineering, Duzce
University, Duzce, Turkey
Dr. Deepali Vora, Department of Information Technology, Vidyalankar Institute of
Technology, Mumbai, India
Dr. Xu Wang, Advanced Micro Devices (Shanghai), Co. Ltd, Shanghai, China
Dr. Quanyi Liu, School of Aerospace Engineering, Tsinghua University, Beijing, China
Dr. Yiyou Hou, Department of Electronic Engineering, Southern Taiwan University of
Science and Technology, Tainan City, Taiwan
Dr. Ahmet H. ERTAS, Biomedical Engineering Department, Karabuk University,
Karabuk, Turkey
Dr. Hui Li, School of Microelectronics and Solid-State Electronics, University of
Electronic Science and Technology of China, UESTC, China
Dr. Zhiqiang Cao, Institute of Automation, Chinese Academy of Sciences (multi-robot systems, intelligent robot control), China

Dr. Hengkai Zhao, School of Communication and Information Engineering, Shanghai University, China
Dr. Chen Wang, School of Electronic Information and Communications, Huazhong
University of Science and Technology, China
Part I: Computer Science and Information Technology I
Yan-zhao Chen*, Yu-wei Zhang
Research and Development of Upper Limb
Rehabilitation Training Robot
Abstract: This paper studies rehabilitation mechanisms and the inadequacy of traditional clinical rehabilitation means for stroke patients. Related progress and the research status of robot-aided upper limb rehabilitation are discussed. The results show that self-training has become a promising rehabilitation method, sEMG-based upper limb motion recognition is becoming a key interaction technology, and a task-based training mode in an environment combining a robot and VR is the trend of future development.

Keywords: rehabilitation training; robot; hemiplegia patient

1 Introduction

Hemiplegia is usually caused by stroke and other diseases, and it impairs the activities of daily living (ADL) of most patients [1-3]. The patient's daily work and life are seriously affected, which brings a burden on society and family. The incidence of stroke is high, and it has become one of the common causes of human death [4-6]. Clinical studies have indicated that, in the early stage of illness, exercise training is helpful to the recovery of the patients' activities of daily living [7-9]. The commonly used traditional clinical rehabilitation training method is a one-to-one style between patients and rehabilitation therapists. Because the number of patients is large and the number of rehabilitation therapists is limited, the actual implementation of rehabilitation training, its intensity, time, accuracy and other aspects cannot be guaranteed, and so the patients' rehabilitation outcome is poor.
The rehabilitation robot emerged with the development of modern science
and technologies such as robot technology, signal processing, pattern recognition
and clinical technology. Robot aided rehabilitation training can overcome the
shortcomings of traditional clinical rehabilitation methods, facilitate the generation
of new rehabilitation models and ultimately improve rehabilitation [10,11].

*Corresponding author: Yan-zhao Chen, School of Mechanical and Automotive Engineering, Qilu
University of Technology, Jinan, China, chyzh_ql@126.com
Yu-wei Zhang, School of Mechanical and Automotive Engineering, Qilu University of Technology,
Jinan, China, zhangyuwei_scott@126.com

Based on research into the mechanism of stroke rehabilitation and considering the clinical deficiencies of traditional rehabilitation methods, this article analyzes the
current research development of rehabilitation robots used for assisting patients
doing rehabilitation training. The development status of upper limb rehabilitation
robots and the corresponding recovery mode changes are reviewed.

2 Changes in the Way of Rehabilitation and Rehabilitation Mechanism

Stroke is a central nervous system disease. Its causes are generally a sudden hemorrhage or ischemia in the brain, resulting in damage to the cerebral cortex. This affects the formation of control instructions in the central nervous system or blocks the pathways for nerve control instructions, and eventually the patient's movement intent is formed incorrectly or the nerve control instructions cannot be transmitted to the movement terminal to produce movement. The body's motor function, especially of the upper limbs, is impaired. Medical studies have shown that the human nervous system has a certain degree of plasticity [12], as well as the ability to re-learn motor skills [13].
Practice shows that rehabilitation therapy in the early stages of the patient's sickness is more conducive to the recovery of their motor function [7], and the most effective way to promote motor function reconstruction is repeated exercise training [8]. The rehabilitation training methods now commonly used in clinics are traditional, namely the one-to-one style between patients and rehabilitation therapists, and such training has many drawbacks. The first is place restriction: the rehabilitation process is generally executed only in a hospital or rehabilitation center, so it lacks flexibility. Secondly, because of the high incidence of stroke, the number of patients is large while the number of rehabilitation therapists is relatively limited, so many patients cannot get timely and stable treatment, which affects rehabilitation. On the other hand, patient rehabilitation usually requires a long period of repeated training. In this case, the workload of rehabilitation therapists increases and easily leads to fatigue; training errors, inadequate training effort and other phenomena may occur, which adversely affect the patients' rehabilitation. Because the traditional clinical rehabilitation method is not very effective, it is not conducive to the promotion of clinical rehabilitation practice or to research on the rehabilitation mechanisms of stroke patients, so a new, modern rehabilitation method is urgently needed as a complement to traditional methods to promote the development of clinical rehabilitation theory and practice. In addition, clinical studies show that patients' active participation is more helpful in enhancing the rehabilitation effect. In this context, patients' independent rehabilitation, especially in the home environment, becomes a meaningful self-rehabilitation method, and research on self-rehabilitation based on rehabilitation robots has broad application prospects [14,15].

The upper arm plays an important role in people's daily life, so an upper limb rehabilitation robot that assists the upper limb in rehabilitation training, in order to achieve functional reconstruction of movement, is particularly important. An upper limb rehabilitation robot is a mechanical structure used to assist patients with limited upper limb motor function, such as hemiplegic patients, in the rehabilitation of their ability of daily living; it is guided by rehabilitation medicine theory and based on the integration of robotics, human anatomy, computer science and other disciplines and technologies.
Upper limb rehabilitation robots have good fatigue resistance and high controllability, which enables high-precision control and ensures safe and reliable operation. Adopting an upper limb rehabilitation robot for upper limb rehabilitation training can change the doctor-patient relationship, provides a new form of rehabilitation training for patients, and has become a promising field of research and application.

3 Upper Limb Rehabilitation Robot

With the development of science and technology as well as rehabilitation medicine theory, the rehabilitation concepts and methods for stroke patients have changed. The means of upper limb rehabilitation training have transformed from the traditional way to the robot-aided manner, which brings a series of changes ranging from the mechanism design of upper limb robots to the rehabilitation mode.

3.1 The Mechanism Design for Rehabilitation Training

Related research on upper limb rehabilitation robots started early. In the design of the mechanical structure, many scholars in related fields have made efforts, and researchers have designed a variety of rehabilitation training devices, from early simple assisted rehabilitation tools with a single degree of freedom to automated multi-DOF rehabilitation robots. Early rehabilitation training devices include the master-slave hand-object-hand device developed at the University of Pennsylvania in the United States [16], which can assist a patient's hand to do simple movements with mirror training. Shortly after, researchers at Stanford University designed a series of upper limb rehabilitation devices to assist upper limbs in rehabilitation training, known as the Mirror-image Motion Enabler [17]. In recent years, the development of upper limb rehabilitation robots has tended toward more degrees of freedom, intelligence and portability. They are more and more user-friendly, and the wearable style has become a research trend. Arizona State University developed an upper limb rehabilitation training mechanism called the Robotic Upper Extremity Repetitive Therapy device [18]. Researchers at the University of Washington studied wearable neurological rehabilitation exoskeleton
robots, such as the Cable-actuated Dexterous Exoskeleton for Neurorehabilitation [19].

3.2 Interactive Mode

A rehabilitation robot is a mechanical device used to assist patient rehabilitation, and its interaction with the patient is an important aspect of this rehabilitation. Since it is usually one side of a stroke patient's body that has lost voluntary movement functions, guiding the disabled upper limb to move using the upper limb on the healthy side has become a viable rehabilitation training manner and, at the same time, a trend. The general process of this approach is to first identify the movements of the healthy arm, then convert the recognition result into motion control instructions for the rehabilitation robot and drive the robot, so that, finally, the disabled upper limb executes the movement with the aid of the robot in order to achieve rehabilitation. Recognizing the action of the healthy arm thus becomes one of the core technologies.
Since the 1980’s, the motion tracking used for rehabilitation has become a hot
area of research. Motion recognition tracking technologies currently available for
rehabilitation can be summarized into three classes:
First, there are tracking technologies for body motion based on physical sensors. This type of technology mainly adopts various physical sensors for human movement identification and tracking. The physical sensors commonly used include gyroscopes, acceleration sensors, gravity sensors, etc. [20,21]. These can directly obtain physical parameters of upper limb movement, such as posture, freedom of movement, velocity and acceleration. However, when using contact sensors to track body movements, it is necessary to attach the sensors firmly to the human body; owing to the special nature of human physiological structure, the sensors are not easy to mount, and their location and angle are difficult to fix. Moreover, due to the randomness of body movement, such physical sensors are prone to shift, delay, jitter and other issues. Thus, human motion tracking methods based on contact sensors are not well suited to robot-assisted upper limb rehabilitation.
Second, there are tracking technologies for body motion based on non-contact physical sensors. Since tracking technology based on non-contact sensors does not require direct contact with the human body, problems such as offset and installation do not arise. The non-contact physical sensors most commonly used are based on optical devices such as the Kinect [21,22]. Such technology is more convenient; however, in the process of human motion tracking and identification by optical sensors, the body location needs to be stable, which means a lack of flexibility. There are also special requirements for the environment and the light
level of the application area. In addition, body part overlap, occlusion and other issues arise, especially against noisy backgrounds such as the home environment of remote rehabilitation, so the motion recognition results are difficult to bring to the desired level [23]. Moreover, 3D positioning requires high-precision mathematical calculations, which results in delay and other problems, and it is difficult to ensure real-time performance. So this technology is also not well suited to robot-assisted upper limb rehabilitation.
Third, there is motion pattern recognition technology based on physiological signals. In addition to physical sensors, physiological signals have emerged as a tool in the field of human motion tracking, with pattern recognition as the core technology. The physiological signals commonly used are mainly EMG (electromyography) [24] and EEG (electroencephalography) [25]. The EMG signal is the most commonly used. According to the acquisition mode, an EMG signal can be divided into the needle electromyography signal and the surface electromyography signal (sEMG). Signal acquisition by needle electromyography requires inserting needle electrodes into the muscle, which is inconvenient and causes trauma to patients. Signal acquisition by sEMG only requires sticking electrodes onto the skin surface over the corresponding muscles; it is non-invasive, real-time and has a wide collection area. The sEMG is a weak potential-difference signal collected on the skin surface; it is rich in information on body movements and capable of reflecting human movement intent. Therefore, sEMG-based human motion tracking is more suitable for clinical application, and this is why it has received widespread attention and study.
Pattern recognition is usually used to establish the relation between sEMG and upper limb motions. Its process can be described as follows: first, the specified upper limb movement is performed while the sEMG is collected from the corresponding muscle at the skin surface; then a signal pretreatment, which includes filtering and amplification, is performed, followed by feature extraction; finally, a pattern classifier is trained and motion classification is implemented. The motion recognition results can be used as motion control instructions for the upper limb rehabilitation robot.
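The paper does not prescribe particular features or a particular classifier; purely as an illustration of this pipeline, the following Python sketch band-pass filters a raw sEMG channel, extracts a few common time-domain features per analysis window, and trains a generic SVM classifier (the filter band, the feature set and the classifier are assumptions for illustration, not the method of any system reviewed here).

import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.svm import SVC

def preprocess(emg, fs=1000.0, low=20.0, high=450.0):
    """Signal pretreatment: band-pass filter one raw sEMG channel
    (the cut-off frequencies are illustrative assumptions)."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, emg)

def extract_features(window):
    """Feature extraction: common time-domain sEMG features for one analysis
    window (mean absolute value, root mean square, zero-crossing count)."""
    mav = np.mean(np.abs(window))
    rms = np.sqrt(np.mean(window ** 2))
    zc = np.sum(np.diff(np.sign(window)) != 0)
    return np.array([mav, rms, zc])

# Pattern classification: train on labelled windows (X_train holds one feature
# row per window, y_train the performed upper-limb motion), then map a new sEMG
# window to a motion label, which can serve as a robot control command.
# clf = SVC(kernel="rbf").fit(X_train, y_train)
# motion = clf.predict(extract_features(preprocess(raw_window)).reshape(1, -1))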
In summary, although all three techniques can be used to track motion, EMG, and particularly sEMG, is more in line with the special nature of the patient's physiological state and more suitable for clinical application; thus it is becoming a promising direction for rehabilitation medicine and technology research.

3.3 Rehabilitation Mode

The introduction of rehabilitation robots brings changes in the rehabilitation method. The purpose of rehabilitation is to restore the patient's activities. Rehabilitation training supported by a rehabilitation robot can promote the structural recovery of the nervous system, but it is not easy to transfer this restoration to functional
recovery [26]. Under normal circumstances, although the patient's neural pathways are reopened to some extent, the patient is still not able to complete some functional movements independently, such as picking up a cup; this phenomenon is called "learned disuse" [27]. The reason for this phenomenon may be that the previous training is only mechanical training without functional objectives.
Studies have shown that task-based training can induce the transition from structural recovery to functional recovery of the nervous system, and thereby promote the functional rehabilitation of patients and rebuild their activities of daily living (ADL) [1]. Meanwhile, introducing virtual reality technology into the field of rehabilitation medicine and letting patients perform task-based training in realistic virtual scenes can enhance their training initiative, which benefits their functional recovery and is more conducive to the recovery of their activities of daily living. In particular, with technological advances, remote recovery, especially in the home environment, has drawn increasing attention. Virtual-reality-based task training provides support for rehabilitation in this model. Combining upper limb rehabilitation robotics, virtual reality technology and sEMG pattern recognition technology to achieve rehabilitation of patients with hemiplegia has become a viable approach. In 2011, researchers presented an ADL rehabilitation training system supported by the robot ARMin III [28], and designed a variety of tasks in a virtual reality scene for patients' ADL training, such as cooking.

4 Conclusion

Stroke has become one of the major causes of death. The voluntary movement ability of the patient's body, especially the upper limb, is usually impaired, and the activities of daily living are impeded. The traditional clinical rehabilitation method depends on the rehabilitation therapist and the hospital and has many drawbacks and restrictions. With the development of science and technology, robot-assisted rehabilitation research has attracted great enthusiasm. The mechanism design has evolved from early single-degree-of-freedom devices with simple functions to multi-degree-of-freedom, portable, automated and intelligent devices. In the aspect of interaction mode, sEMG has emerged thanks to its various advantages. The rehabilitation method has changed from a traditional mode with fixed time, fixed location and fixed form to remote recovery, family rehabilitation and other diversified methods. With the aid of a robot in a virtual reality environment, a task-based rehabilitation method is promising for future development.

Acknowledgment: The authors are thankful to the Higher Educational Science and Technology Program of Shandong Province, China (No. J15LB01) and the Natural Science Foundation of Shandong Province, China (ZR2014EEQ029, ZR2015FM021) for their support in carrying out this research.

References
[1] Trotti, C., Menegoni, F., Baudo, S., Bigoni, M., Galli, M., and Mauro, A.: “Virtual reality for the
upper limb rehabilitation in stroke: A case report”, Gait & Posture, 2009, 30, Supplement
1, (0), pp. S42.
[2] La, C., Young, B.M., Garcia-Ramos, C., Nair, V.A., and Prabhakaran, V.: “Chapter Twenty -
Characterizing Recovery of the Human Brain following Stroke: Evidence from fMRI Studies”,
in Seeman, P., and Madras, B. (Eds.): “Imaging of the Human Brain in Health and Disease”
(Academic Press, 2014), pp. 485-506.
[3] Sumida, M., Fujimoto, M., Tokuhiro, A., Tominaga, T., Magara, A., and Uchida, R.: "Early rehabilitation effect for traumatic spinal cord injury", Archives of physical medicine and rehabilitation,
2001, 82, (3), pp. 391-395.
[4] Kwakkel, G., Kollen, B.J., and Krebs, H.I.: “Effects of robot-assisted therapy on upper limb
recovery after stroke: A systematic review”, Neurorehabil. Neural Repair, 2008, 22, (2), pp.
111-121.
[5] Prange, G.B., Jannink, M.J., Groothuis-Oudshoorn, C.G., Hermens, H.J., and Ijzerman, M.J.:
“Systematic review of the effect of robot-aided therapy on recovery of the hemiparetic arm after
stroke”, Journal of rehabilitation research and development, 2006, 43, (2), pp. 171-184.
[6] gang, S.X., li, W.Y., Zhang, N., te, W., hai, L.Y., Jin, X., juan, L.n., and Feng, J.: “Incidence and
trends of stroke and its subtypes in Changsha, China from 2005 to 2011”, Journal of Clinical
Neuroscience, 2014, 21, (3), pp. 436 – 440.
[7] Kwakkel, G., Kollen, B., and Twisk, J.: “Impact of time on improvement of outcome after stroke”,
Stroke, 2006, 37, (9), pp. 2348 – 2353.
[8] Langhorne, P., Bernhardt, J., and Kwakkel, G.: “Stroke rehabilitation”, The Lancet, 377, (9778),
pp. 1693-1702.
[9] Patton, J., Stoykov, M., Kovic, M., and Mussa-Ivaldi, F.: “Evaluation of robotic training forces
that either enhance or reduce error in chronic hemiparetic stroke survivors”, Exp. Brain Res.,
2006, 168, (3), pp. 368-383.
[10] An-Chih, T., Tsung-Han, H., Jer-Junn, L., and Te, L.T.: “A comparison of upper-limb motion
pattern recognition using EMG signals during dynamic and isometric muscle contractions”,
Biomedical Signal Processing and Control, 2014, 11, pp. 17 - 26.
[11] Morris, J.H., and Wijck, F.V.: “Responses of the less affected arm to bilateral upper limb task
training in early rehabilitation after stroke: a randomized controlled trial”, Archives of physical
medicine and rehabilitation, 2012, 93, (7), pp. 1129 – 1137.
[12] Howell, M.D., and Gottschall, P.E.: “Lectican proteoglycans, their cleaving metalloproteinases,
and plasticity in the central nervous system extracellular microenvironment”, Neuroscience,
2012, 217, (0), pp. 6-18.
[13] Carr, J.H., and Shepherd, R.B.: "A Motor Learning Model for Stroke Rehabilitation", Physiotherapy, 1989, 75, (7), pp. 372-380.
[14] Takahashi, C.D., Der-Yeghiaian, L., Le, V., Motiwala, R.R., and Cramer, S.C.: “Robot-based hand
motor therapy after stroke”, Brain, 2008, 131, pp. 425-437.
[15] Zollo, L., Rossini, L., Bravi, M., Magrone, G., Sterzi, S., and Guglielmelli, E.: “Quantitative
evaluation of upper-limb motor control in robot-aided rehabilitation”, Medical & Biological
Engineering & Computing, 2011, 49, (10), pp. 1131-1144.
[16] Lum, S.P., Reinkensmeyer, D.J., and Lehman, S.L.: “Robotic assist devices for bimanual physical
therapy: preliminary experiments”, IEEE Transactions on Rehabilitation Engineering, 1993, 1,
(3), pp. 185 - 191.

[17] Lum, P.S., Burgar, C.G., and Shor, P.C.: “Evidence for improved muscle activation patterns after
retraining of reaching movements with the MIME robotic system in subjects with post-stroke
hemiparesis”, IEEE Trans. Neural Syst. Rehabil. Eng., 2004, 12, (2), pp. 186 – 194.
[18] Sugar, T.G., ping, H.j., Koeneman, E.J., Koeneman, J.B., Herman, R., H, H., Schultz, R.S.,
Herring, D.E., Wanberg, J., Balasubramanian, S., Swenson, P., and Ward, J.A.: “Design and
Control of RUPERT: A Device for Robotic Upper Extremity Repetitive Therapy”, IEEE Trans. Neural
Syst. Rehabil. Eng., 2007, 15, (3), pp. 336-346.
[19] Perry, J.C., Powell, J.M., and Rosen, J.: “Isotropy of an upper limb exoskeleton and the
kinematics and dynamics of the human arm”, Applied Bionics and Biomechanics, 2009, 6, (2),
pp. 175 – 191.
[20] Pastor, I., Hayes, H.A., and Bamberg, S.J.M.: "A feasibility study of an upper limb rehabilitation system using kinect and computer games" (Institute of Electrical and Electronics Engineers Inc., 2012), pp. 1286-1289.
[21] Chang, C.-Y., Lange, B., Zhang, M., Koenig, S., Requejo, P., Somboon, N., Sawchuk, A.A., and Rizzo, A.A.: "Towards pervasive physical rehabilitation using microsoft kinect" (IEEE Computer Society, 2012), pp. 159-162.
[22] Pogrzeba, L., Wacker, M., and Jung, B.: "Potentials of a low-cost motion analysis system for exergames in rehabilitation and sports medicine" (Springer Verlag, 2012), pp. 125-133.
[23] Sturman, D.J., and Zeltzer, D.: “A survey of glove-based input”, Computer Graphics and
Applications, 1994, 14, (1), pp. 30-39.
[24] Kamavuako, E.N., Scheme, E.J., and Englehart, K.B.: “Combined surface and intramuscular EMG
for improved real-time myoelectric control performance”, Biomedical Signal Processing and
Control, 2014, 10, (3), pp. 102 – 107.
[25] Dhiman, R., Saini, J.S., and Priyanka: “Genetic algorithms tuned expert model for detection of
epileptic seizures from EEG signatures”, Applied Soft Computing, 2014, 19, pp. 8 – 17.
[26] Y, B., Y, H., Y, W., Y, Z., Q, H., C, J., L, S., and W, F.: “A prospective, randomized, single-blinded
trial on the effect of early rehabilitation on daily activities and motor function of patients with
hemorrhagic stroke”, Journal of clinical neuroscience : official journal of the Neurosurgical
Society of Australasia, 2012, 19, (10), pp. 1376-1379.
[27] Peper, E., Harvey, R., and Takabayashi, N.: “Biofeedback an evidence based approach in clinical
practice”, Japanese Journal of Biofeedback Research, 2009, 36, (1), pp. 3-10.
[28] Guidali, M., Duschau-Wicke, A., Broggi, S., Klamroth-Marganska, V., Nef, T., and Riener, R.: “A
robotic system to train activities of daily living in a virtual environment”, Medical and Biological
Engineering and Computing, 2011, pp. 1-11.
Xiao-fei CUI, Ya-dong WANG, Guang-ri QUAN, Yong-dong XU*
K-mer Similarity: a Rapid Similarity Search Algorithm
for Probe Design
Abstract: Actual hybridization is performed in a global identity scenario. However, searching for sequences similar to a given sequence in a large data set is very challenging, and this is especially true for global alignment. A local alignment algorithm such as BLAST, or the semi-global Myers' bit-vector algorithm, is used instead in most cases. We introduce a novel global alignment method in this paper. It computes the same alignment as a certain dynamic programming algorithm, while executing over 60 times faster on appropriate data. Its high accuracy and speed make it a better choice for the alignment step of probe design.

Keywords: probe design; global alignment; fast

1 Introduction

Sequence alignment algorithms are very important to bioinformatics applications. They can be divided into 3 categories, namely global, semi-global, and local [1]. A general global alignment technique is the Needleman-Wunsch algorithm [2], which is based on dynamic programming. It can produce the optimal alignment of two sequences, but its high time complexity makes it inappropriate for the comparison of huge data sets. The most widely used local alignment algorithm is BLAST [3]. Its emphasis on speed makes the algorithm practical on the huge genome databases currently available [4,5]. But it cannot guarantee the optimal alignment of the query and database sequences. Myers' bit-vector algorithm is the fastest semi-global algorithm [6]. However, it has its limitations too.
Sequence alignment is a component of probe design tools. Most probe design tools calculate identities using a local alignment algorithm such as BLAST [7-10]. There is also some software [11] using a semi-global algorithm such as the bit-vector algorithm. However, actual hybridization is performed in a global identity scenario [12]. In this paper we present a new global alignment method to find, in a huge data set, the sequences most similar to the query among those of the same length as the query.

*Corresponding author: Yong-dong XU, School of Computer Science & Technology, Harbin Institute of
Technology at Weihai, Weihai, China, ydxu@insun.hit.edu.cn
Xiao-fei CUI, Ya-dong WANG, Guang-ri QUAN, School of Computer Science & Technology, Harbin
Institute of Technology at Weihai, Weihai, China

2 Method

Our goal is to find all the subsequences in a huge data set whose identity to the query sequence is greater than MI and whose length is the same as that of the query sequence. In order to achieve this goal, we first split each sequence in the data set into L-mer fragments, where L is the length of the query sequence. The comparison steps between the query sequence and each L-mer are described below.

2.1 Preliminaries

Let A = a1a2…an and B = b1b2…bn be DNA sequences. We are given a threshold MI ≥ 0. The problem is to determine whether the identity of the two sequences is greater than MI or not. The identity between A and B is defined as I(A,B) = (the number of matches in the optimal alignment of A and B) / (the sequence length L).

2.2 The Basic Algorithm

Our method has two parts. First, the comparison pairs that cannot be more similar than MI are filtered out with a k-tuple method. Second, for the remaining comparison pairs, a modified greedy algorithm is used to make the further determination. Neither the k-tuple method nor the greedy algorithm is completely new. Our main contribution lies in the joint use of the two algorithms, the estimation of the filtering parameters of the k-tuple algorithm, and the application of this approach to the global alignment problem.

2.2.1 The look-up table

In the first step, the comparison pairs are filtered according to the total number of exact k-tuple matches. Firstly, a lookup table [13] of A is constructed to locate identical k-tuple segments rapidly. Any k-tuple that consists of characters in alpha = {A,C,G,T,a,c,g,t} is converted to an integer between 0 and 4^k, where A/a equals 0, C/c equals 1, G/g equals 2 and T/t equals 3. If a character of the k-tuple is not in alpha, the k-tuple is converted to -1. A k-tuple converted to -1 is not considered in the later calculation. Then, an array C of length 4^k consisting of pointers initially set to nil is used. In a single pass through A, each position i is added to the list that the pointer at C(ic) points to, where ic is the coded form of the k-tuple beginning at position i in A.
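As a concrete illustration of this step (not the authors' implementation; the function and variable names are ours), the following Python sketch codes k-tuples and builds the lookup table, with a dictionary standing in for the array C of length 4^k.

def encode(kmer):
    """Code a k-tuple over alpha = {A,C,G,T,a,c,g,t} as a base-4 integer
    (A/a = 0, C/c = 1, G/g = 2, T/t = 3); return -1 if any character is outside alpha."""
    code_of = {'A': 0, 'C': 1, 'G': 2, 'T': 3, 'a': 0, 'c': 1, 'g': 2, 't': 3}
    code = 0
    for ch in kmer:
        if ch not in code_of:
            return -1
        code = code * 4 + code_of[ch]
    return code

def build_lookup_table(a, k):
    """Build the lookup table of sequence A: for each k-tuple code, the list of start
    positions of that k-tuple in A (a dict stands in for the array C of length 4**k)."""
    table = {}
    for i in range(len(a) - k + 1):
        ic = encode(a[i:i + k])
        if ic >= 0:                      # k-tuples coded as -1 are ignored
            table.setdefault(ic, []).append(i)
    return table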

2.2.2 The filter step


In conjunction with the lookup Table, the number of exact match k-tuple between A and
B match is counted. In a single pass through B, the k-tuple tj beginning at each position
j is coded to jc. If there is an element pos in the list of C(jc) which is constructed from
sequence A that makes |pos - j| < W, let match = match + 1. W is the window size. A and

B are reported as no more similar than MI if match < MinNum. Otherwise, the modified greedy
algorithm is used. The parameter MinNum is discussed later.
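A corresponding sketch of the filter step is given below. It reuses the lookup table and encoding functions from the previous sketch; the window size W and the threshold MinNum are passed in, and their estimation is described in Section 2.3.

def count_ktuple_matches(table, b, k, w):
    # Single pass through B: count k-tuples of B that also occur in A (via `table`)
    # at some position pos with |pos - j| < w.
    match = 0
    for j in range(len(b) - k + 1):
        code = encode_ktuple(b, j, k)
        if code == -1:
            continue
        if any(abs(pos - j) < w for pos in table.get(code, [])):
            match += 1
    return match

def passes_filter(table, b, k, w, min_num):
    # A comparison pair is kept for the greedy step only if match >= MinNum.
    return count_ktuple_matches(table, b, k, w) >= min_num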

2.2.3 The modified greedy algorithm


The greedy algorithm used for the further determination is especially suitable for the
alignment of two highly similar sequences. It is significantly faster than the
traditional dynamic programming algorithm. The original algorithm was first described in [14]. In
the initial algorithm, a user-specified score X is used for pruning. In order to translate
back and forth between an alignment's score and its number of differences, the
alignment scoring parameters are constrained by ind = mis – mat / 2, where mat > 0,
mis < 0 and ind < 0 are the scores for a match, a mismatch and an insertion/deletion. We
instead use the maximum number of differences D for pruning. A difference is defined as an insertion, a
deletion or a mismatch. This is more useful in some cases such as probe design. The
algorithm is shown in Figure 1. R(d, k) is the x-coordinate of the last position on diagonal
k, d is the number of differences of the comparison, L and U are used for pruning, and N is
the length of sequences A and B.
1.  i ← 0
2.  while i < N and a_{i+1} = b_{i+1} do i ← i + 1
3.  R(0,0) ← i
4.  d ← L ← U ← 0
5.  Dhalf ← ⎣D / 2⎦
6.  repeat
7.      d ← d + 1
8.      if L < 1 – Dhalf then
9.          L ← 1 – Dhalf
10.     if U > Dhalf – 1 then
11.         U ← Dhalf – 1
12.     for k ← L – 1 to U + 1 do
13.         i ← max { R(d–1, k–1) + 1, if L < k;  R(d–1, k) + 1, if L ≤ k ≤ U;  R(d–1, k+1), if k < U }
14.         j ← i – k
15.         if i > –∞ then
16.             while i < N, j < N and a_{i+1} = b_{j+1} do
17.                 i ← i + 1; j ← j + 1
18.             R(d,k) ← i
19.         else R(d,k) ← –∞
20.         if k = 0 and R(d,k) = N then
21.             report similar
22.         if d = D and R(d,0) < N then
23.             report not similar
24.     L ← L – 1
25.     U ← U + 1

Figure 1. The modified greedy algorithm.
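For readers who prefer code to pseudocode, the Python sketch below implements the core recurrence of Figure 1, but without the L/U band pruning; it simply reports whether two equal-length sequences can be aligned with at most D differences. It is an illustrative simplification, not the optimized implementation used in our experiments.

def within_d_differences(a, b, d_max):
    # Furthest-reaching greedy check: can equal-length a and b be aligned with
    # at most d_max differences (mismatches, insertions or deletions)?
    n = len(a)
    NEG = float('-inf')
    reach = {}                       # reach[(d, k)]: furthest a-index on diagonal k with d differences
    i = 0
    while i < n and a[i] == b[i]:    # initial run of matches on diagonal 0
        i += 1
    reach[(0, 0)] = i
    if i >= n:
        return True
    for d in range(1, d_max + 1):
        for k in range(-d, d + 1):
            candidates = []
            if (d - 1, k - 1) in reach:
                candidates.append(reach[(d - 1, k - 1)] + 1)   # advance in a only (gap in b)
            if (d - 1, k) in reach:
                candidates.append(reach[(d - 1, k)] + 1)       # mismatch
            if (d - 1, k + 1) in reach:
                candidates.append(reach[(d - 1, k + 1)])       # advance in b only (gap in a)
            i = max(candidates) if candidates else NEG
            if i == NEG or i > n:
                reach[(d, k)] = NEG
                continue
            j = i - k
            while i < n and 0 <= j < n and a[i] == b[j]:       # slide along matching characters
                i += 1
                j += 1
            reach[(d, k)] = i
            if k == 0 and i >= n:
                return True
    return False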



2.3 Parameters Estimate

Two parameters, the sequence length L and the similarity threshold MI, should be
specified by the user. All the other parameters can then be estimated automatically.

2.3.1 The maximum difference parameter D


An identity between A and B greater than MI is equivalent to the number of differences
between them being smaller than D, where D is calculated with (1).

D = ⎣(1 – MI) × L⎦ (1)

2.3.2 The parameter K


To obtain a result with no accuracy loss, the parameter k should be small enough; but
to achieve high speed, k should be as large as possible. We obtain the parameter k by (2),
where L is the length of the sequences in comparison and D is the maximum number of differences between
them. This value is small enough to guarantee at least one exact k-tuple match between
the two sequences, and it is the greatest value that can be used with no accuracy loss.

k = ⎡(L – D) / (D + 1)⎤ (2)

2.3.3 The minimum k-tuple matches MinNum


The minimum number of k-tuple matches MinNum is given by Lemma 1.
LEMMA 1. Suppose two sequences of length L have D differences between them.
Then they have at least MinNum = L – k × (D + 1) + 1 k-tuple matches.
PROOF. Split the two sequences into identical regions separated by the differences, as shown in
Figure 2. In the figure, red represents the identical regions and yellow represents the differences.
When the length of each identical region is k – 1, the region contributes no identical k-tuple,
so this is the worst case for a given number of differences d.

Figure 2. The divided alignment.

Supposing there are only k-tuple matches on the 0 diagonal, the minimum number of
k-tuple matches is L – k × D – (k – 1), which equals L – k × (D + 1) + 1.

In probe design, the maximum continuous stretch should also be less than a threshold MCS (maximum
contiguous stretch). Lemma 2 deals with this situation.
LEMMA 2. Suppose two sequences of length L have D differences between them
and no contiguous stretch of length MCS. Then they have at least MinNum =
L – k × (D + 1) + 1 k-tuple matches.
PROOF. Consider the scenario above, and borrow x differences from the regions
before the last one, as shown in Figure 3. In the figure, red represents the identical regions
and yellow represents the differences. The last region satisfies (x + 1) × (MCS – 1) ≥ L – k × D
+ (k – 1) × x, so x can be calculated by (3).

x = ⎡(L – k × D – MCS + 1) / (MCS – k)⎤ (3)
Then, the minimum number of k-tuple matches with no MCS contiguous stretches can
be calculated with (4).

MinNum = (MCS – 1 – (k – 1)) × x + L – k × D + (k – 1) × x – (MCS – 1) × x – (k – 1) = L – k × (D + 1) + 1 (4)

Figure 3. The divided alignment with contiguous stretches.
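The parameter estimation described above can be summarized in a few lines of Python. The sketch below covers equations (1) and (2) and Lemma 1 only and omits the MCS correction of Lemma 2; the function name is ours.

import math

def estimate_parameters(L, MI):
    # D (eq. 1): maximum number of allowed differences.
    D = math.floor((1 - MI) * L)
    # k (eq. 2): greatest k-tuple size that still guarantees at least one exact match.
    k = math.ceil((L - D) / (D + 1))
    # MinNum (Lemma 1): minimum number of exact k-tuple matches.
    min_num = L - k * (D + 1) + 1
    return D, k, min_num

# For example, with L = 50 and MI = 0.85 this gives D = 7, k = 6 and MinNum = 3.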

3 Results and Discussion

In this section, we first conduct experiments to select the best parameter K for
KS (the K-mer Similarity algorithm). Then, a series of data sets is used to demonstrate
the performance of KS. Finally, KS is compared with other alignment
algorithms that are widely used in probe design.

3.1 Examination of the KS Algorithm

3.1.1 The selection of parameter K


We have implemented a C++ program of KS. On a data set of 100 sequences with
610 50-mer candidate probes remaining after the MCS = 20 filter, there are 34,874,402 comparison
pairs in total. Our program is used to select the pairs whose similarity is higher than

MI (MI = 0.85). The experiment is done on a PC with a 2.60 GHz Intel i5-3230M CPU and
4.00 GB of memory.
The execution times of KS with different values of the parameter K are shown in Figure 4. It is
obvious that the greater K is, the less time the algorithm costs. To obtain better performance,
the parameter K should therefore be as large as possible. However, the optimal alignment of the
query and subject sequences cannot be guaranteed with an arbitrarily large K. We use (2) to
select the greatest K with no accuracy loss; it guarantees at least one exact k-tuple match for
two sequences of length L with at most D differences.

Figure 4. The relationship between K and execution time.

3.1.2 Efficiency of each part


We have implemented a stand-alone modified greedy algorithm and a dynamic
programming algorithm to perform the same task as KS. Their execution times are shown in Figure 5.
As can be seen from the figure, the greedy algorithm is almost 20
times faster than the dynamic programming algorithm.
The execution times of the stand-alone greedy algorithm and the KS algorithm are
shown in Figure 6. As we can see, the execution time of the greedy algorithm is about 3
times that of KS, which indicates that the efficiency of the algorithm is improved greatly by
the filtering step.

Figure 5. The execution time of greedy and DP algorithm.

Figure 6. The execution time of greedy and KS algorithm.

The KS’s performance with different data size


We have tested the performance of KS on data sets of different sizes. The filter rate and
execution time are shown in Table 1. The filter rate is stable at above 99%, which keeps
the execution time of KS well below that of the dynamic programming algorithm.

Table 1. The Filter Rate and Execution Time of KS on Different Data Sets.

No.   Total Pairs    Filtered Pairs   Filter rate [%]   Execution time [s]
1     13326339       13232473         99.296            11.194
2     16984412       16869880         99.326            14.225
3     26024135       25857935         99.361            21.873
4     38849145       38610875         99.387            32.744
5     248038281      246759999        99.485            211.987
6     1279717970     1269328593       99.188            1172.455
7     3.55891e+009   3.53145e+009     99.228            3053.496

The relationship between execution time and the data size is shown in Figure 7. The
line indicates that the execution time of KS algorithm is proportional to the data size.

Figure 7. The relationship between KS’s execution time and data size.

3.2 Compared with Other Algorithms

In terms of accuracy and efficiency, the KS algorithm is compared with two commonly
used algorithms in probe design, BLAST and Myers' bit-vector algorithm. Given a set
of candidate probes and a set of subject sequences, the three algorithms are used to
select the final probes. A candidate probe can be a final probe when the similarity

between it and each subject sequence is less than MI = 0.85. Two sets of candidate
probes are used: one has 610 candidate probes and the other has 4383. The length
of the candidate probes is 50. Five sets of subject sequences are used, containing
100, 200, 500, 1000 and 10000 sequences. Because actual hybridization is
performed in a global identity scenario [12], the results of the dynamic programming algorithm are used
as the standard probes.

3.2.1 The comparison of accuracy


We designed 7 experiments; the word size of BLAST is set to the same value as in the KS algorithm
(6) to guarantee optimal alignments. The results are shown in
Table 2, where DP means the results of the dynamic programming algorithm, KS the results of the K-mer
Similarity algorithm and Myers the results of the bit-vector algorithm. The KS
algorithm obtained the same results as DP in all 7 experiments. Myers lost some
standard probes in the 6th and 7th experiments. BLAST designed more probes than DP
in all 7 experiments and lost some standard probes in the 6th and 7th experiments.

Table 2. The Probe Results of DP, BLAST, KS and Myers.

No. subject sequences set candidate probes set DP BLAST KS Myers

1 100 610 89 204 89 89

2 200 610 64 169 64 64

3 500 610 55 159 55 55

4 1000 610 55 125 55 55

5 10000 610 42 66 42 42

6 1000 4383 1810 2559 1810 1786

7 10000 4383 267 869 267 256

The results of BLAST and the DP algorithm, together with the number of identical results, are
shown in Table 3. The difference in the results is mainly due to the different
similarity measurements of the two algorithms. In the BLAST algorithm, the similarity
between two sequences is defined as I(A,B) = (the number of optimal match bases)
/ (the length of the alignment result). In the alignment process, an insertion in the
query sequence is equivalent to a deletion in the subject sequence at the same position;
similarly, an insertion in the subject sequence is equivalent to a deletion in the query
sequence. In the generation of an alignment, BLAST uses insertions to replace
deletions. It is sometimes confusing that the alignment with more optimal
match bases to the query has a lower similarity than one with fewer optimal match bases. As Figure 8
shows, the number of optimal match bases is 45 in (a) and its similarity is 90.0%, while the
number of optimal match bases is 46 in (b) and its similarity is 88.5%. Because of

this, the BLAST similarity of some highly similar pairs is lower than it really is, and we
obtain some final probes that should not be selected. In addition, the actual lengths of the
query and subject sequences can differ in the final alignment of BLAST. This
can sometimes yield a greater similarity between the query and subject sequences
than the global alignment of the same length would. In Figure 9, the similarity
is greater than that of the global alignment between two sequences of the same length
because of the shorter actual subject sequence in the alignment. This causes BLAST to lose
some standard probes compared with the DP algorithm. Myers lost some probes
too, for the same reason as BLAST: after introducing deletions, there are multiple
alignment results, because an insertion in one sequence is equivalent to a deletion
in the other sequence. Therefore, one may get different similarity values for the same
pair in a comparison.

Table 3. The Probe Results Comparison Between DP and BLAST.

No. subject sequences set candidate probes set DP BLAST the number of same results

1 100 610 89 204 89

2 200 610 64 169 64

3 500 610 55 159 55

4 1000 610 55 125 55

5 10000 610 42 66 42

6 1000 4383 1810 2559 1806

7 10000 4383 267 869 261

Figure 8. The similarity confusion of BLAST.

Figure 9. A case in which BLAST reports a higher similarity than the dynamic programming algorithm.



In order to overcome the confusions above, we constrain insertions and deletions
in the subject sequence in the Myers, DP and KS algorithms. The similarity is calculated
by the formula I(A,B) = (the number of optimal match bases) / (the length of the query
sequence). In this way, we not only guarantee that there is only one similarity
value for each comparison but also ensure that more optimal matches yield a
greater similarity.

3.2.2 The comparison of efficiency


The execution times of the different algorithms in the above 7 experiments are shown in
Figure 10. The BLAST program used is the blast program from ncbi-blast-2.2.31+.


Figure 10. The execution time of Myers, BLAST, KS and DP on different data sets.

As the figure shows, the KS algorithm is faster than BLAST, and the execution time of the
DP algorithm is about 64 times that of the KS algorithm. The Myers algorithm is the
fastest and is about 9 times faster than the KS algorithm.

4 Conclusion

The alignment problem in probe design is a global alignment problem. The well-
known Needleman-Wunsch algorithm is not efficient enough for large data sets, and the
widely used local alignment algorithm BLAST and the fast Myers bit-vector algorithm
may fail at times. We have introduced a novel global alignment method called
KS. It consists of two parts: the filter step and the greedy algorithm. With above 99%
of the low-similarity comparison pairs filtered out by the first step, KS runs 3 times faster. The
modified greedy algorithm is very efficient for the alignment of highly similar pairs.

It is about 20 times faster than the traditional dynamic programming algorithm. With
the combination of the two methods, our algorithm executes over 60 times faster on
appropriate data. The high accuracy and speed make KS a better choice for probe
design.

Acknowledgment: This work is supported by China National Natural Science Foundation (61172099).

References
[1] J. Daily, “Parasail: SIMD C library for global, semi-global, and local pairwise sequence
alignments,” BMC Bioinformatics. vol. 16(1), pp. 81, 2016.
[2] S.B. Needleman, and C.D. Wunsch, “A general method applicable to the search for similarities in
the amino acid sequence of two proteins,” Journal of Molecular Biology. vol. 48(3), pp. 443-453,
1970.
[3] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, and D.J. Lipman, “Basic local alignment search
tool,” Journal of Molecular Biology. vol. 215(3), pp. 403-410, 1990.
[4] R. Ariyadasa, and N. Stein, “A sequence-ready physical map of barley anchored genetically by
two million single-nucleotide polymorphisms,” Plant Physiology. vol. 164(1), pp. 412-423, 2013.
[5] M. Pfeifer, K.G. Kugler, S.R. Sandve, B. Zhan, H. Rudi, and T.R. Hvidsten, “Genome interplay in
the grain transcriptome of hexaploid bread wheat,” Science. vol. 345(6194), 1250091, 2014.
[6] G. Myers, “A fast bit-vector algorithm for approximate string matching based on dynamic
programming,” Journal of the ACM (JACM). vol. 46(3), pp. 395-415, 1999.
[7] S. Terrat, E. Peyretaillade, O. Goncalves, E. Dugat-Bony, F. Gravelat, A. Mone, et al., “Detecting
variants with metabolic design, a new software tool to design probes for explorative functional
DNA microarray development,” BMC Bioinformatics. vol. 11(3), pp. 2611-2619, 2010.
[8] E. Dugat-Bony, M. Missaoui, E. Peyretaillade, C. Biderrepetit, O. Bouzid, and C. Gouinaud,
“HiSpOD: probe design for functional DNA microarrays,” Bioinformatics. vol. 27(5), pp.
641-648, 2011.
[9] X. Wang, and B. Seed, “Selection of oligonucleotide probes for protein coding sequences,”
Bioinformatics. vol. 19(7), pp. 796-802, 2003.
[10] S.H. Chen, C.Z. Lo, S.Y. Su, B.H. Kuo, A.H. Chao, and C.Y. Lin, “UPS 2.0: unique probe selector
for probe design and oligonucleotide microarrays at the pangenomic/genomic level,” BMC
Genomics. vol. 11(Suppl 4), pp. 325, 2010.
[11] X. Li, Z. He, and J. Zhou, “Selection of optimal oligonucleotide probes for microarrays using
multiple criteria, global alignment and parameter estimation,”Nucleic Acids Research. vol.
33(19), pp. 6114-6123, 2005.
[12] M.D. Kane, T.A. Jatkoe, C.R. Stumpf, J. Lu, J.D. Thomas, and S.J. Madore, “Assessment of the
sensitivity and specificity of oligonucleotide (50mer) microarrays,” Nucleic Acids Research. vol.
28(22), pp. 4552-4557, 2000.
[13] J.P. Dumas, and J. Ninio, “Efficient algorithms for folding and comparing nucleic acid
sequences,” Nucleic Acids Research. vol. 10(1), pp. 197-206, 1982.
[14] Z. Zhang, S. Schwartz, L. Wagner, and W. Miller, “A greedy algorithm for aligning DNA
sequences,” Journal of Computational Biology. vol. 7(1-2), pp. 203-214, 2000.
Hong ZHANG, Ling-ling ZHANG*
Research and Prospects of Large-scale Online
Education Pattern
Abstract: MOOCs (Massive Open Online Courses) have been studied extensively by academia,
industry and business since 2012. Despite the rapid development of MOOCs, there are
still many challenges to solve. Firstly, this paper gives a definition and explanation
of MOOC based on current research. Then, we list the problems hindering the
development of MOOCs through an analysis of their popularity. Finally, we propose our
perspectives on and advice for MOOCs through a study of these challenges and of their
future. This paper can therefore serve as a reference for the further development of MOOCs.

Keywords: MOOC; Current Research; Operation mode; Challenges

1 Introduction

MOOC (Massive Open Online Course) is a term proposed by the Canadian scholars Dave
Cormier and Bryan Alexander in 2008. It has developed into a new online
education pattern. Although it was proposed only recently, its growth is in no way inferior
to that of traditional education. MOOC became well known with its rise in 2012
and its global popularity in 2013, and it has become a beacon of the global development
of Internet education. The reason for this quick development is that MOOC
satisfies the expectations of its customers; edX, for example, offers a free
learning pattern. In general, people can learn what they want anywhere and
anytime without great expense. Most MOOC courses are free and provide various
learning resources for a learner. Online patterns give everyone a chance to reach the
famous teachers around the world, and they also provide a platform for learners to
communicate [1]. The development of MOOC has a considerable effect on the reform
of traditional education. This paper introduces the leading MOOCs around the world
and compares their differences. We illustrate the advantages of MOOCs and discuss
the specific operating process of MOOCs by examining two aspects. Finally, we analyze the
challenges and opportunities of MOOC.

*Corresponding author: Ling-ling ZHANG, Department of Computer Science and Technology, Beihua
University, Jilin, China, E-mail: 373752782@qq.com
Hong ZHANG, Academic Affair Department, Beihua University, Jilin, China

2 Definition of MOOC

There are many notable MOOCs, such as edX, Coursera, Udacity, FutureLearn,
Online Course, NetEase Cloud Course, Shell MOOC, etc. [2]. From the number of
people registering every year, we can conclude that the foreign MOOCs (edX,
Coursera and Udacity) are the pioneers in this area. These foreign MOOCs are leading
the development of global online education, including its running pattern and
teaching style. In our country, Tsinghua Online Course (a Chinese MOOC), NetEase
Cloud Course and Shell MOOC are the representatives of MOOC. Through the different
characteristics and patterns of these MOOCs, researchers can understand the current status
of large-scale online education and its future evolution.

2.1 Foreign MOOC

All three of these representative foreign MOOC platforms aim at sharing the best courses
of the best universities with the whole world. Apart from these aims, they also provide
course resources of high quality (such as video), set homework requirements and develop
good learning atmospheres, which are the reasons these MOOCs achieve their success. In April
2012, MIT and Harvard created edX, a non-profit platform that aims at improving
teaching quality and spreading online education; MIT, Harvard and their cooperating
enterprises fund this free pattern. Coursera was founded by Daphne Koller
and Andrew Ng, who are Stanford professors. It is a large-scale online education
platform that combines profit and online education. So far, many
famous universities are working with Coursera, including Princeton and Pennsylvania,
and Coursera has provided at least 1500 courses. In the beginning, Udacity was
free. However, a reform in 2013 made Udacity promote online special training instead
of free online education. This shows that Udacity has found a new profitable
pattern instead of following the methods of edX and Coursera. In terms of teaching style, edX
is polished, whereas Coursera provides a more conventional style of teaching. Udacity provides
a different method called 'Exchange Identity', in which teachers ask
questions and students solve them [3]. Apart from the online study opportunities provided
by MOOCs, after learners complete the corresponding courses the educational institution
will issue certificates to them [4], which can help them find a good job. Generally,
these certificates are classified as either free or paid, and the free certificates carry
relatively little weight. For example, edX provides normal certificates (free), authenticated
certificates (paid) and series of X certificates (paid and pointing to a specific theme). Coursera
provides free authentication and signature tracking (paid). However, among these three
platforms, only Udacity can provide a large amount of credits to students, which are
accepted by most colleges. In addition, there are some other MOOC platforms, like Iversity
in Germany and FutureLearn in England, which engage teachers from all over the world
and aim at building famous online education platforms.

2.2 Chinese MOOC

Chinese MOOC began in 2013. In May of that year, Tsinghua University and Peking
University joined edX, and at the same time Peking University established
a collaborative relationship with Coursera. In October of the same year, Tsinghua
University launched an online educational platform called 'Online Course', which
is known as the Chinese edX. In addition, five universities on both sides of the Taiwan
Straits jointly launched an online education site called the 'Ewant Open Education Platform'.
After that, more and more universities joined online education
platforms one after another. Various Internet companies also want to take a share of the MOOC
spoils, and they have therefore launched their own online education platforms, such as
Netease Cloud Course, Shell MOOC, YY Education and so on. In comparison with the
MOOCs created by universities, these kinds of MOOC are more diversified and can
satisfy more of people's requests. For example, besides the course resources, Netease
Cloud Course cooperates with institutes to engage online teachers to interact
with students and solve their problems. It also coaches students according to their different
characteristics, which is hard to achieve in open MOOCs. Other platforms may put
particular emphasis on high-quality courses to serve their customers; high
quality here means the courses are lively, acceptable and understandable instead of
boring. However, no matter how different these Chinese MOOCs are, the final test and
graduation authentication are necessary. Normally, the final assessment includes an average
grade (a system test and peer rating) and an online attendance rate. This kind of test
is similar to that of offline education, such as an 'Online University'; the difference is
whether the test platform is online or offline. Offline tests are the core method for
many traditional educational authorities to examine their students. However, the pattern
of Internet education drives the relevant authorities to combine the characteristics
of online education with traditional education. The future of MOOC is broad, and the
development of Chinese MOOC is promoted by the universities and their partners.
Most Internet media companies and educational authorities aim at promoting MOOCs
to earn profit. Even though there are lots of online education platforms, most of
them just import a large amount of teaching videos, which results in confusion
in students' choices and a waste of resources. In this situation, students also cannot readily
receive high-quality resources. Beyond that, many online platforms lack trailing
feedback and supervision and have poor interaction. At present, technology is
not the reason for these problems. In addition, personalized online education always
comes at a high cost, which hinders the development and promotion of Chinese
MOOC [5].
The MOOCs introduced above are just the tip of the iceberg of global
online education. As shown in Table 1, besides the three biggest foreign MOOC
platforms there are also FutureLearn, Open2Study, Iversity, Khan Academy, OpenLearning,
NovoEd, Canvas, Fun, Udemy, OpenupEd, IOC Athlete MOOC and so on.
The development of Chinese MOOC is not lagging behind; there are Geeks College,

MOOC of Chinese University, iMOOC, NTHU MOOCs, Class Network, Dauber MOOC,
51CTO and Chinese Ewant. More detailed information on MOOCs is given in
references [6,7].

Table 1. Foreign MOOC and Chinese MOOC


MOOC platform and brief description

Foreign MOOCs:
Coursera: An educational technology company that currently offers more than 400 course themes and more than 1500 courses.
edX: A non-profit platform created by Harvard and MIT in 2012. It offers more than 900 online courses.
Udacity: Founded in 2011 and cooperating directly with university professors. It mainly provides online computer courses.
FutureLearn: A joint effort of 12 universities in the UK, launched in 2013. It currently provides more than 280 online courses as well as offline tests to obtain certificates.
Iversity: A German MOOC platform. It cooperates with teachers to collect courses from around the world and currently offers more than 60 courses.
OpenupEd: Supported by the European Commission. It currently covers more than 215 courses in 12 different languages.
Others: OpenLearning, NovoEd, Canvas, Google, FUN, IOC-Athlete-MOOC, World-Science-U, Udemy, stanford-open-edx, P2PU, Alison, Saylor, Allversity, Academic Earth, JANUX, Microsoft Virtual Academy (MVA), Khan Academy.

Chinese MOOCs:
Online Class: Based on edX and founded by Tsinghua University. Online Class is the largest MOOC platform in China and provides about 508 online courses.
Ewant: Jointly set up by Shanghai Jiaotong University, Xi'an Jiaotong University, Southwest Jiaotong University, Beijing Jiaotong University, Hsinchu and Taiwan Jiaotong University. It offers 112 courses.
Netease Cloud Class: A platform offering online skills learning, covering computers, foreign languages, sports and more than ten other categories of online courses.
Shell MOOC: Founded by the Shell company. It covers most courses of the three major foreign MOOC platforms and presents them in Chinese.
Online University: Created in April 2014. It is an educational platform organized by several universities and has open, public-welfare characteristics.
YY Education: An interactive Internet education platform established in 2011. It takes advantage of the Internet to run online interactive classes across regions and time.
Others: Wisdom Tree, Supernova MOOC, Geek College, Chinese MOOC, MOOCs@pku, iMOOC, NTHU MOOCS (Taiwan), Transmission Class, Daube, 51CTO, Chinese Ewant.

3 Operation Mode of MOOC

The official definition of a MOOC's business model is a paradigm formed during the
overall development of MOOCs that is used to select a series of methods. In general, it covers
the methods a MOOC uses to choose its customers, the way it positions its offering according to
the current status of online education, the balance it keeps between the cost of online
courses and their benefit, and so on. This section introduces the most important
parts of a MOOC's running model: course operation and the profit model.

3.1 Course Operation

In general, whether a company's products are competitive directly determines the fate
of the enterprise. Similarly, whether the curriculum products provided by MOOCs are
competitive directly determines the success of MOOCs. Although there are countless
domestic and overseas MOOCs, their curriculum operation basically includes three
points: providing the course, the teaching method and the evaluation method [8].
Different MOOCs provide different courses. Some of the large-scale MOOCs
(like Coursera and edX) provide subject content from all walks of life and are
able to satisfy the requirements of any learner. However, some MOOCs
limit the courses they provide, probably because of their own capability or
specific cooperative relationships with some universities. Learners can choose
the courses according to their own needs without any worry. At the same time,
to optimize the learning experience, almost all MOOCs
invariably offer short videos (about 6-15 minutes) to the learner, so as to avoid
having the learner's interest plummet because of long stretches of learning. In spite of
this, each online course still lasts for about 1-2 hours in total. So far, the teaching
methods of MOOCs are divided into two types: synchronous and asynchronous
(real-time and not real-time). For those who wish to be free to control their learning
time and learning progress, asynchronous MOOCs are recommended.
Many videos and notes in this kind of teaching method are prepared in advance,
and teachers are less likely to be online at any given moment because the schedule is not fixed. The
interactive mode of this teaching method is mainly based on a human-machine
interface. In contrast, students who choose the synchronous mode have more
opportunities to communicate with teachers or discuss with other students who
are studying together. This kind of communication is very important for learners to
understand the relevant knowledge. But synchronous MOOCs have a fixed commencement date;
sometimes learners' plans are directly interrupted because they miss
the commencement date, which will not wait for anyone [9]. Each of the two
ways has its advantages and disadvantages, which need to be weighed carefully by
the student according to their personal situation. However, from the view of the data,
asynchronous MOOCs are currently more popular. The most obvious phenomenon

is that although many people have chosen synchronous MOOCs, more than half of
them cannot complete the course; for example, the lack of a strict attendance
mechanism leads many students to skip classes. The assessment of online courses
is similar to that of traditional offline education. According to the time of
preparation, online assessment can be divided into weekly or monthly tests;
such an assessment is usually used for testing the knowledge students have learned
so far. According to the time of feedback, tests can be divided into online tests
and offline tests. In addition, according to the importance of the assessment, the tests
can also be divided into a midterm test and a final test. People who pass the final test
receive a corresponding certificate. Both Coursera and edX currently provide
two kinds of certificates for learners: a paid certificate and a free certificate of
honor, and it is obvious that companies prefer people who hold paid certificates. In
order to meet the needs of some specific learners, some platforms, such as Udacity,
recommend the credit mechanisms of corresponding universities to these learners;
the credits gained from Udacity have been recognized by many universities in the US.

3.2 Profit Model

From the great success of MOOCs in such a short time, we can imagine the
great influence of MOOCs on future educational reform. The emergence, development
and maturity of any new technology must be accompanied by a specific business
model. The initial MOOCs (such as Udacity) were free of charge, but the development
process needs economic support. As a result, MOOCs have begun to charge
and have evolved into a unique business model. Research shows that the current
MOOC business model is still in the exploratory stage. The profit channels of the major
mainstream MOOCs mainly rely on learners or other organizations. Any learner can
get free access to the general learning materials and the corresponding curriculum
content on the MOOCs. But the question is how MOOCs profit from learners. In
fact, the principle is very simple: although learners are free to complete the
course assessment, they need to pay tuition if they want to obtain the corresponding
certificate. The course is free and has no supervision, but some MOOCs have a
strict assessment with special invigilators. After the exam is passed, these MOOCs will also
issue the corresponding certificates to learners to prove their abilities. In addition,
some MOOCs also offer training for advanced professional courses (such as the R
language), which is certainly not free.
A stable investment is very important for the sustainable development of MOOCs.
Generally, such a stable investment includes corporate sponsorship and venture
capital. However, only a MOOC with high quality, a high social impact and high
efficiency can attract this kind of investment. Therefore, most MOOCs must
find other ways to survive, which usually means enterprise cooperation. This kind of
cooperation not only includes potential relationships established between enterprises

and learners, which aim at finding personnel for the enterprise, but also includes
profit sharing from offering online courses. The model of cooperating with enterprises,
based on the broad coverage of MOOC services, has been adopted by
most MOOCs and is gradually developing into a mature business model.

4 Meaning of MOOC

MOOC’s popularity has been an irresistible trend in the world. However, the question
is why we need MOOCs. The obvious reason is many people can receive the free
education of the best universities in the world through MOOCs. They can learn the
latest scientific knowledge of these universities. In addition, MOOC’s popularity has
many other reasons concluded as follows.

4.1 Openness of MOOC

All the MOOCs in the world are highly open. People anywhere on the network
can log in to MOOCs to study what they want. Most importantly, MOOCs charge
no tuition compared with traditional education, so online education can reduce the
burden on poor students. In addition, traditional education requires an appointed place
and an appointed time to study, and students must attend class on time. On the contrary,
MOOCs provide a relatively free time and place for students to study, which can
satisfy the requirements of different people. Besides time and place, MOOCs never
limit the age of learners, in contrast to traditional education. In addition,
as open online education, MOOCs emphasize the sharing of knowledge, which
helps people access most MOOC resources for free.

4.2 Scale of MOOC

MOOC means massive open online course. The scale of MOOCs differs from that
of traditional education, which is limited by the size of the classroom. Technologically,
the scale of MOOCs is constrained only by the infrastructure of the Internet service provider; in
general, an online classroom can hold thousands of people at the same time. In
this comparison, MOOCs have a great advantage of scale.
At present, all the MOOCs provide a great quantity of curriculum resources for
learners to choose from, such as Netease Open Class. In order to facilitate learners,
MOOCs also provide very convenient and accurate searching and sharing
methods. Learners can combine their interests and hobbies with the classes and
then choose courses precisely on the corresponding MOOC. By
contrast, traditional education leaves a rigid and boring impression on students.

4.3 Autonomy of MOOCs

At present, the mainstream MOOCs offer a variety of learning methods for their courses.
MOOCs differ from traditional education, which teaches students at a fixed
time and place. MOOCs take the needs of learners' own work into consideration
and therefore provide a relatively free learning method: students can carry out their online
study according to their own schedule. In this case, there are more interactions between
computers and learners. In addition, people's discussions are not real-time, and it will take
some time to get feedback.

4.4 Consistency of MOOC

In traditional education, a student cannot take two courses with two teachers at the
same time. However, one person can attend a math class on Netease and a history class
on Online Class on two different computers simultaneously. Another advantage of this
consistency is that one student can learn the same course from two different teachers; learners
can gain more knowledge and insight through such comparative study.

4.5 Improvement of Course

A MOOC is an open online class, which means the learners may be students or teachers.
Different teachers may have different opinions on the same course, and they can
communicate through the platform provided by MOOCs to discuss teaching methods and
how to adjust their courses. This greatly improves the
quality of online courses. In addition, because online curriculum resources are shared,
people can upload their best course resources, which also helps MOOCs improve
the quality of the curriculum.
Certainly, these obvious advantages of MOOCs are enough to attract the
attention of educational circles. Therefore, the development of MOOCs has
become an inevitable trend; the key lies in how MOOCs get along with traditional
educational circles.

5 Challenges and Developing Direction

Recently, MOOCs have become a very hot topic at home and abroad. However, they have also
shaken traditional education, which nevertheless will not be replaced by MOOCs in a short time.
Therefore, we list the advantages and disadvantages of MOOCs in this paper.

5.1 Challenges of MOOC

The biggest challenge of MOOCs comes from learners. The flexibility of MOOCs
introduced in the section above is also a disadvantage from another perspective.
When the learners are too young or too old, their enthusiasm for studying may not
be as strong as that of learners in traditional education, which causes truancy. On
the contrary, the curriculum system of traditional education is compulsory, which
can prevent truancy; in addition, traditional education requires paying tuition, which
also discourages truancy. When these factors are taken into consideration, not too
many people there play truant. In MOOCs, by contrast, lots of people register but
only a few stick with it until the end. The educational specialist Phil Hill divides the
learners of MOOCs into 5 categories: active learners, passive learners, temporary
learners, spectators, and learners who give up halfway. The statistical results show
that learners who give up halfway account for 47%, which is the highest proportion, while
temporary learners account for 7%, the lowest. Active learners of
MOOCs account for just 21%, passive learners for 11%, and the rest are spectators at 14%.
In addition, only 15% of learners pass the final test smoothly
at the end of the study [10], and most of the people who pass the final test are the original
active learners. Learners can freely access the classroom of MOOCs because there are
no mandatory requirements. Therefore, if learners want to complete the MOOCs and
achieve better grades, they must be conscientious; in fact, people who are not
conscientious are relatively better suited to traditional education. Although some
MOOCs try to improve students' learning efficiency by charging tuition, this
may result in losing part of the students and in enrollment difficulties. In addition,
learners usually choose the MOOCs that cooperate with well-known universities,
and other, relatively small MOOCs will collapse under the competition. However,
this kind of competition is very important for the development and innovation of
MOOCs. Besides the competition, how to recruit students is also a problem for these
small MOOCs, so solving this problem is also an important factor for development and innovation.
In addition, there is hardly a strict and unified standard for MOOCs because
of their online mode and open spirit, and the quality of MOOCs is still a
difficult problem. On the one hand, learners with poor English find it difficult to understand
the content of courses because most MOOCs teach in English, which
reduces their enthusiasm. Moreover, the enthusiasm of students will
decline continuously if the quality of online courses declines
over time. On the other hand, the enthusiasm and participation of online teachers
will also decline as they face a recording camera for long periods. After all,
the communication established by MOOCs exists only on the network. In real life,
online teachers and learners seldom meet each other, which makes it difficult for
teachers to understand the characteristics of all the students and to teach or guide
students in a targeted way. In contrast, teachers can actually interact with students

in traditional education; this interaction is visible and important. In addition, given the
distant relationship between online teachers and students, the massive open model
may also cause many excellent students to be lost, because there are too
many learners and too few teachers. At present, most MOOC teachers are online
teachers as well as traditional teachers, which means they have to spend double or
even more time preparing and teaching the course. Repeated work is boring and easily
leads teachers to fatigue, which may affect their normal lives.
How to coordinate online and offline educational work is a challenge not only for
educators but also for MOOCs; it is obvious that the development of MOOCs
needs the support of educators.
In addition to the obvious challenges above, we can also foresee some others.
With the development and improvement of MOOCs, some aspects of
traditional education are bound to be replaced by MOOCs. If MOOC certificates are
recognized by society, some small universities will be eliminated quickly. People
will be able to get an education from the best universities and obtain the relevant certificates through
study on a MOOC, which avoids the stress of the college entrance examination. Most
people hope to attend better universities, and the same holds for teachers: famous teachers will
be more and more welcome while other teachers are gradually eliminated. This
serious polarization will eventually affect society and threaten the interests of some people and
institutions, and long-term polarization will get MOOC development into
trouble. It is even possible that MOOCs will disappear in the future because of these challenges.
In fact, these challenges are our own deductions; they represent the worst case and
may never happen.

5.2 Developing Direction of MOOC

Although there are lots of challenges and problems hindering the development of
MOOCs, MOOCs are still the most promising educational mode and will direct the
development of global education. With this in mind, we make some
assumptions about the future of MOOC.
In the long run, a good business model is the driving force that promotes the
development of MOOCs. It is critical for MOOCs to optimize their operation and
develop a reasonable business model. At present, many MOOCs that want to profit
have explored this area, including professional recommendation, certification,
advertisement, copyright, credit, etc. Professional recommendation and certification
have become common profit modes. However, most learners still choose free MOOCs
to study with. Therefore, it is very important for MOOCs to find a sustainable business
mode during their development.
MOOCs are well known for their large scale and openness. So far, there are lots of
learning resources and a large quantity of learning data on the platforms [11]. Through
appropriate algorithms and mathematical models, we can do data mining based on these

data. Such data mining can help us find potential learning problems and further
challenges for MOOCs. It is also very important for evaluating, analyzing and improving
the quality of learning. This mode can gradually develop into a personalized and
customized learning program, so as to serve the new business model. In addition,
data mining also reduces the burden on teachers: teachers can obtain the
characteristics of all the students through big-data reports instead of interacting
with every student, which can help them design a syllabus. However, these capabilities
of MOOCs are not yet mature. Therefore, data mining and analysis will be a development
direction of MOOCs.
Online education has inherent advantages relative to the traditional education
model, such as distance education, large scale and free access. However, traditional
education also makes up for many deficiencies of MOOCs; for example, MOOCs
do not offer face-to-face teaching and related practical activities. Therefore, we
should not set them in opposition but should consider their relationship from
the complementary point of view. In addition, in the document named
"Strengthen Management and Application of Open Online Course Construction of
Colleges and Universities", the department of education mentioned that major colleges
and universities can select a suitable platform to undertake the corresponding
educational work. Traditional teachers must draw on the strong points of
MOOCs and promote a combination of traditional education and online education.
For example, teachers can work online while students organize face-to-face
activities by themselves. Besides that, students can preview classes on
a MOOC at a suitable time, which can help them get a better grasp of the learned
knowledge. It is very important for teachers to know the characteristics of all the
students through online analysis of big data and offline interactions; teachers can
then customize a more detailed educational plan. Therefore, the development of MOOC
can be promoted by the combination of MOOC and traditional education, and both
together push the transformation of the education model in a better direction.

6 Conclusion

The informatization of education has become a worldwide trend. As a new online
educational mode, MOOCs already have an important impact
on traditional education. Although MOOCs are developing faster than
traditional education, they still cannot replace it, which
shows the inevitability of the coexistence of MOOCs and traditional education. As
Prof. Christophe said, what MOOCs bring to traditional education is not a threat
but an extension. Therefore, we believe there will be a new educational mode that
combines MOOCs and traditional education. This combination takes advantage of both
Internet technology and traditional education and forms a relatively more complete

education mode. This new educational mode can offer full-scale service to all the
learners.

Acknowledgment: This work is supported by the Education Department of Jilin Province for Higher Education Reform (Grant No. BHSY007).

References
[1] PAPPANO L, The Year of the MOOC, The New York Times, 2(12), 2012.
[2] Xiaoxia Dong, Jianwei Li. Research of the MOOC’s Operating Model. China Educational
Technology, 330(7): 34-39, 2014.
[3] Yulei Zhang, Yan Dang, Beverly Amer, A Large-Scale Blended and Flipped Class: Class Design
and Investigation of Factors Influencing Students’ Intention to Learn, IEEE Transactions on
Education, 99(3):1-11, 2016.
[4] Liangtao Yang, Dilemma and Development Strategy of MOOC Localization, 2015 7th
International Conference on Information Technology in Medicine and Education, 2015: 439 –
442.
[5] Xiaohong Su, Tiantian Wang, Jing Qiu, Lingling Zhao, Motivating students with new
mechanisms of online assignments and examination to meet the MOOC challenges for
programming, IEEE Frontiers in Education Conference, 2015: 1-6.
[6] Zhuyun Yang, Qi Zhen, SPOC: INTEGRATING INNOVATION OF COMBINING WITH UNIVERSITY
EDUCATION, Tsinghua University Press, 33(2): 9-12, 2014.
[7] Arjit Sachdeva, Prashast Kumar Singh, Amit Sharma, MOOCs: A comprehensive study to
highlight its strengths and weaknesses, 2015 IEEE 3rd International Conference on MOOCs,
2015: 365-370.
[8] Xiaohong Su, Tiantian Wang, Jing Qiu, Lingling Zhao, Motivating students with new
mechanisms of online assignments and examination to meet the MOOC challenges for
programming, IEEE Frontiers in Education Conference, 2015: 1-6.
[9] MOOC panel - Future educational challenges in computer engineering education: Will MOOCs
be a threat or an opportunity?, Field-Programmable Technology, 2014
[10] Mi Fei, Dit-Yan Yeung, Temporal Models for Predicting Student Dropout in Massive Open Online
Courses, IEEE International Conference on Data Mining Workshop, 11(4):256-263, 2015
[11] Yanyan Zheng, Beibei Yin, Big Data Analytics in MOOCs, IEEE International Conference on
Computer and Information Technology, 2015: 681-686.
Lebi Jean Marc Dali*, Zhi-guang QIN
New LebiD2 for Cold-Start
Abstract: As the Cold-Start problem emerges as a serious bottleneck for Internet
companies, solving it efficiently has become a priority. Previous techniques
fail to address this problem adequately. In this paper, we present LebiD2, a hybrid trust-based
technique that solves the Cold-Start problem efficiently. The key to LebiD2 is that
it does not need the active user's history, which is exactly the downfall of other
recommenders. In this paper, we explain our method in detail.

Keywords: Cold Start, Recommenders, model-based RS, Trust based algorithm, social network

1 Introduction

The boom in online business was mainly due to the inception of Recommender
System (RS) technology. RS are very helpful to users in making purchase decisions
online. Indeed, as the name indicates, RS recommend potential products a user
will be interested in, and the recommendation is mainly based on the user's history with
the company. Recommendation systems are mainly divided into two groups, namely
content-based recommendation systems and Collaborative Filtering recommendation
systems. Content-based methods use semantics to predict items the user will be
interested in; they use information such as the user's interests, occupation, age and
favorite authors, and information pertaining to the item such as its title and topic. On the
other hand, Collaborative Filtering (CF) recommends items to users solely based on
the rating matrix in the company database. The latest RS technique is the trust-based
CF method, which makes recommendations based on the social network information
of the user. Our method LebiD2 belongs to this category. Indeed, nowadays almost
everyone can be traced in the social networking world, and by considering information
from the social network, we can successfully predict the behavior of any user on a
particular item. This method is very effective in addressing the cold-start problem.
Indeed, the cold-start problem refers to predicting the behavior of a new user with no
history. Here we describe an improved RS technique, LebiD2, which applies the model-based
methodology to the trust-based technique. The result is remarkable: we achieve
a better performance at solving the cold-start problem than we did with the former

*Corresponding author: Lebi Jean Marc Dali, Department of Computer Science and Engineering,
University of Electronic Science and Technology of China
Zhi-guang QIN, Department of Computer Science and Engineering, University of Electronic Science
and Technology of China

technique LebiD1 [1], which will be briefly described here. The rest of this paper is organized
as follows. In section 2, we discuss related work in this area of study. In section 3, we
explain our method LebiD2 in detail. In section 4, we evaluate LebiD2 against
other well-known methods for solving the cold-start problem, and finally we conclude
by showing the advantages of our method.

2 Related Work

The use of recommendation systems has had a considerable impact on the income of
online businesses, and the companies with the best RS have witnessed a surge
in their client base. The first really popular RS technique is the memory-based RS
[2]; a famous example is the nearest-neighbor method. The memory-based RS
is popular because it is relatively simple to use and very effective. Indeed, the
memory-based method analyzes the entire rating matrix to find the closest friends of
the active user based on the user history and hence predicts the behavior of the active
user based on these data. However, memory-based CF does not perform well in a real-world
arena, especially when the rating matrix is very large. This technique
is called memory-based because it uses the rating matrix residing in memory
to compute the prediction, and it has serious issues pertaining
to sparsity, scalability and shilling. Sparsity refers to the state of affairs wherein the
matrix contains many empty cells, scalability refers to the situation where the number
of users and items grows very large, and shilling refers to spurious data
inserted into the matrix. The next important RS method is model-based CF [2].
Unlike memory-based CF, where the prediction is computed directly from the rating matrix, here
the rating matrix is first used to learn a model, and using this model we can perform
the required predictions. One important characteristic of this approach is that once the
model is learned, we do not need the dataset anymore, while in the case of
memory-based CF the rating matrix is accessed whenever a prediction is needed.
The model-based CF also makes use of matrix decomposition and matrix factorization
to compute the model parameters, and matrix factorization is a process for which
there is no guarantee of success. The newest CF method to appear in the RS universe is the
trust-based system. Indeed, the trust-based technique combines the company rating
matrix with social networking for its prediction. The TidalTrust recommendation
system [3] is one such method. In fact, it performs a modified breadth-first search
in the system and computes the trust value based on all the raters at the shortest
distance from the target user. The trust between users u and v is given by:
t_{u,v} = ∑_{w∈N} (t_{u,w} ∗ t_{w,v}) / ∑_{w∈N} t_{u,w} (1)
where N denotes the set of the neighbors of u.

Also the trust depends on all the connecting paths.


The formula used for the prediction is:
r_{u,i} = ∑_{v∈raters} (t_{u,v} ∗ r_{v,i}) / ∑_{v∈raters} t_{u,v} (2)

where r_{v,i} denotes the rating of user v for item i.


In fact, this technique addresses to some extent the infamous cold-start problem.
We will test our method against this technique.
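To make the prediction step concrete, a minimal Python sketch of the trust-weighted average of equation (2) could look as follows; the dictionary-based data structures here are assumptions made for the example, not part of TidalTrust itself.

def trust_weighted_prediction(trust, ratings, u, item, raters):
    # r_{u,i}: average of the raters' ratings for `item`, weighted by u's trust in each rater.
    numerator = sum(trust[(u, v)] * ratings[(v, item)] for v in raters)
    denominator = sum(trust[(u, v)] for v in raters)
    return numerator / denominator if denominator else None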
The next method in this category is the MoleTrust technique [4]. In its operation it
is very similar to the previous technique (TidalTrust); however, the two techniques
differ in how they select trusted users. TidalTrust uses users at the shortest
distance from the active user, while MoleTrust goes beyond that, up to a maximum depth
d. Here the difficulty is in the selection of the depth d: if it is too high, the accuracy will
be good but the processing time will also be high. Nevertheless, MoleTrust performs
better than TidalTrust. MoleTrust will also be evaluated against our technique.
We also have in this category the TrustWalker method [5], which belongs to the trust based RS family. It behaves almost like MoleTrust; the difference is that it uses near friends who have rated similar items instead of far neighbors. The similarity between items is given by:
$$\mathrm{sim}(i,j) = \frac{\mathrm{corr}(i,j)}{1 + e^{-\frac{|\mathrm{corr}(i,j)|}{2}}} \qquad (3)$$

Finally, to close this section, we inspect LebiD1. LebiD1 is a special type of trust based technique: it combines the memory based technique with social networking. When used with the QQ social network and the MovieLens database [6], it shows better performance than the previous methods. The LebiD1 prediction formula is given as:
$$r_{a,i} = \frac{\sum_{u \in F} r_{u,i}}{\mathrm{card}(F)} \qquad (4)$$

wherein F represents the closest friends of the active new user a, I is the set of all the items rated by the users u of F, r_{a,i} represents the predicted rating of the new user a for item i, F stands for the set of social network friends having rated item i, r_{u,i} stands for the rating of friend user u for item i, and finally card(F) denotes the number of such friends.
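Read literally, equation (4) is a simple average of the friends' ratings; a minimal Python sketch, with hypothetical variable names, could look like this:

```python
def lebid1_predict(friend_ratings):
    """Equation (4): average the ratings r_{u,i} given by the friends in F for item i."""
    if not friend_ratings:          # no friend has rated the item
        return None
    return sum(friend_ratings) / len(friend_ratings)

print(lebid1_predict([4, 5, 3]))    # 4.0
```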

3 LebiD2

The cold start problem is very troublesome, especially for the e-commerce community, but the benefits to be obtained from solving it are enormous. Using our method LebiD2, the cold start problem is solved effectively. We already addressed the cold start problem in our previous method LebiD1 [1], where we combined memory-trust based CF with the QQ social network and the MovieLens database [6]. We use the same databases in LebiD2, but here we use model based CF combined with the trust based approach instead of the memory-trust based approach of LebiD1 [1]. LebiD2 performs better than our previous method LebiD1, as will be seen in the next section. We use the same working conditions as for LebiD1: here too we use the MovieLens [6] dataset, and our task is to predict whether a particular user will like a particular movie. In the case of the LinkedIn dataset, we are interested in predicting the skillset of an employee or which companies will be interested in a particular employee. To sum up, our tasks are to predict the movies a new user will like, or what his ratings on given movies will be, while the user has no prior history in our database. We can also sort the movies according to the computed ratings, from the most liked (highest rating) to the least liked (lowest rating). In LebiD1, we used the rating matrix to compute the predictions. Here, we will first learn the model using:

$$W = (X^{T} X)^{-1} X^{T} t \qquad (5)$$

where W denotes the required model parameters, solved using the LebiD2 decomposition, X denotes the user dataset and t denotes the movie database. LebiD2 is a congruent-like decomposition, except that it is more stable and less time consuming. The algorithm used in the LebiD2 preprocessing phase is given as:
N: the number of friends on the social network having rated item i
for k = 1 to n
    for i = k to m
        S(k) = S(k)^2 + A(i, k)
        A(i, k) = A(i, k) / S(k)
    for j = k+1 to n
        for i = k to m
            A(i, j) = (A(i, j) + A(i, k)^2) / A(k, k)
    S(i) = A(i, i)
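Independently of the decomposition details above, equation (5) is the standard normal-equation (least-squares) solution for the model parameters. A minimal NumPy sketch, assuming X is a user-feature matrix and t the target ratings vector (our names, not the authors' code), would be:

```python
import numpy as np

def fit_linear_model(X, t):
    """Solve W = (X^T X)^(-1) X^T t, i.e. equation (5), via least squares.

    np.linalg.lstsq is used instead of an explicit inverse because it is
    numerically more stable when X^T X is ill-conditioned.
    """
    W, *_ = np.linalg.lstsq(X, t, rcond=None)
    return W

# Toy usage: 4 users described by 2 features, one rating each.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 0.5], [0.5, 3.0]])
t = np.array([3.0, 2.5, 2.0, 3.5])
print(fit_linear_model(X, t))
```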
Comparing our LebiD2 with the MATLAB SVD, we obtain Figure 1. LebiD2 is shown in green. We can clearly see that our method LebiD2 is very stable compared to the MATLAB SVD. Hence LebiD2 will perform better than the MATLAB SVD in a real-time environment: while being more stable, it requires less processing time, which is the prime concern of real applications. We should specify that the MATLAB SVD is more complex and hence more effective in its operation than our technique. But users hate to wait for a web page to open; if a page takes too long, they simply go to a competitor, and we lose the client and his money. So low time complexity with acceptable performance is very important, and that is the goal of LebiD2. The matrix factorization operation is the main consumer of time. In LebiD2 we therefore, in a sense, trick the system: instead of computing the model parameters directly from the entire rating matrix, we first identify the main eigenvalues and use them to derive the model, which is far faster than operating directly on the large database. Here an approximation of the model is enough, so we do not need all the complex operations required by the MATLAB SVD. Faster processing increases our ability to retain online customers and to attract even more through customers advertising our product. As we will see in the next section, the results are also excellent in addressing the cold start problem.

Figure 1. Diagram comparing LebiD2 (green) and the MATLAB SVD (blue)

4 Experimental Evaluation

Our technique (LebiD2) has been tested against the methods discussed above: MoleTrust, TidalTrust, TrustWalker and our former technique LebiD1. We used RMSE [2] as the evaluation metric. The results are given in Table 1:

Table 1. Comparison of the RMSE of LebiD2 with previous trust based methods

Method RMSE results

MoleTrust 1.400
TidalTrust 1.200
TrustWalker 1.180
LebiD1 0.800
LebiD2 0.200

The diagrammatic representation is given in Figure 2.

Figure 2. Bar chart comparing LebiD2 with previous trust based methods

It can be observed that our model-trust based approach LebiD2 (the last bar in the figure) has a lower error rate than the other methods.

RMSE (Root Mean Squared Error)


The RMSE is a rating metric that is used to test the accuracy of the recommender
technique. Its formula is given by:

1 2
=RMSE  ∑ {i , j} ( Pi , j − ri , j )  (6)
n 

where n represents the total number of ratings over all users, P_{i,j} denotes the predicted rating for user i on item j, and r_{i,j} stands for the actual rating.
RMSE amplifies the contributions of the absolute errors between the predictions
and the true values.
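For concreteness, RMSE as in equation (6) can be computed with a few lines of Python (the pairs of predicted and actual ratings below are made up for illustration):

```python
import math

def rmse(predictions, actuals):
    """Root mean squared error over paired predicted/actual ratings."""
    n = len(predictions)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predictions, actuals)) / n)

print(rmse([3.8, 2.1, 4.6], [4.0, 2.0, 5.0]))  # ~0.26
```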

5 Conclusion

To sum up, we can confidently say that LebiD2 is indeed a very important step in addressing the cold start problem. Bringing together the model based approach and the trust based technique has proven very effective in solving the cold-start problem. The accuracy is also increased, as can be seen from the RMSE test. However, LebiD2 is time consuming since it is based on the model based architecture. We encourage researchers to focus on the relationship between the user and the social network [7] in order to design better recommenders that solve the cold start problem efficiently (0% error rate) within an acceptable time delay. Indeed, the social network is the next big thing in the world of recommenders. Hence, we encourage researchers to focus more on the social network relations of users.

References
[1] L. J. M. Dali and Q. Zhi-guang, "Cold Start Mastered: LebiD1", International Conference on Computational Science and Engineering (CSE2014), Chengdu, China, Dec 2014.
[2] X. Su and T. M. Khoshgoftaar, "A Survey of Collaborative Filtering Techniques", Advances in Artificial Intelligence, 2009:4:2-4:2, January 2009.
[3] J. Golbeck, “Computing and Applying Trust in Web-based Social Networks”, PhD thesis,
University of Maryland College Park, 2005.
[4] P. Massa, P. Avesani, “Trust-aware recommender systems”, RecSys 2007.
[5] M. Jamali, M. Ester, "TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation", KDD 2009.
[6] MovieLens data, http://www.grouplens.org/
[7] A. Sharma, D. Cosley, “Do social explanations work? Studying and modeling the effects of
social explanations in recommender systems”, WWW 2013.
Li-jun ZHANG*, Fei YU, Qing-bing JI
An Efficient Recovery Method of Encrypted Word
Document
Abstract: The recovery of encrypted Word documents has great practical importance, not only for decrypting a user's Word document when the password has been forgotten but also for evidence acquisition in judicial forensics. In this paper, we study the file structure, encryption principle and decryption key derivation approach of a Word document, and then present an efficient method of decrypting this kind of file. In a practical test, we find that our method is able to recover the original plaintext document rapidly (within an average time of 1.5 minutes), which nearly meets the practical requirement of real-time decryption of Word documents.

Keywords: Word document; rainbow table attack; rapid decryption; data forensics

1 Introduction

Microsoft Word, part of the Microsoft Office suite, is one of the most widely used word processors, and the security of document content and privacy protection have become a basic demand of document users. Word employs encryption to control access privileges: a person can only open and edit the content of a document after entering the correct password. This mechanism provides the necessary security guarantee for user data. However, with the extensive use of passwords in a variety of encryption applications, the case of a forgotten password appears frequently. Once the password of an important Word document is forgotten, no one can open the document to view its content, which usually brings a great loss to the document owner. On the other hand, encrypted Word documents also make criminal investigations by civil and national security departments more difficult. It is therefore of practical significance to study the recovery of encrypted Word documents.
The earliest method of cracking an encrypted Word document exploited a security vulnerability: an anonymous researcher [1] presented such an approach in 2004, modifying a document's encryption protection in order to obtain access privileges. Later, in 2005, Wu [2] pointed out the improper usage of the core RC4 encryption algorithm in Word documents, i.e., the same encryption key stream is used for different versions of a Word file. This implementation makes it possible to decrypt the content using the much weaker exclusive-or relation [3]. However, in practice it is quite difficult to decrypt the document content by exploiting these vulnerabilities, since different versions of the same file are needed, a requirement that is usually extremely hard to satisfy. Therefore, the existing tools for decrypting a Word file rely on an exhaustive search for the correct password. Two representative software packages are Advanced Office Password Recovery [4], released by the Elcomsoft company, and Password Recovery Kit [5] from the Passware company. But this brute-force mode is only valid for short passwords: it cannot recover slightly longer passwords within an acceptable time, since the space of candidate passwords becomes very large and results in a time-consuming search. It is therefore valuable to design a decryption approach independent of the password length. Chen [6] proposed an algorithm using a time-memory tradeoff, but the result is limited to finding the internal encryption key and does not give the final recovery of the plaintext document.

*Corresponding author: Li-jun ZHANG, Science and Technology on Communication Security Laboratory, Chengdu, 610041, China, E-mail: 41049250@qq.com
Fei YU, Qing-bing JI, Science and Technology on Communication Security Laboratory, Chengdu, 610041, China
In this paper, we first give a detailed analysis of the encryption principle and storage structure of a Word file, and then propose a decryption method for this kind of document that makes use of the rainbow table attack technique. The advantage of this approach is that the decryption can be accomplished in a deterministic time. In an actual test on a computer with an i5 dual-core CPU and 4 GB of memory, it can effectively recover the plaintext of an encrypted document in less than 2 minutes with a success rate exceeding 95%. Our result addresses the practical demands of data forensics and forgotten password retrieval.

2 The Analysis of File Structure and Encryption Principle

A Word file uses a special file format called the Microsoft compound document, a complex structured storage that contains a variety of metadata in different formats such as text, image, audio and video. We determined the explicit format of the Word document by analyzing the open source code of the "OpenOffice" software.

2.1 The Storage Structure of Word File

Logically, a compound document is a kind of file system composed of storages and streams, where storages are similar to directories and streams are similar to files in the Windows operating system. The root storage is thus equivalent to the root directory of the file system. Moreover, every stream is divided into several smaller data blocks called sectors for concrete data storage. The logical storage structure is shown in Figure 1.

Figure 1. The logical storage structure of Word file (a tree: the root storage contains storages and streams, and each storage in turn contains further streams)

Specifically, a typical Word document with images has the following five kinds of streams. The "Data" stream stores the image data and exists only when the file contains images. The "1Table" stream stores the content of the data tables, and "CompObj" stores the common object. "WordDocument" stores the text data, i.e. the actual text and formatting information, while "SummaryInformation" stores summary information for the whole document.
Physically, an entire Word file consists of a file header structure and the subsequent sectors. Every sector has the same size, which is recorded in the header. A sector's index, starting from 0, is called the sector identifier (SID for short). The sectors belonging to one stream can be out of order, and the corresponding SID array of a stream is called a sector chain (SID chain for short). The physical storage structure of a Word document is shown in Figure 2.

Figure 2. The physical storage structure of Word file

2.2 The Encryption Principle of Word File

Encryption Algorithm. Word documents of versions 97 to 2003 use the RC4 algorithm, which supports encryption keys of 40, 64 and 128 bits. RC4 is a byte-oriented stream cipher: it adopts an encryption key of variable length to derive its initial state and then generates a pseudorandom key stream to produce the ciphertext. In order to maintain compatibility, the default key length is 40 bits, whose encryption security is too weak under current computation capabilities [7].

Encryption Process. The 40-bit encryption key of the RC4 algorithm in a Word document is generated from a user's password and a salt value by computing several rounds of MD5 hashes. This process is called key derivation and is implemented by the function KDF(salt, password). After the encryption key is obtained, it is concatenated with the block number "bnum" according to the physical position of the data block in the Word document. This concatenated value is then used as input to the MD5 algorithm to derive the 16-byte initialization vector of the RC4 algorithm. Finally, the RC4 algorithm produces the key stream that encrypts every data block by an exclusive-or operation and outputs the corresponding ciphertext. The entire encryption process is shown in Figure 3.

Figure 3. The encryption process of Word file
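A schematic Python sketch of this per-block keying, following the description above (the KDF internals are abstracted away, and the helper names and the byte order of bnum are our assumptions, not the Microsoft specification), might look like:

```python
import hashlib

def rc4_keystream(key, length):
    """Plain RC4: key scheduling followed by keystream generation."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def encrypt_block(enc_key_5bytes, bnum, plaintext_block):
    """Per-block RC4 key = MD5(40-bit key || block number), as in Figure 3."""
    # the little-endian encoding of bnum is an assumption for illustration
    block_key = hashlib.md5(enc_key_5bytes + bnum.to_bytes(4, "little")).digest()
    ks = rc4_keystream(block_key, len(plaintext_block))
    return bytes(p ^ k for p, k in zip(plaintext_block, ks))

# Toy usage with a made-up 40-bit key and block number 1.
print(encrypt_block(b"\x01\x02\x03\x04\x05", 1, b"\x00" * 8).hex())
```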

Vulnerability of Encryption Mechanism. The existing Word document recovery methods try to find the user's password. However, the password search space grows exponentially with the password length, so the password cracking time becomes unacceptable when the document has a long and complicated password, for example one containing numbers, uppercase and lowercase letters and special characters with a total length greater than 8. In this case, recovering the Word file by password cracking is almost impossible. But from the encryption process we can see that the security strength is totally determined by the 5 bytes of key data from which the initialization vector of the RC4 algorithm is derived. So we can attempt to recover this 40-bit key, which has the remarkable advantage of being independent of the password length.

3 The Basics of Rainbow Table Attack

3.1 The Principle of Rainbow Table Attack

A rainbow table attack is a kind of time-memory tradeoff algorithm [8]. It uses data stored during an offline precomputation phase to reduce the analysis time of the online attack phase. For an encryption algorithm E and a known plaintext P0, the attack target is to obtain the encryption key k satisfying Ek(P0) = C0, given the ciphertext C0. The time-memory tradeoff attack is divided into two phases:

(1) Precomputation Phase. Select m starting points S1, S2,..., Sm from the key space K and define a reduce function R: C→K which maps the ciphertext space C to the key space K. Let f(k) = R(Ek(P0)); this function is applied iteratively to calculate one chain from every starting point Si, giving m chains in total.

In fact, in order to improve the coverage of the key space when these chains are generated, the reduction function R, and hence the function f, is different at every column of the chains. If the function in every column is marked with a different color, the chains look like a rainbow; hence they are named rainbow chains. After the calculation of the m chains is completed, only the starting and end point pairs (Si, Ei) are stored in the table.


(2) Online Attack Phase. Given the target ciphertext C0, we first apply the reduce function R to C0 to obtain a value Y1, then apply the function f iteratively from Y1 until the result matches some end point Ej, giving a computation chain Y1 → f(Y1) → f(f(Y1)) → ... → Ej. This matched chain is then rebuilt from the starting point Sj until we find the desired key. In practice, the correct key k = Xj(t-s) = f^(t-s-1)(Sj) may not exist in the matched chain.

This phenomenon occurs because the chain generated from Y1 coincides with a chain in the table, but the matched rainbow chain does not contain the correct key. This case is called a false alarm [9].
A rainbow table attack can be used to reverse one-way functions such as hash functions and encryption functions. In practice, it is mainly applied to plaintext recovery of hash values and to cracking unsalted encrypted passwords.
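To make the two phases concrete, the following Python sketch builds and searches a tiny rainbow table for a toy one-way function; the table sizes, the hash-based step function and all names are illustrative assumptions, not the parameters used in this paper.

```python
import hashlib

KEY_SPACE = 2 ** 16          # toy key space, far smaller than 2^40
CHAIN_LEN = 64
NUM_CHAINS = 512

def E(k):
    """Toy one-way function standing in for E_k(P0)."""
    return hashlib.md5(k.to_bytes(2, "big")).digest()[:8]

def reduce_fn(c, column):
    """Column-dependent reduce function mapping ciphertext back to key space."""
    return (int.from_bytes(c, "big") + column) % KEY_SPACE

def build_table():
    table = {}
    for start in range(NUM_CHAINS):
        k = start
        for col in range(CHAIN_LEN):
            k = reduce_fn(E(k), col)
        table[k] = start              # store only (end point -> start point)
    return table

def lookup(table, target_c):
    for s in range(CHAIN_LEN - 1, -1, -1):        # guess the column of the hit
        k = reduce_fn(target_c, s)
        for col in range(s + 1, CHAIN_LEN):
            k = reduce_fn(E(k), col)
        if k in table:                            # rebuild from the start point
            cand = table[k]
            for col in range(CHAIN_LEN):
                if E(cand) == target_c:
                    return cand                   # key found
                cand = reduce_fn(E(cand), col)
    return None                                   # not covered, or false alarm

table = build_table()
secret = 12345
print(lookup(table, E(secret)))   # prints 12345 if the key is covered, else None
```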

3.2 Rainbow Table Attack for Word document

The type of rainbow table used in a practical attack is almost always the perfect table [10], which means no two end points in the table are the same. In a rainbow table attack, there are several important parameters for constructing the tables, such as the number of rainbow tables n, the number of chains per table m and the chain length t. These parameters can be configured optimally by the following formulas, according to the success rate p, the storage space M and the key space N:

n = -ln(1-p)/2,
m = M/n,
t = -(N/M) ln(1-p).

In the process of constructing and searching rainbow tables, there are two essential functions, namely the keyed encryption function E and the reduce function R. For the concrete situation of Word file recovery, we must first define both functions. We denote by k the 40-bit decryption key which determines the RC4 initialization vector. According to the Word encryption process and the characteristics of the RC4 stream algorithm, the pseudorandom key stream ks is generated from k. Moreover, we find that the 8 consecutive plaintext bytes at offset 0x400 in a plaintext Word file are fixed to 0x00, and the corresponding block number is exactly 0x01. Since the ciphertext is c = p xor ks and all the plaintext bytes are 0x00, the ciphertext c is identical to the key stream ks. So we extract the 8 consecutive bytes at offset 0x400 as ks and establish the target one-way function Ek(p) = c, which maps the 40-bit key to the 64-bit pseudorandom key stream ks. The reduce function is designed as Ri(x) = (x + ti) mod 2^40, where x is a 64-bit number and ti is the column position in a rainbow table chain.
In Word document recovery, the key space is N = 2^40. If the desired success rate p is 99%, we can calculate the relevant parameters by the preceding formulas for constructing perfect rainbow tables. Concretely, we need to generate n = 4 tables, each containing m = 54,000,000 chains with chain length t = 36,000. The total storage space M of the 4 rainbow tables is about 3.2 GB.
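Under the assumptions spelled out above (all-zero plaintext at offset 0x400, block number 0x01), the Word-specific one-way function and reduce function can be sketched in Python as follows; this is our illustration, not the authors' implementation, and the byte orders are assumptions.

```python
import hashlib

def rc4_keystream(key, length):
    """Plain RC4 keystream (same routine as in the earlier encryption sketch)."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def word_oneway(key40):
    """E_k(p): map a 40-bit key to the first 8 key-stream bytes of block 0x01.

    Because the plaintext at offset 0x400 is all 0x00, these 8 bytes equal the
    ciphertext observed in the encrypted file (c = p xor ks = ks).
    """
    enc_key = key40.to_bytes(5, "big")                          # 40-bit key
    block_key = hashlib.md5(enc_key + (1).to_bytes(4, "little")).digest()
    return int.from_bytes(rc4_keystream(block_key, 8), "big")   # 64-bit value

def reduce_i(x, t_i):
    """R_i(x) = (x + t_i) mod 2^40: map a 64-bit value back into the key space."""
    return (x + t_i) % (1 << 40)

# One rainbow-chain step for the Word setting: key -> E(key) -> next key.
print(reduce_i(word_oneway(0x0102030405), t_i=7))
```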

4 Word Document Decryption

4.1 Decryption Key Acquisition

In order to decrypt a Word document, it is necessary to acquire the decryption key k. We read out the 8 ciphertext bytes at offset 0x400 in the Word file as the input of the rainbow table attack. Using the 4 rainbow tables, we can search for and recover the correct 40-bit decryption key.

4.2 Recover the Word Document by Decryption Key

After obtaining the correct decryption key, we start to decrypt the encrypted parts of the Word document and reconstruct the original plaintext file. By studying the file structure, we found that not all data blocks in the document are encrypted: only the 1Table, Data and WordDocument streams, whose data are text, images and so on, are encrypted. Therefore, when a Word document is decrypted, the sector numbers of the ciphertext data blocks are first listed according to the file structure, and then we extract the corresponding data and decrypt them, while the unencrypted parts remain unchanged, in order to recover the whole document. The sector numbers and sector chains can be obtained from the file header, which contains important parameters such as the file version, sector size, total number of sectors and starting offset of every sector. This compound document header is located exactly at the beginning of the file and is 512 bytes in size. Table 1 presents the parameters in this header that are needed for decryption.

Table 1. The parameters for decryption in the file header

Offset  Size  Description
0       8     fixed document file identifier (in hex): D0…E1
28      2     byte order identifier
30      2     size of a sector in power of 2
32      2     size of a short sector in power of 2
44      4     total sector number for sector allocation table
48      4     SID of first sector of directory stream
60      4     SID of first sector of short sector allocation table
64      4     total sector number for short sector allocation table
68      4     SID of first sector of master sector allocation table
72      4     total sector number for master sector allocation table
76      436   first part of master sector allocation table (109 SIDs)

The decryption process of a Word document is as follows (a code sketch of steps (3) and (5) is given after Figure 5).

(1) Restore the sector allocation table (SAT) according to the master sector allocation table (MSAT).
(2) Read out the starting sector number of every encrypted directory stream based on the file header of the document; denote it by DirSid.
(3) According to the starting sector number of every encrypted directory stream, recover the sector chain of every directory's data; we denote it by CSID. The process is described in Figure 4.
(4) Extract every data block according to the sector chain CSID.

Figure 4. Recover sector chain of every directory

(5) Derive the RC4 stream cipher decryption key from the 40-bit key and decrypt every encrypted data block.
(6) Modify the encryption identifier in the file header and add the unencrypted parts to reconstruct the plaintext Word document. This process is shown in Figure 5.

Figure 5. Reconstruct the plaintext Word document
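The following Python sketch illustrates steps (3) and (5): following a sector chain through the SAT and decrypting each extracted block with its per-block RC4 key. The end-of-chain marker, the block-number mapping and all names are illustrative assumptions, and decrypt_block is assumed to be the XOR routine from the earlier encryption sketch (RC4 decryption is the same operation as encryption); the compound-document details of a real file are more involved.

```python
END_OF_CHAIN = 0xFFFFFFFE   # assumed end-of-chain marker in the SAT

def recover_sector_chain(sat, dir_sid):
    """Step (3): follow the SAT from DirSid until the chain terminates."""
    chain, sid = [], dir_sid
    while sid != END_OF_CHAIN:
        chain.append(sid)
        sid = sat[sid]
    return chain

def decrypt_stream(sector_data, chain, key40, block_number_of):
    """Step (5): decrypt every sector of the chain with its per-block RC4 key.

    block_number_of maps a sector to the block number used in the key
    derivation (the exact mapping depends on the file layout), and
    decrypt_block mirrors encrypt_block from the earlier sketch.
    """
    return b"".join(decrypt_block(key40, block_number_of(sid), sector_data[sid])
                    for sid in chain)

# Toy usage of the chain recovery alone, mirroring the "2 4 5 3" chain of Figure 2:
sat = {2: 4, 4: 5, 5: 3, 3: END_OF_CHAIN}
print(recover_sector_chain(sat, 2))   # [2, 4, 5, 3]
```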

4.3 The Comparison of Attack Efficiency

In a general attack based on searching a fully precomputed table, every key needs 5 bytes and the whole 2^40 key space takes at least 5 TB of storage, which cannot be achieved on an ordinary computer. For the brute force attack method [9], the exhaustive search of the 40-bit key takes 40 days on average.
For our rainbow table attack, we constructed the specified 4 rainbow tables and implemented the key recovery algorithm in an experimental environment with an Intel i5 dual-core CPU at a 3.2 GHz clock, 4 GB of memory and the Windows XP operating system. We tested 100 encrypted Word document samples and successfully recovered 96 documents, each in less than 2 minutes. So this recovery method for encrypted Word files is very efficient in practice.

5 Conclusion

This paper proposed an efficient plaintext recovery method for encrypted Word documents. The method exploits the rainbow table attack technique to find the correct 40-bit decryption key and then reconstructs the original plaintext file according to the encryption principle and file structure of the Word document. The practical test shows that the method can decrypt Word file ciphertext efficiently, which could be very helpful for forgotten password retrieval and judicial data forensics.

Acknowledgment: This work is supported by the National Natural Science Foundation (No. 61309034). The authors also thank the reviewers for their useful suggestions to improve this paper.

References
[1] Anonymous hacker. The problem of encryption functionality in Word document. Available at
http://college.Sxhifhway.gov.cn/document/ 20040112161826088.htm.
[2] Wu H. The Misuse of RC4 in Microsoft Word and Excel. Institute for Infocomm Research, Singapore, 2005. Available at http://packets- torm.setnine.com.
[3] Trappe W. Introduction to cryptography with coding theory, Pearson Education press, 2006.
[4] Elcomsoft company software 2016. Advanced Office Password Recovery. An introduction is
available at http://www.elcomsoft.com, 2016.
[5] Passware company software. Passware Kit Enterprise and Passware Kit Forensic, available at
http://www.lostpassWord.com/index.htm, 2016.
[6] Chen Q, Fang H. Study on Word Document Fast Crack Based on Time-memory Trade-off
Algorithm. Computer Engineering, vol.36(16): 137-139, 2010.
[7] Team 509. The use of vulnerability of encryption algorithm in MS Word. Available at http://
rootyscx.net/documents/MSWord encrypt.pdf
[8] Hellman M, A cryptanalytic time-memory trade off. IEEE Transactions on Information Theory, IT,
1980, 6(4): 401-406.
[9] Avoine G., Junod P. and Oechslin P. Time-memory trade-offs: False alarm detection using
checkpoints. In Progress in Cryptology Indocrypt 2005, volume 3797 of Lecture Notes in
Computer Science, pp. 183-196, Springer-Verlag.
[10] Oechslin P.. Making a Faster Cryptanalytic Time-memory Trade-Off, Advances in Cryptology
proceedings of Crypto 2003, LNCS 2729, Springer-Verlag, 2007: 617-630.
Gao-yang LI, Kai WANG, Yu-kun ZENG, Guang-ri QUAN*
A Short Reads Alignment Algorithm Oriented
to Massive Data
Abstract: DNA sequencing technology has seen rapid development in recent years, and both the sequencing throughput and the read lengths are growing. In addition, new properties such as paired-end sequencing are emerging. It is therefore of great value to develop a sequence alignment algorithm for this new type of DNA data. In this paper, an alignment algorithm is proposed. Instead of the Smith-Waterman algorithm, a local alignment algorithm oriented to sparse mutations is used to accelerate seed extension. Moreover, instead of aligning short reads one by one, the software puts all reads with similar seeds together to accelerate seed location. We use human genome reference sequences and short sequencing data from GenBank (40x coverage) to evaluate our algorithm, and compare our work with Bowtie2 in terms of speed and accuracy. The results show that our algorithm has significant advantages in alignment speed and space overhead on large scale data.

Keywords: alignment tool; local alignment algorithm; Next-Generation Sequencing

1 Introduction

With the continuous development of Next-Generation Sequencing (NGS) technologies, the capabilities of gene sequencing have grown swiftly while the price is falling. At the beginning of 2015, Illumina launched the HiSeq 4000, which can sequence up to 1.5 T of nucleotide data in just 3.5 days (http://www.illumina.com). Such advances have greatly widened the use of NGS in clinical medicine and precision medicine. The raw data generated by a Next-Generation sequencer need to be aligned to the reference genomes before downstream analysis.
However, the alignment process is a time-consuming task with intensive computational requirements. In most alignment scenarios, the use of high-performance workstations or servers is a necessity (e.g. [1]). This increases the cost of sequencing and makes the clinical application of NGS more difficult. Thus, an alignment tool capable of handling massive raw data with cheaper computing resources is needed.

*Corresponding author: Guang-ri QUAN, School of Computer Science and Technology, Harbin
Institute of Technology (Weihai), Weihai, China, E-mail: grquan@hit.edu.cn
Gao-yang LI, Kai WANG, Yu-kun ZENG, School of Computer Science and Technology, Harbin Institute
of Technology (Weihai), Weihai, China

2 Methods

Most current aligners follow the seed-and-extend paradigm (e.g. BLAST [2]). First, short sub-strings (seeds) of the reads are aligned exactly, or with a few mismatches, to some reference genome regions. In that step, BWT indexes [3] (e.g. SOAP3-dp [4], SOAP3 [5, 6], BigBWA [7]) and hash indexes (e.g. MOSAIK [8], GMAP [9]) are used to locate candidate positions. Then a local or global alignment algorithm is used to complete the alignment at the candidate locations and generate the final results. Dynamic programming methods such as the Needleman-Wunsch algorithm and the Smith-Waterman algorithm are used as the local or global alignment algorithm [10, 11]. These methods can always find the best alignment of two nucleotide or protein sequences.

2.1 A local alignment method oriented to sparse mutations

Assuming the lengths of the two strings are n, the computational complexity of dynamic programming methods is O(n^2), which is relatively high when n is large. But in resequencing, in most cases the two DNA sequences are similar, and this similarity can be used to make the alignment complexity lower than O(n^2).
Most mutation sites are short strings isolated from each other, like islands in the sea. In this paper, a k bps string containing mutations is considered isolated if the strings before and after it are mutation-free and longer than k bps. Under such circumstances, the mutation-free strings are long enough to locate the isolated mutated strings. With the method in this paper, the computational complexity of alignment depends on the length of the mutated strings: for two DNA sequences of n bps, if the longest isolated mutated string is p bps long, the complexity is essentially O(np). The mutations are sparse and the majority of mutated strings are shorter than a specific constant; therefore, the algorithm can usually achieve good performance.
For two similar strings S1, S2 and a natural number k, we define the function f:

f(S1, S2, k) := [(S1 « k) ⊕ S2] & (S1 ⊕ S2) & [(S2 « k) ⊕ S1] (1)

In function (1), "«" means a left shift by k, "&" means bitwise AND, and "⊕" means bitwise XOR. For a natural number k, the equation

f(S1, S2, k) = 0 (2)

in most cases shows that the two strings have no mutated sub-strings longer than k. We then define Md as the smallest k satisfying equation (2):

Md := Min(k) (3)

such that

f(S1, S2, k) = 0 (k ∈ N) (4)

Figure 1. When an SNP mutation occurs (T8), the smallest k that makes f(S1, S2, k) = 0 is 1. Thus, the Md value for an SNP is 1.

Figure 2. When an insertion mutation occurs (T8), the smallest k that makes f(S1, S2, k) = 0 is 1. Thus, the Md value for a one-nucleotide insertion is 1.

Md is used as a measure of the mutation degree: the more similar the two sequences, the smaller the Md value. Figures 1 and 2 describe how the f function works. Figure 1 shows a 1 base pair SNP mutation in which the nucleotide A8 (red) in sequence S1 is mutated into T8 in sequence S2. When k is 1, all positions of f(S1, S2, k) equal zero, and the Md value for S1 and S2 is 1. In the figures, {Ai} denote the original bases of the sequences, {Ti} denote mutated bases, {Ri} denote intermediate results, {Xi} denote intermediate results of no use, and {0i} denote zeros. The raw sequences are first shifted left by one nucleotide, the intermediate results are XORed with the raw data, and finally, after the bitwise AND operation, the result becomes a string of all zeros. In addition, when S1 equals S2, Md is 0. If the length of the longest mutated substring is k, the value of Md will not exceed k, and in most cases will equal k; the value will be far larger than k when the two sequences under comparison are very different. Thus, Md can effectively measure the degree of mutation. Figure 2 shows a 3 base pair insertion mutation in which nucleotides (red) are inserted into sequence S2 before A8; Md is calculated to be 3, and this value increases with the length of the inserted or deleted sequence. In our program, the four nucleotide bases A, C, G and T are coded with the binary numbers 00, 01, 10 and 11, respectively. Thus, 32 nucleotides can be stored together in a 64-bit integer and processed simultaneously.
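As an illustration only, the following Python sketch transcribes equation (1) literally on 2-bit-encoded sequences (so a shift of k bases is a shift of 2k bits); the encoding helper and the Md search are our own assumptions about the implementation, which the paper does not spell out in full, and the exact values of Md for particular mutations depend on those details.

```python
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def encode(seq):
    """Pack a nucleotide string into an integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value

def f(s1, s2, k, length):
    """Literal transcription of equation (1); a shift of k bases is 2k bits."""
    mask = (1 << (2 * length)) - 1                 # keep results to the sequence width
    return ((s1 << 2 * k) ^ s2) & (s1 ^ s2) & ((s2 << 2 * k) ^ s1) & mask

def md(seq1, seq2):
    """Smallest k with f(S1, S2, k) = 0, as in equations (3) and (4)."""
    length = len(seq1)
    s1, s2 = encode(seq1), encode(seq2)
    for k in range(length + 1):
        if f(s1, s2, k, length) == 0:
            return k
    return None

print(md("AAAAAAAA", "AAAAAAAA"))   # 0: identical sequences
```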

2.2 A seed locating algorithm based on sorting

The reference genomes are very large, so locating a sub-string in them is like looking for a needle in a haystack, and nearly all aligners build indexes for the reference genomes. But now both the volume of raw short reads and the reference genome are large. Many short reads share the same seeds, and it takes a great deal of time to handle a seed that has too many hits in the reference genome.
A solution to this problem is to put all reads with the same seeds together and build an index for the short reads: when hits are found for one seed, they are found for all identical seeds at once.
Before the alignment, we sort all reads in alphabetical order. Once they are sorted, adjacent reads share a similar prefix, which is used as the seed. Meanwhile, all reference genome positions are sorted in alphabetical order as well. Then, according to the differences between prefixes, these positions are separated into blocks and stored in a quad tree. When a read is aligned, all hits for a seed can be accessed directly. Moreover, the seed length is determined flexibly and is increased when the hit number exceeds a preset threshold. This guarantees that all possible genome positions are searched while a read is still aligned at a bounded computational cost. Finally, we extend the seeds to obtain local alignments.
Sorting details. All reference genome positions and reads are sorted in alphabetical order before alignment. A bucket sort algorithm is used to make sorting a linear-time process. Since reads and reference positions are both sorted, when a read is greater than a reference record it is also greater than all reference records before that record, and vice versa; the same holds when it is smaller. Therefore, once a single traversal of both the sorted read array and the sorted reference array is completed, all exact matches will have been discovered.
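The traversal just described is essentially a sorted-merge join of the two arrays; the following Python sketch shows the idea on toy data (the structure of the real read and reference records is our simplification, and a seed with multiple reference hits would need grouping, which is omitted here):

```python
def merge_exact_matches(sorted_reads, sorted_ref):
    """One pass over both sorted arrays, reporting (read id, ref position)
    pairs whose seed strings are identical."""
    matches, i, j = [], 0, 0
    while i < len(sorted_reads) and j < len(sorted_ref):
        read_seed, read_id = sorted_reads[i]
        ref_seed, ref_pos = sorted_ref[j]
        if read_seed == ref_seed:
            matches.append((read_id, ref_pos))
            i += 1                      # reads sharing this seed reuse the hit
        elif read_seed < ref_seed:
            i += 1
        else:
            j += 1
    return matches

reads = sorted([("ACGT", 0), ("ACGT", 1), ("TTAA", 2)])
ref = sorted([("ACGT", 100), ("GGGA", 250), ("TTAA", 900)])
print(merge_exact_matches(reads, ref))  # [(0, 100), (1, 100), (2, 900)]
```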
Variable length seed details. For each read (one mate of a pair), the first 12 bps are used as a seed. If the seed hit number exceeds the computational capacity, the seed is extended by one bp at a time. The extension continues until the hit number is below a preset value (5120 by default) or the seed length exceeds 20 bps. A quad tree is used to store the hit positions for variable length seeds; the four child nodes of a parent node represent the four possibilities (A, C, G and T) of one extra base. Because mutations and errors may exist within the seeds, if the alignment of a read fails with the first seed, two other 12 base pair substrings are used as new seeds.

2.3 Pairing Strategy

In the mapping process, we output all possible results for all reads. These results are sorted by the original read order. If two results, one from mate one and one from mate two, are within the restricted distance, a pair is found. When more than one properly paired result is found for the two paired reads, the result with the least edit distance is selected. If no pair is found for a spot, the read is re-mapped using other parts as seeds, repeating all the above steps.

3 Result

Based on the ideas above, we present our short-read aligner, called MASS, to accelerate the alignment.
The time needed to locate seeds in the reference genomes benefits from an economy of scale: the more data, the lower the average time consumption. When the read number exceeds 100 M, the time needed for seeding and mapping is less than 30% of the total time, while 45% of the total time is used for sorting reads and the remaining 25% for generating the BAM results (Table 1). In our case, the sorting speed is mainly limited by I/O speed rather than the CPU, so MASS is more suitable for deployment on a cheap computer than on one with many CPU cores.
In order to accelerate sorting, segment sequences are extracted and compressed from the original FASTQ file (Table 1). MASS would be even faster if the segment sequences were stored as raw data in independent files.

Table 1. Detailed analysis of time usage

Step                        Time    Bottleneck^a
separating the SEQ fields   2362    Hard Disk
Sorting reads               3439    Hard Disk
Mapping                     3670    CPU and Hard Disk
Generating BAM result       3833    CPU
Total                       13322   -

a. The time is analyzed by the internal timer of MASS. The speed bottleneck is analyzed by KSysguard.

3.1 Result on real data

To assess the performance of MASS, we compared MASS with Bowtie2 [12]. The test data set is SRR1819827 and the reference genome is GRCh38. As the speed of MASS is influenced by the data volume, we tested 100% of the raw data, 200% of the raw data (58.8 Gbp) and 400% of the raw data (117.6 Gbp), respectively. Duplicating the data does not influence the final result, due to the independence of each spot. The hardware environment used for the test is a PC with an Intel E8400 (3 GHz dual-core CPU), 4 GB of memory and two hard disks (forming a RAID0, 200 MB/s read speed). All tests are run on two threads to make full use of the dual-core CPU. Additionally, while Bowtie2 outputs SAM format, MASS stores the result in BAM format directly.

Figure 3. Comparison between Bowtie2 and MASS using real reads on peak memory and speed.

When the data volume is over 100 Gbp, MASS is nearly ten times faster than Bowtie2 (Table 2). At the same time, MASS requires a comparatively small amount of memory, which means it can be deployed on nearly any computer (Figure 3). Up to 782 Gbp of data can be aligned per day, which is enough to handle the data generated by a HiSeq 4000. The two aligners show a similar mapping rate, and both fail to pair about 1% of spots properly.

Table 2. Comparison between Bowtie2 and MASS

Volume (Gbp)  Aligner  Paired (%)  Aligned (%)  MEM^a (GB)  Gbp/h
29.4          Bowtie2  98.94       99.60        3.5         3.23
58.8          Bowtie2  98.94       99.60        3.5         3.23
88.2          Bowtie2  98.94       99.60        3.5         3.23
117.6         Bowtie2  98.94       99.60        3.5         3.23
29.4          MASS     98.94       99.64        1.38        23.5
58.8          MASS     98.94       99.64        1.41        29.1
88.2          MASS     98.94       99.64        1.43        31.7
117.6         MASS     98.94       99.64        1.46        32.6

a. MEM represents the peak memory used during the alignment.

3.2 Result on simulated data

To test the accuracy and sensitivity of MASS, we applied the short read simulator wgsim (https://github.com/samtools/) to generate 2 sets of 1 M paired-end reads with lengths of 100 and 300 bp (Figure 4).

Figure 4. Correct read alignments as a function of incorrect read alignments under different mapping
quality cutoff.

The mutation rate is 0.2% and 15% of the mutations are INDELs. An alignment is considered correct when the positions of both mates are within 20 bp of the simulated position; otherwise it is counted as an erroneous alignment.
For both Bowtie2 and MASS, almost all simulated reads can be aligned, and the two aligners share similar accuracy and sensitivity. Some reads can be aligned to different reference positions that have the same edit distance; in such circumstances, the aligners cannot determine which position is better, so a random position is reported as the final result. Accordingly, the prediction error rate is mainly influenced by the read length: the longer the read, the lower the incorrect alignment rate. Nearly 1.5% of reads are inaccurately mapped when the read length is 100 bps, and that value decreases to 0.6% and 0.4% when the read length is 300 bps and 500 bps, respectively.

4 Discussion

In this paper, an alignment algorithm oriented to massive NGS short reads has been proposed. By building the query index from sorts of both the reference sequences and the read sequences at the same time, we avoid loading and analyzing the reference sequence repeatedly. Moreover, by avoiding massive random access to the hard disk and memory, this method accelerates the seed-locating process of aligning a massive number of short reads to a large reference sequence. To the best of our knowledge, few methods combine shift and XOR operations to handle the local alignment problem. By making full use of the long word lengths of modern computers, more nucleotides can be compared in parallel. Furthermore, when the mutations are sparse, the cost of alignment increases linearly with the sequence length, which accelerates the local alignment process.
The combination of the methods mentioned above improves the efficiency of alignment and therefore facilitates individual human DNA sequencing and precision medicine. The method also has enlightening value for the query and analysis of other kinds of data, such as textual data. Future research will focus on optimizing the model, striking a better balance between speed and accuracy, and developing new methods for changing data types and hardware environments.

References
[1] J. Gonzalez-Dominguez, Y. Liu, and B. Schmidt, “Parallel and Scalable Short-Read Alignment on
Multi-Core Clusters Using UPC+,” PLoS One, vol. 11, p. e0145490, 2016.
[2] S.F. Altschul, W. Gish, W. Miller, E.W. Myers and D.J. Lipman. “Basic local alignment search
tool.” Journal of Molecular Biology 215.3, 1990, pp. 403-410.

[3] T.W. Lam, R. Li, A. Tam, S. Wong, E. Wu and S.M. Yiu, “High Throughput Short Read Alignment
via Bi-directional BWT.” IEEE International Conference on Bioinformatics and Biomedicine IEEE
Computer Society, 2009, pp. 31-36.
[4] R. Luo, T. Wong, J. Zhu, C.M. Liu and X. Zhu. “Soap3-dp: fast, accurate and sensitive gpu-based
short read aligner.” Plos One, 8(5), 2013, pp. 59-59.
[5] C.M. Liu, T. Wong, E. Wu, R. Luo and S.M. Yiu. “SOAP3: ultra-fast GPU-based parallel alignment
tool for short reads.” Bioinformatics 28.6, 2012, pp. 878-9.
[6] C.M. Liu, T.W. Lam and T. Wong. “SOAP3: GPU-based Compressed Indexing and Ultra-fast
Parallel Alignment of Short Reads.” Workshop on Massive Data Algorithmics Massive, 2012.
[7] J.M. Abuín, J.C. Pichel, T.F. Pena, J. Amigo. “BigBWA: Approaching the Burrows-Wheeler Aligner
to Big Data Technologies.” Bioinformatics 31.24, 2015, pp. 4003-4005.
[8] W.P. Lee, M.P. Stromberg, A. Ward, C. Stewart and E.P. Garrison “MOSAIK: A Hash-Based
Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping.” Plos One 9.3, 2014,
pp. e90581-e90581.
[9] T. D. Wu, J. Reeder, M. Lawrence, G. Becker, and M. J. Brauer, “GMAP and GSNAP for Genomic
Sequence Alignment: Enhancements to Speed, Accuracy, and Functionality.” Statistical
Genomics, 2016.
[10] Y. Liao, G.K. Smyth, and W. Shi. “The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote.” Nucleic Acids Research 41.10, 2013, pp. 89-94.
[11] T.D. Wu, J. Reeder, M. Lawrence, G. Becker and M.J. Brauer “GMAP and GSNAP for Genomic
Sequence Alignment:
[12] Langmead, Ben, and S. L. Salzberg. “Fast gapped-read alignment with Bowtie 2.” Nature
Methods 9.4, 2012, pp. 357-9.
Yan-nan SONG*, Shi LIU, Chun-yan ZHANG, Wei JI, Ji QI
Based on the Mobile Terminal and Clearing System
for Real-time Monitoring of the AD Exposure
Abstract: Considering the limitations of traditional pricing strategies and settlement patterns, this paper studies the system implementation of preferential billing policies, real-time billing, real-time settlement, and system security. Based on the distributed message mechanism system Kafka, applied to an advertising monitoring and billing system for mobile terminals, a clearing system for real-time monitoring of advertising exposure is researched and designed. The system mainly includes four parts: an advertising monitoring subsystem, a real-time message feedback subsystem, a real-time billing subsystem, and a real-time settlement subsystem. This paper separates real-time billing and settlement from the advertising delivery and control systems and connects them to the bank settlement system and the advertising system as an independent application system. Not only does the system provide real-time monitoring and billing for advertising delivery management, but it also guarantees efficient and secure settlement services. The experimental results show that the system can effectively realize real-time monitoring and settlement of advertising exposure, and satisfies the requirements of high throughput, low latency and high security.

Keywords: message mechanism; Kafka; mobile terminals; advertising delivery system; real-time monitoring and settlement

1 Introduction

As the Internet becomes more popular, so too does advertising on the Internet [1]. Hawkins, the famous American media researcher, defines Internet ads as electronic ads, namely ads spread to consumers by an electronic information service. A narrow definition of Internet advertising can be summarized as: an information dissemination activity in which the advertiser pays to display goods and services by sending messages to a target audience using the Internet, other media and information transmission networks. In this setting, advertising is displayed on a mobile terminal, the mobile terminal sends a request to the server, and the server side performs advertising settlement and monitoring. Since the mobile terminals and the server are managed in a distributed manner, communication between them must follow an RPC protocol [2].

*Corresponding author: Yan-nan SONG, School of Computing Science, Inner Mongolian University,
Hohhot, China, 512849571@qq.com
Shi LIU, Chun-yan ZHANG, Wei JI, Ji QI, School of Computing Science, Inner Mongolian University,
Hohhot, China

2 Design principle and requirement analysis

2.1 Design principle and calculation method

2.1.1 Design principles-Message mechanism


For the producer-consumer model, the components are divided into producers, message queues and consumers. The basic principle of the message mechanism is that when consumers request a message, the message is stored in the message queue, and the message loop continuously takes the first message from the message queue and dispatches it; after receiving a message, the receiving side makes different decisions depending on the message type. Traditional business processes use a serial mode: if an intermediate link is slow, the next link has to wait, and the efficiency of the system becomes very low [7]. Compared with traditional business processes, a message-based business is discrete, and it handles the messages stored in the message queue by dividing the work into two different processes. The producer puts a message into the message queue, and consumers only need to fetch data from the queue for consumption. This realizes the decoupling of producers and consumers and is also called an asynchronous mechanism [5]. A synchronous architecture easily leads to system crashes, whereas an asynchronous architecture can effectively satisfy throughput requirements.
This system uses a producer-consumer model, establishes the producer, the consumer and the bank settlement database, simulates the whole business process, and uses a message mechanism as the way of processing messages. When a large number of request messages appears, they are first placed in the message queue and then stored on the hard disk; when the message volume increases, it is only necessary to add hard disks. During this process, the producer autonomously determines the capability of reading and handling messages from the hard disk [4]. At the same time, the system supports concurrent processing. When data processing is relatively slow, messages can be cached by the message queue; otherwise, the producer does not have to wait for the consumers' processing speed and can go directly on to processing data. This achieves the high throughput requirement.
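A minimal, library-free Python sketch of this decoupling, using a bounded in-memory queue in place of Kafka (all names and sizes here are illustrative, not the system's actual implementation):

```python
import queue
import threading

message_queue = queue.Queue(maxsize=1000)   # bounded buffer between the two sides

def producer(n_messages):
    """Simulates exposure-monitoring clients pushing request messages."""
    for i in range(n_messages):
        message_queue.put({"ad_id": i % 5, "event": "exposure"})
    message_queue.put(None)                  # sentinel: no more messages

def consumer():
    """Simulates the billing side pulling messages at its own pace."""
    billed = 0
    while True:
        msg = message_queue.get()
        if msg is None:
            break
        billed += 1                          # real-time billing would happen here
    print("billed", billed, "exposures")

t_prod = threading.Thread(target=producer, args=(10_000,))
t_cons = threading.Thread(target=consumer)
t_prod.start(); t_cons.start()
t_prod.join(); t_cons.join()                 # prints: billed 10000 exposures
```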

2.1.2 Billing Method


Internet advertising charging modes are divided into CPA and CPM. CPA pricing is based on the actual performance of the advertising, i.e. billing only for effective responses, without limiting the amount of advertising shown. CPA is a common form on the network: when a user clicks on an ad, the site owner receives the corresponding revenue. CPM (cost per thousand) refers to the cost of showing an ad one thousand times. Under CPM, charges are set according to the popularity of the site (i.e. its visitors), divided into price levels at a fixed rate. International practice is to charge from 5 to 200 US dollars per CPM. CPA billing is used both abroad and domestically [6].

In this system, CPM is used. The system sets a daily maximum limit advertising
fees, and advertising billing and settlement banks are two processes. For advertising
fee deductions, the system determines a settlement point, starting from the point
of settlement, when the charge amount reaches the maximum daily advertising
spending amount, the amount of advertising that is served to reach goals, no longer
runs. Wait until the settlement point, the settlement amount will be transferred
to the appropriate bank account, therefore a regular design tasks across days to
process.
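The daily-cap logic described above can be sketched as follows in Python; the per-exposure price, the cap and the class and method names are made-up values for illustration, not the system's actual configuration:

```python
class AdBudget:
    """Tracks one ad's spend between settlement points and enforces the daily cap."""

    def __init__(self, cpm_price, daily_cap):
        self.price_per_exposure = cpm_price / 1000.0   # CPM is priced per 1000 exposures
        self.daily_cap = daily_cap
        self.spent_today = 0.0

    def record_exposure(self):
        """Returns True if the ad may keep running, False once the cap is reached."""
        if self.spent_today >= self.daily_cap:
            return False                               # stop serving this ad
        self.spent_today += self.price_per_exposure
        return self.spent_today < self.daily_cap

    def settle(self):
        """At the settlement point, transfer the accumulated amount and reset."""
        amount, self.spent_today = self.spent_today, 0.0
        return amount                                  # amount sent to the bank account

budget = AdBudget(cpm_price=20.0, daily_cap=50.0)
exposures = 0
while budget.record_exposure():
    exposures += 1
print(exposures, budget.settle())   # roughly 2500 exposures, about 50.0 settled
```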

2.2 The requirements analysis

Many companies have begun to use Internet advertising to increase profits. However, traditional advertising value assessment is not always satisfactory, and investors cannot get a relevant and convincing assessment of an ad's worth. Moreover, publishers also hope to estimate the value of advertising more easily when negotiating with sponsors.
A real-time advertising monitoring and billing system can be adapted to serve high-frequency Internet advertising billing; it separates real-time billing and settlement from the advertising management and control systems as a stand-alone application and connects the bank account with the advertising system to provide safe and efficient real-time monitoring, billing and settlement services. For the development of Internet advertising, this has important theoretical significance.

3 Running environment of system

3.1 Software development environment

This system is developed in the Java application development language, the operating system environment is Windows 8.1, the database is MySQL, and the development environment is the Eclipse IDE.
The construction of the environment mainly uses the Kafka system. Kafka is a distributed, partitioned and replicated message system. It provides common message system functionality but also has its own unique design. Kafka organizes messages by topic: programs that publish messages to a Kafka topic are known as producers, and subscribers are called consumers. Producers send messages over the network to a Kafka cluster, and the cluster provides the messages to consumers. Clients and servers communicate via the TCP protocol. Kafka provides a Java client and supports multiple languages. Kafka relies on ZooKeeper for service coordination, so ZooKeeper is configured first to enable a single-instance ZooKeeper service [3].

3.2 The flow chart of the system

Figure 1 shows the model of Kafka. The producer-consumer model solves the strong coupling problem between producers and consumers by means of a container: there is no direct communication between producers and consumers; they communicate through a message queue. After producers produce data, they need not wait for the consumers to process it but simply put it into the message queue, and consumers do not ask the producers for data but take it directly from the message queue. The message queue is equivalent to a buffer that balances the processing capacities of producers and consumers. The Kafka message queue is used to decouple producers and consumers.

Figure 1. The model of Kafka.

3.3 System function design

3.3.1 The messaging model Kafka Producer


The Kafka producer sends messages asynchronously, and batch transmission can very effectively increase the transmission rate. The Kafka producer's asynchronous transmission mode allows batching: messages are first cached in memory and then sent out in bulk in a single request. This policy is configurable; for example, one can specify that messages are sent once the buffer reaches a certain number, or after a fixed caching time (such as sending every 100 messages, or every 5 seconds). This strategy significantly reduces the number of I/O operations on the server side. In this system, we simulate the Internet advertising message mechanism and realize the high throughput of Internet advertising through this programming model.
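As a hedged illustration, assuming the kafka-python client is available, a producer configured for batched asynchronous sending might look roughly like this (broker address, topic name and the batching values are placeholders, not the system's real settings):

```python
import json
from kafka import KafkaProducer  # assumes the kafka-python package

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",          # placeholder broker address
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    batch_size=16384,                            # bytes buffered per partition batch
    linger_ms=5000,                              # wait up to 5 s to fill a batch
)

for i in range(100):
    # exposure events are sent asynchronously; batching groups them per request
    producer.send("ad-exposure", {"ad_id": i % 5, "event": "exposure"})

producer.flush()   # force any remaining buffered messages out before exiting
```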

3.3.2 The Kafka Consumer message consumption model


The first design question in Kafka is whether the consumer should pull messages from the message queue or the message queue should push messages to the consumer, that is, pull versus push. In this regard, Kafka follows the traditional design of most common message systems: the producer pushes messages to the message queue, and the consumer pulls messages from the message queue [8].
Some messaging systems, such as Scribe and Apache Flume, use push mode, where messages are pushed to the downstream consumer. This has advantages and disadvantages: the push rate is determined by the message queue, which cannot adapt well to consumers with different consumption rates, since the messaging system tries to deliver messages at the fastest possible rate. Kafka ultimately chose the traditional pull model. In this system, we simulate the Internet advertising consumer model through programming and fulfil the demand for real-time clearing.
The benefit of the pull model is that the consumer can decide whether to pull data from the message queue in bulk. In push mode, the queue must decide immediately whether to push each message or to buffer it and push in bulk, without knowing the consumption capacity and consumption policy of the downstream consumer. If it pushes at a lower rate to keep the consumer from collapsing, it will probably push too few messages at a time and waste capacity. In the pull mode, the consumer can decide these strategies according to its own consumption capacity.
The pull mode has one drawback: if there is nothing to consume in the message queue, the consumer will keep polling in a loop until a new message arrives. To avoid this situation, Kafka provides a parameter that lets the consumer block until a new message arrives (or, of course, until the number of buffered messages reaches a certain amount, so that they can be fetched in bulk).
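As a minimal illustrative sketch (again not the authors' implementation), a pull-based consumer with the standard Kafka Java client might look as follows; the group id, topic name and fetch thresholds are assumed values.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ClearingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");    // assumed broker address
        props.put("group.id", "realtime-clearing");          // hypothetical consumer group
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        // Wait until at least 1 KB is available or 500 ms have passed,
        // instead of busy-polling an empty queue.
        props.put("fetch.min.bytes", "1024");
        props.put("fetch.max.wait.ms", "500");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("ad-exposure"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
                for (ConsumerRecord<String, String> record : records) {
                    // Real-time billing logic for one exposure event would go here.
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```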

3.3.3 The characteristics of system


a) Integrity: the system design is functionally complete and simulates the whole Internet advertising process from producers to consumers.
b) Flexibility: the two modules can run independently and be combined so that producers publish messages and consumers consume them, giving the system good flexibility.
c) Openness: the solution can be applied to other similar requirements, so the system has good openness.
d) Practicality: the algorithms and design used by the system are practical and reliable, and adapt well to the operating environment.

4 Experiment and result analysis

4.1 Design of experiment

Figure 2 shows the schematic structure of the advertising system. When an ad is exposed, the monitoring client sends a request to the distributed exposure-monitoring service; the service logs the request message centrally to the unified storage system, a Kafka distributed message cluster. The real-time billing service then consumes the exposure messages, performs real-time billing, and persists the exposure data into a distributed database cluster. The system divides business processing into hierarchical modules and follows an SOA design approach, so that a distributed, clustered deployment can meet a monitoring requirement of one hundred million PV per day while remaining easy to extend and highly reliable [9].

Figure 2. The model of system.

To stress the system, we measure the query rate per second, i.e. QPS. QPS measures how much traffic a particular query server handles within a specified time, i.e. its maximum throughput. The final result is that the system can respond to 20,000 requests per second [10].
1) Experimental target: this experiment studies the throughput efficiency of the real-time monitoring and clearing system under advertising traffic, its concurrency tolerance limits, capacity forecasting and load, and draws the corresponding analysis.
2) Expected results: through analysis of the system structure, directory-traversal detection, detection of hidden backup files and leaked sensitive files, CGI vulnerability scanning, user name and password guessing, cross-site scripting analysis, SQL injection and other server vulnerability tests, we identify the security risks present in the system.

4.2 Experimental results and analysis

4.2.1 Experimental Results


a) System QPS/TPS: QPS/TPS originally means the number of queries/transactions the system can handle per second, i.e. its throughput. In applications, we care more about how many requests each application can process per second, which is an important indicator of system performance. TPS = number of concurrent requests / average response time. QPS can be obtained from access-log statistics by dividing the PV of a period by the length of that period, or measured with performance-testing tools; statistics are usually taken at the PV peak. The single-node throughput measured for this system is:
QPS = 900,034 requests / 60 seconds ≈ 15,000
b) Response time: the response time (RT) is the elapsed time from the moment the client issues a request to the moment it receives the response returned by the server. It is composed of three parts: request time, network latency and server processing time. Under the laboratory network conditions, the system achieves:
RT = 60 s / 15,000 + 0.872 ms + 0.134 ms = 5.006 ms
c) Load and CPU resource consumption: the system load average is defined as the mean number of processes in the run queue over a particular time interval. A process is placed in the run queue if it meets the following conditions:
–– it is not waiting for the result of an I/O operation;
–– it has not voluntarily entered a wait state (i.e. it has not called 'wait');
–– it has not been stopped (e.g. it is not waiting for termination).
The ideal load value is roughly the number of CPUs × the number of cores × 0.7; if the load exceeds this value over the long term, the system needs attention. The test environment is a CentOS 7 virtual machine, and the system load under normal traffic is shown in Figure 3.

Figure 3. Environmental testing of the system.



4.2.2 Analysis of experimental results


For our system design, we naturally need to know, before going online, how much traffic it can receive; that is, we want to estimate the maximum daily PV the system is able to support. But how should this be assessed: do we have to construct a maximum-daily-PV scenario and test it? In fact, from existing experience and data, the relationship between peak QPS and daily PV can be summarized: the daily QPS and PV curves basically have the same shape, so through mathematical modeling we can obtain the peak QPS of each server:
peak QPS = ((total PV × 80%) / (24 × 60 × 60 × 40%)) / number of servers
The parameters 80% and 40% are not fixed; the formula expresses that 80% of the total PV is generated in 40% of the time (roughly 10 hours), and different scenarios use different parameters. Thus, by stress-testing the application we can obtain its peak QPS and then, according to the formula, calculate the daily PV peak that this QPS corresponds to, which gives a capacity prediction:
estimated PV = measured QPS × (24 × 60 × 60 × time percentage) / 0.8 × number of machines [11]
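A small illustrative calculation of this capacity estimate (with assumed numbers, not figures from the experiment) is sketched below.

```java
public class CapacityEstimate {
    public static void main(String[] args) {
        double measuredQps = 15000;   // peak QPS measured per machine (example value)
        double timeShare   = 0.40;    // fraction of the day in which most traffic arrives
        double pvShare     = 0.80;    // fraction of total PV generated in that window
        int machines       = 4;       // assumed cluster size

        // estimated daily PV = QPS * (seconds in the busy window) / pvShare * machines
        double estimatedPv = measuredQps * (24 * 60 * 60 * timeShare) / pvShare * machines;
        System.out.printf("Estimated daily PV capacity: %.0f%n", estimatedPv);
    }
}
```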
At first sight, CPU utilization and load are both statistics of current CPU usage, but the two indicators are in fact quite different. CPU utilization is easy to understand: it is the proportion of CPU time that is allocated and in use. CPU load, in contrast, is based on the average length of the queue of tasks waiting for CPU processing over a period of time [12]. Under high load this indicator has more reference value than CPU utilization, because in high-load periods the CPU utilization is basically close to 100% and cannot reflect how heavily the machine is loaded. The length of the task queue, by comparison, reflects how serious the current system load is and whether it is still controllable. For a single-processor system:
when the load is less than 1, the system runs easily;
when the load is greater than 1, tasks are waiting for CPU processing, and CPU utilization cannot distinguish between load = 1 and load > 1.

5 Conclusions

In this experiment, we programmed a simulation model of Internet advertising producers and consumers and, combined with the Kafka technology, met the requirements of high-throughput advertising delivery and real-time billing. The result has a certain reference value for the development model of Internet advertising and for sponsors, promoters and media.
The disadvantage of this system is that its security has not been verified sufficiently. In follow-up work we will continue the research in depth, adding denial of service, XSS attacks, CSRF attacks, message-channel attacks, parameter-pollution attacks, SQL injection and other exploits to the security testing of the system.

References
[1] Junlong Tu, "Tutorial online advertising," Peking University Press, 2005.
[2] Ruyi Li, "Internet advertising inquiry form," Nanchang University, 2007.
[3] Wang Yan and Wang Chun, "A Design of Reliable Consumer Based on Kafka," Computer Engineering & Software, vol. 37, no. 01, 2016, pp. 61-66.
[4] Cuntang Wei, "Automation Key Technology to detect SQL injection and XSS attacks," Beijing University of Posts, 2015.
[5] Hongxia Shu and Jihong Wang, "Design and implementation of distributed real-time message mechanism operating system," Computer Engineering and Design, vol. 29, May 2008.
[6] Feng Huang and Huarui Wu, "Analysis and Prevention of injection-based SQL J2EE applications," Computer Engineering and Design, vol. 33, no. 10, Oct. 2012.
[7] Botian Shi, Yujin Gao, and Kai Zhu, "Java-based object model messaging," Computer Engineering, vol. 36, no. 15, Aug. 2010.
[8] Zhang Li, Huimin Du and Liguo Zhang, "Based Distributed Storage regular expression matching algorithm design and implementation," Computer Science, vol. 40, no. 3, Mar. 2013.
[9] Lei Sun and Shujuan Jiang, "Construction based system vulnerabilities scene," Computer Engineering, vol. 33, no. 20, Oct. 2007.
[10] Han He, "Design and implementation of Internet-based mobile advertising platform," Beijing Jiaotong University, 2015.
[11] Rusu, Vlad, Marchand, Hervé and Tschaen, "From safety verification to safety testing," Lecture Notes in Computer Science, vol. 2978, 2004, pp. 160-176.
[12] Xie Xiong, Zhang Weishi and Cao Zhiying, "Safety verification of software component behavior adaptation," 2010 International Conference on E-Product E-Service and E-Entertainment.
Li-ming LIN, Guang-cao LIU*, Yan WANG, Wei LU
Star-shaped SPARQL Query Optimization on Column-
family Overlapping Storage
Abstract: Column families are widely used to store structure-free or semi-structured data. However, traditional RDF data storage methods divide data into independent triples, which gives the executed SPARQL queries low performance. Here, we propose a JoinFirst SPARQL-translation strategy for improving the performance of star-shaped SPARQL queries. Experiments demonstrate that this strategy is helpful when joins are necessary in star-shaped queries, and the speedup reaches an exponential scale.

Keywords: RDF; Star-shaped Query; SPARQL; Column family

1 Introduction

In the first decade of the 21st century, RDF models have been adopted by W3C as a
specification for conceptual description or modeling of knowledge. An RDF statement
is composed of <subject, predicate, object>, and an RDF statement set could be
represented as a labeled, directed multi-graph, where a statement corresponds to an
edge. So this model is capable of representing semi-structured or unstructured data,
which are widely seen in Web or Semantic Web. There are many RDF data sources,
such as Bio2RDF [1], which contains more than 10 billion RDF statements. And the
need to query RDF data sets has also been increasing. The most widely used RDF
query specification is SPARQL, which is recommended by W3C.
In Figure 1, an example RDF graph and the corresponding SPARQL query are depicted, where an instance is represented as a node and an edge represents a relationship between two nodes. Besides a relationship (foaf:knows) between two persons (foaf:person1 and foaf:person2), this example also describes other information about each of them, such as foaf:firstName, foaf:surname, etc. From this example, we can see that the unit of an RDF dataset is an edge, which is also called an RDF statement. So in traditional RDF storage solutions [2-5], data are stored as triples, each of which corresponds to one RDF statement. SPARQL optimization technologies are then also restricted by such storage solutions.

*Corresponding author: Guang-cao LIU, Xiamen Great Power Geo Info. Tech. Co. Ltd., State Grid
Information & Telecommunication Group, Xiamen, China, E-mail: liuguangcao@sgitg.sgcc.com.cn
Li-ming LIN, Xiamen Great Power Geo Info. Tech. Co. Ltd., State Grid Information &
Telecommunication Group, Xiamen, China
Yan WANG, School of Computer & Information Engineering, Xiamen University of Technology,
Xiamen, China
Wei LU, School of Information, Renmin University of China, Beijing, China

Figure 1. An Example of an RDF Dataset’s Graph Representation and a SPARQL Query

Researchers have proposed many solutions for SPARQL optimization, such as using matrix set operations on a bitmap to reduce the number of joins or intermediate results [6], an index technique for group-by queries [7], and data-partitioning strategies based on semantic hashing [8]. However, the key problem lies in the scattered storage of the RDF statements about one instance, which increases the burden of merging the information of the requested instance. So in [9] we proposed an RDF data storage strategy that extracts RDF statements with frequent predicate pairs and stores them in column families. Tables 1 to 3 illustrate column families extracted from RDF statements. Here, we introduce the extraction rule we use.
Rule 1 (Overlapping Rule): Overlapping of column families exists only in their structure, not in their content, which means there is no redundant information among column families.
Because the proposed strategy is based on mining frequent predicate pairs, overlapping can occur among column families. In traditional database theory overlapping is seldom seen, and a query optimizer does not work well under such circumstances. In this paper, we design an efficient SPARQL optimization technique for databases with column-family overlapping, in order to take full advantage of column storage.

Table 1. A column family with two predicates

ID foaf:firstName foaf:job

foaf:person5 Alex Dispatcher

foaf:person2 Konstantinos Customer service staff

Table 2. Another column family with one predicate

ID foaf:surName

foaf:person6 Valarakos

foaf:person4 Stergiou

Table 3. Another column family with two predicates

ID foaf:firstName foaf:surName

foaf:person1 Page Charles

foaf:person3 David Smith

2 Optimizations for Star-shaped Query

2.1 Description of a Star-shaped Query

The purpose of a star-shaped query is to get as much information as possible about one instance, which usually appears as the subject of several RDF statements. The example in Figure 2 shows a query that obtains the firstName and surName of foaf:person1, where identifiers beginning with the symbol "?" represent variables. Here, we require that only a query whose number of queried predicates is equal to or greater than 2 is called a star-shaped query.

Figure 2. An Example of a Star-shaped Query Represented as a Graph

2.2 Optimization

The optimization is divided into two stages. In the first stage, the star-shaped query is transformed into tuple filterings and projections, which avoids the costly operation of edge joins on column storage. In the second stage, any column family that does not contain all the predicates of the star-shaped query is filtered out. The scans on these column families are thereby avoided and performance is improved greatly.

πfoaf:firstName, foaf:surName (
    t1( πfoaf:firstName, ID(T1) ∪ πfoaf:firstName, ID(T3) )
    JOIN t1.ID=t2.ID
    t2( πfoaf:surName, ID(T2) ∪ πfoaf:surName, ID(T3) ) )

Figure 3. SPARQL Query on Column-Families Represented as Relation Algebra

Supposing there are several column families in the database, T1={ID, foaf:firstName}, T2={ID, foaf:surName} and T3={ID, foaf:firstName, foaf:surName}, such as those in Table 1 to Table 3. The typical UnionFirst query shown in Figure 3 aims to find all <foaf:firstName, foaf:surName> pairs. It first makes a union of the firstName projections over all tables and a union of the surName projections over all tables, and then joins the firstName and surName results together. Here σID=foaf:person1(T1) selects the rows in which ID=foaf:person1, π filters columns, leaving only the foaf:firstName and ID columns, ∪ is the union operation, and the rename operator labels an intermediate result as t1, t2, etc.
Because union and join obey the distribution law, we can use it to transform the original query into the form shown in Figure 4: the original UnionFirst query with one join becomes a JoinFirst query, i.e. a union of four joins.

πfoaf:firstName, foaf:surName ( πfoaf:firstName, ID(t1(T1)) JOIN t1.ID=t2.ID πfoaf:surName, ID(t2(T2)) ) ∪
πfoaf:firstName, foaf:surName ( πfoaf:firstName, ID(t3(T1)) JOIN t3.URI=t4.URI πfoaf:surName, ID(t4(T3)) ) ∪
πfoaf:firstName, foaf:surName ( πfoaf:firstName, ID(t5(T3)) JOIN t5.URI=t6.URI πfoaf:surName, ID(t6(T2)) ) ∪
πfoaf:firstName, foaf:surName ( πfoaf:firstName, ID(t7(T3)) JOIN t7.URI=t8.URI πfoaf:surName, ID(t8(T3)) )

Figure 4. Result of Using Join Distribution Law on Original Query



According to Rule 1, if a column family (such as T1) is a subset of another column family (such as T3), then the content of T3 will not appear in T1 again. So the join result of T1 and T3 is null, as is the join result of T2 and T3; additionally, the join result of T1 and T2 is null in this case. Three of the joins in Figure 4 can therefore be filtered out, and only the join of T3 with T3 remains, as shown in Figure 5. Furthermore, we find that the join condition requires the IDs to be equal, so the join can be simplified to a projection on T3, as shown in Figure 6. Rule 2 can then be concluded from this process, and it can be extended to scenarios with many joins.

πfoaf:firstName, foaf:surName (
πfoaf:firstName, ID(t7(T3))
JOIN t7.URI=t8.URI
πfoaf:surName, ID(t8(T3)) )

Figure 5. Reduction

πfoaf:firstName, foaf:surName (T3)

Figure 6. The Final Result

Rule 2 (Heuristic Rule of Star-shaped Query): When RDF data are stored with overlapping column families, a star-shaped SPARQL query can be equivalently transformed into a union of projections on the column families that contain all the queried predicates:
{?x Pred1 ?V1}. {?x Pred2 ?V2}. … {?x Predn ?Vn} → πID,Pred1,…Predn(T1) ∪ … ∪ πID,Pred1,…Predn(Tm)
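The following Java sketch (an illustration under our reading of Rule 2, not the actual query translator) shows the column-family selection step: only families whose schema contains every queried predicate contribute a projection to the union.

```java
import java.util.*;

public class JoinFirstPlanner {
    /** Returns the column families whose schema contains all queried predicates (Rule 2). */
    static List<Set<String>> candidateFamilies(List<Set<String>> families,
                                               Set<String> queriedPredicates) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> family : families) {
            if (family.containsAll(queriedPredicates)) {
                result.add(family);   // the projection on this family joins the union
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Schemas of T1, T2 and T3 from the running example in Section 2.
        List<Set<String>> families = Arrays.asList(
            new HashSet<>(Arrays.asList("ID", "foaf:firstName")),
            new HashSet<>(Arrays.asList("ID", "foaf:surName")),
            new HashSet<>(Arrays.asList("ID", "foaf:firstName", "foaf:surName")));
        Set<String> query = new HashSet<>(Arrays.asList("foaf:firstName", "foaf:surName"));
        // Only T3 contains both predicates, so the star query reduces to a single
        // projection on T3, as in Figure 6.
        System.out.println(candidateFamilies(families, query));
    }
}
```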

3 Experiment and analysis

3.1 Setup

The dataset for the experiments is a subset of Yago, which contains 1,000,000 triples, 741,165 subjects, 35 predicates and 494,512 objects. The effectiveness of the optimization is verified by increasing the number of star joins one by one. Three solutions are compared in our experiments. The first, called the triple solution, stores RDF triples in column families in which each row contains only one predicate. The second, called UnionFirst, translates a star-shaped query into a union of several joins, as shown in Figure 3. The last, called JoinFirst, is like that shown in Figure 4, and we can use Rule 2 to translate it into only unions of projections, such as the example shown in Figure 6. To treat the three solutions fairly, an index on ID is added in the triple solution.

3.2 Experiment

The star-shaped query template used in the experiment can be represented as SELECT ?a WHERE {?a Pred1 ?v1}. {?a Pred2 ?v2}. … {?a Predn ?vn}. The purpose is to query the instances that have values on all predicates Pred1, Pred2, …, Predn. At the beginning, we only find instances having a value for Pred1, so there is no join. Then instances having values for both Pred1 and Pred2 are found, so the number of joins is 1. The queried predicates are added one by one, and the number of joins increases accordingly.
The comparison of the three solutions is shown in Figure 7. The x-axis is the number of joins and the y-axis is the query execution time; it should be pointed out that the scale on the y-axis is exponential.
From Figure 7 we can see that, initially, the number of joins is 0 (i.e. there is no join) and the execution time of JoinFirst is the longest, because joins are unnecessary there. As queried predicates are added, the performance of JoinFirst improves gradually. In particular, when the number of joins is 3 or 4, JoinFirst is 10 or 100 times faster than the other two solutions.

Figure 7. Performance of Star-shaped Query on Yago

The second point to note is that the worst case for the UnionFirst and Triple solutions occurs when the number of joins is 3, because at that point the number of intermediate results is largest; when the 4th join is added, the intermediate results are reduced. However, the inflection point of these two solutions is not easy to estimate for different queries, and even at that point JoinFirst is much faster than the others.

4 Conclusion

In this paper, we propose an optimization strategy for star-shaped SPARQL queries executed on RDF data stored in column families. When translating SPARQL into execution plans, the strategy first runs the join operations and then unions the join results. Unnecessary joins can thus be omitted, and the execution plan can be reduced to scans on the column families that contain all the required predicates. In our experiments, the performance is improved on an exponential scale.

Acknowledgment: The work is supported by Science and Technology Project of State


Grid Corporation of China  under Grant  SGITG-KJ-JSKF[2015]0012, National Natural
Science Foundation of China under Grant 61502504, Fujian’s Education & Scientific
Research Program (Scientific) of Young & Middle-age Teachers under Grant JA15365.

References
[1] F. Belleau, M. A. Nolin, N. Tourigny, P. Rigault, J. Morissette, “Bio2RDF: towards a mashup to
build bioinformatics knowledge systems,” J. Biomed Inform., vol. 41, Oct. 2008, pp. 706-716,
doi:10.1016/j.jbi.2008.03.004 .
[2] D. J. Abadi, A. Marcus, S. R. Madden, K. Hollenbach, “SW-Store: a vertically partitioned DBMS
for Semantic Web data management,” VLDB Journal, vol. 18, Apr. 2009, pp. 385-406, doi:
doi:10.1007/s00778-008-0125-y.
[3] A. Harth, J. Umbrich, A. Hogan, S. Decker, “YARS2: A federated repository for querying graph
structured data from the web,” Proc. ISWC/ASWC 2007, Springer-verlag, 2007, pp. 211-224, doi:
10.1007/978-3-540-76298-0_16.
[4] T. Neumann, G. Weikum, “RDF-3X: a RISCstyle engine for RDF,” Proc. VLDB Endowment, 2008,
pp. 647-659, doi:10.14778/1453856.1453927.
[5] C. Weiss, P. Karras, A. Bernstein, “Hexastore: sextuple indexing for semantic web data
management,” Proc. VLDB Endowment, 2008, pp. 1008-1019, doi:10.14778/1453856.1453965.
[6] P. Yuan, P. Liu, B. Wu, H. Jin, W. Zhang, “TripleBit: a fast and compact system for large scale RDF
data,” Proc. VLDB Endowment, 2013, pp. 517-528, doi:10.14778/2536349.2536352.
[7] L. Zou, M. T. Özsu, L. Chen, X. Shen, R. Huang, “gStore: a graph-based SPARQL query engine,”
VLDB Journal, vol. 23, Aug. 2014, pp. 565-590, doi: 10.1007/s00778-013-0337-7.
[8] K. Lee, L. Liu, “Scaling queries over big RDF graphs with semantic hash partitioning,” Proc.
VLDB Endowment, 2013, pp. 1894-1905, doi: 10.14778/2556549.2556571.
[9] Y. Wang, X. Du, J. Lu, X. Wang, “FlexTable: using a dynamic relation model to store RDF
data,” 15th international conference on DASFAA, Springer-verlag, 2010, pp. 580-594, doi:
10.1007/978-3-642-12026-8_44.
Zhen-yu LV*, Xu ZHANG, Wei MING, Peng LI
Assembly Variation Analysis based on Deviation
Matrix
Abstract: Variation analysis is very important to assembly design. In this paper, the geometrical variation of each part is represented as a deviation matrix, the assembly variation is calculated with homogeneous transformations, and the distribution of the key characteristic is analyzed by Monte-Carlo simulation. Based on this method, an assembly variation analysis system is developed through C++ programming and embedded in the Pro/ENGINEER environment. An assembly of an engine crankshaft and connecting rod is analyzed, and the analysis result agrees with the exact variation distribution. With this new analysis system, the design cycle can be significantly reduced and both efficiency and accuracy are manifestly increased.

Keywords: variation analysis; deviation matrix; homogeneous transformation;


monte-carlo; Pro/E

1 Introduction

The quality of mechanical products is directly impacted by assembly accuracy, which


is the reason why assembly accuracy is so important both in the design stage and
manufacturing stage. The main task of assembly accuracy prediction is variation transfer analysis. Because the variation of each part in practical manufacturing has to meet tolerance requirements, the study of variation models is usually transformed into the study of tolerance models. Requicha [1] proposed a variational graph to represent geometric features, tolerances and attributes in solid modelers. To formalize the relation between the n-hulls that model the geometrical behavior of a mechanism and the specification hull that limits the deviations of parts, Dantan [2] proposed a tolerance synthesis method based on the quantifier notion and virtual boundaries. Hu [3] studied the correspondence rules between tolerance types and geometric constraints based on a variational geometric constraint network. Mao [4] analyzed the shape, direction and location of the tolerance zone, classifying tolerance types and explaining their engineering semantics.
In summary, currently existing research is all based on a geometric model. The
precondition of variation calculation is feature recognition of the assembly solid

*Corresponding author: Zhen-yu LV, Department of mechanical engineering, Beijing Institute of


Technology, Beijing, China, E-mail: zhenyu1412@foxmail.com
Xu ZHANG, Wei MING, Peng LI, Department of mechanical engineering, Beijing Institute of Technolo-
gy, Beijing, China

model, so it is difficult to apply them in a CAD system. This paper builds a datum flow chain of an assembly model and allocates the assembly variation to each feature coordinate system. Based on the small displacement torsor theory, the relative variation between two feature coordinate systems is expressed as a deviation matrix. Assuming that the manufacturing variations of all parts meet the tolerance requirements, a method for converting tolerances into deviation matrices is proposed, which means that the system can extract deviation matrices from tolerance markings instead of performing feature recognition. Finally, the algorithm is implemented in Pro/ENGINEER, and the results show that it can effectively extract deviation matrices and calculate the assembly variation.

2 Methodology

Our approach is based on the theory of the Datum Flow Chain, outlined briefly in Section 2.1 and discussed in [5]. Section 2.2 presents the mathematical derivation of the deviation matrix and gives a region formula so that it can be programmed on a computer. A method for calculating the assembly variation is proposed in Section 2.3. To stay consistent with actual situations, Section 2.4 introduces a Monte-Carlo simulation into our algorithm.

2.1 Datum Flow Chain

For complex product assembly processes, a Datum Flow Chain (DFC) model was established in [5] to show the propagation of assembly variation clearly and quickly. A DFC is a directed acyclic graph containing nodes, solid lines, dashed lines and double lines. It synthesizes the geometric features, constraints, tolerances and assembly sequence information of a model.
In DFC modeling, each part in an assembly is decomposed into Assembly Features (AF). Each AF has its own coordinate system to express its location and constraint information. Between two coordinate systems there is either a Key Dimension (KD) within a part or an Assembly Link (AL) between two parts. Besides these, a Key Characteristic (KC) [6] can severely impact the quality of a product; the KC of an assembly is shown as a measurement in the DFC. Figure 1 shows the DFC model of a simple assembly. There is a KC that the assembly should achieve between Part A and Part C. It depends on coordinate systems S1 and S2. The variation of S1 is passed to S2 through the chain A-B-C. Thus, the variation of the KC is calculated by analyzing the chain A-B-C.

Figure 1. Datum Flow Chain

There are two prerequisites before modeling an assembly DFC:
–– Each part or component is completely restrained in a certain position and posture.
–– Each assembly step completely restrains the current part or component, independently of the later steps.

2.2 Deviation Matrix

Because of variations, each feature is not in its theoretical location, so the feature's actual coordinate system differs from the theoretical coordinate system. dx, dy and dz represent the translational deviations along the theoretical coordinate axes X, Y and Z respectively, as shown in Figure 2a; dθx, dθy and dθz represent the angular deviations around the theoretical coordinate axes X, Y and Z respectively, as shown in Figure 2b.


Figure 2. Deviation of coordinate system



The deviation matrix (Δ) is calculated from the following formula through mathematical deduction:

\[
\Delta + I =
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & d_x \\
r_{21} & r_{22} & r_{23} & d_y \\
r_{31} & r_{32} & r_{33} & d_z \\
0 & 0 & 0 & 1
\end{bmatrix}
\tag{1}
\]

where
\[
\begin{aligned}
r_{11} &= \cos d\theta_z \cos d\theta_y, \\
r_{12} &= -\sin d\theta_z \cos d\theta_x + \cos d\theta_z \sin d\theta_y \sin d\theta_x, \\
r_{13} &= \sin d\theta_z \sin d\theta_x + \cos d\theta_z \sin d\theta_y \cos d\theta_x, \\
r_{21} &= \sin d\theta_z \cos d\theta_y, \\
r_{22} &= \cos d\theta_z \cos d\theta_x + \sin d\theta_z \sin d\theta_y \sin d\theta_x, \\
r_{23} &= -\cos d\theta_z \sin d\theta_x + \sin d\theta_z \sin d\theta_y \cos d\theta_x, \\
r_{31} &= -\sin d\theta_y, \\
r_{32} &= \cos d\theta_y \sin d\theta_x, \\
r_{33} &= \cos d\theta_y \cos d\theta_x.
\end{aligned}
\]

According to the Small Displacement Torsor (SDT) theory [7], the trigonometric functions above can be simplified:
\[
\lim_{d\theta_i \to 0} \sin d\theta_i = d\theta_i, \qquad
\lim_{d\theta_i \to 0} \cos d\theta_i = 1, \qquad (i = x, y, z)
\]

The final expression of the deviation matrix is:
\[
\Delta =
\begin{bmatrix}
0 & -d\theta_z & d\theta_y & d_x \\
d\theta_z & 0 & -d\theta_x & d_y \\
-d\theta_y & d\theta_x & 0 & d_z \\
0 & 0 & 0 & 0
\end{bmatrix}
\tag{2}
\]

Table 1 shows the method to transform different kinds of tolerances into deviation matrices. Because a tolerance is a region with upper and lower boundaries in the actual manufacturing process, the associated deviations are also represented as regions:
\[
d\theta_i \in [\min(d\theta_i), \max(d\theta_i)], \qquad
d_i \in [\min(d_i), \max(d_i)], \qquad (i = x, y, z)
\]

Table 1. Entries Into 4×4 Deviation Matrices Associated With Various GD&T Control Frames

Location – Position at MMC (⌀Tp Ⓜ): same as the position/concentricity row below, but replace T by TCp, where TCp² = SS² + SP² + (TP + Smax + E(S))².

Location – Position (⌀Tp), Concentricity (⌀Tc); Runout – Circular Runout (TR | A) and Total Runout on a surface parallel to the datum axis (TR | A):
dx = dy = T/2, dθx = dθy = T/L, dz = 0, dθz = 0, with T = TP = TC = TR and L the length of the cylindrical tolerance zone.

Runout – Total Runout on a surface perpendicular to the datum axis (TR | A):
dx = dy = 0, dθx = dθy = TR/D, dz = TS, dθz = 0.

Planar Size – Distance (TS):
dx = dy = 0, dθx = 2TS/Ly, dθy = 2TS/Lx, dz = TS, dθz = 0, where Lx and Ly are the length and width of the tolerance zone.

Orientation – Parallelism (Tpa | A), Perpendicularity (Tpe | A), Angularity (Ta | A):
dx = dy = 0, dθx = 2TO/Ly, dθy = 2TO/Lx, dz = TS, dθz = 0, with TO = Tpa = Tpe = Ta and Lx, Ly the length and width of the tolerance zone.

2.3 Assembly Variation Calculation with Homogeneous Transformation

A coordinate system is attached to a rigid object or a feature under consideration.


Homogeneous transformation matrix (HTM) [8] is a 4×4 matrix that describes either the
pose of a coordinate system with respect to the reference one, or the displacement of a
coordinate system into a new pose. Figure 3 shows the homogeneous transformation
between two parts.

Figure 3. Homogeneous transformation

In the homogeneous transformation matrix, the upper-left 3×3 matrix represents the orientation of the object, while the right-hand 3×1 column describes its position (e.g. the position of its center of mass). The last row is always 0, 0, 0, 1. Thus, the general format of the homogeneous transformation matrix is:
\[
M =
\begin{bmatrix}
R & P \\
0^{T} & 1
\end{bmatrix}
\tag{3}
\]
where R is the relative rotation and P the relative position.


After determining the homogeneous transformation matrices of all assembly
links, all HTMs are multiplied successively from the head feature to the tail feature,
obtaining the nominal tail feature position with respect to the nominal head feature
position.
The actual tail feature position can be calculated using the following formula:
\[
D_{TF,HF} = D_{tf,hf} + \eta_{TF,HF} = \Big(\prod_{i=1}^{n} M(i)\Big) D_{HF} + \eta_{TF,HF}
\tag{4}
\]
where D_{TF,HF} is the actual tail feature position with respect to the actual head feature, D_{tf,hf} is the nominal one, η_{TF,HF} is the assembly (KC) variation between the tail feature and the head feature, M(i) is a homogeneous transformation matrix, D_{HF} is the actual head feature position with respect to the world reference coordinate system, and n is the total number of features in the DFC.
If the assembly variation is allocated to each feature, D_{TF,HF} can also be represented as:
\[
D_{TF,HF} = \Big(\prod_{i=1}^{n} \big(M(i)\,(I + \Delta(i))\big)\Big) D_{HF}
\tag{5}
\]
where Δ(i) is the deviation matrix of the i-th feature.
According to (4) and (5), the assembly variation is calculated as:
\[
\eta_{TF,HF} = \Big(\prod_{i=1}^{n} \big(M(i)\,(I + \Delta(i))\big)\Big) D_{HF} - \Big(\prod_{i=1}^{n} M(i)\Big) D_{HF}
\tag{6}
\]

2.4 Monte-Carlo Simulation

Monte-Carlo methods are a broad class of computational algorithms that rely on


repeated random sampling to obtain numerical results. They are used to model the
probability of different outcomes in a process that cannot easily be predicted due to
the intervention of random variables [9]. For calculating assembly variation in this paper, the basic steps of the Monte-Carlo simulation are as follows:
Step 1: Define the variation domain of each part or component.
Step 2: Randomly generate the deviation matrix of each feature from a probability distribution over the domain.
Step 3: Calculate the assembly variation with homogeneous transformation matrices.
Step 4: Repeat steps 2 and 3 until enough samples are obtained, then calculate the distribution of the assembly variation.

Because the manufacturing deviation of each part must meet the tolerance requirements and the manufacturing process is known, the domain and probability distribution of each deviation are determined. The simulation flow chart is shown in Figure 4.

Figure 4. Simulation flow chart (for i = 1, …, N: generate a deviation matrix Δ(j) for each feature, compute D_TF = (∏ M(j)(I + Δ(j))) D_HF and D_tf = (∏ M(j)) D_HF, record η_i = D_TF − D_tf; finally output the distribution of η)
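A minimal sketch of this simulation loop is shown below in Java (the actual module is built with Pro/Toolkit C functions); the 4×4 matrix routine, the nominal transformations of a two-link chain and the normally distributed deviation magnitudes are illustrative assumptions.

```java
import java.util.Random;

/** Monte-Carlo estimate of a key-characteristic variation following Eqs. (4)-(6). */
public class VariationMonteCarlo {
    static final Random RNG = new Random();

    /** 4x4 matrix product. */
    static double[][] mul(double[][] a, double[][] b) {
        double[][] c = new double[4][4];
        for (int i = 0; i < 4; i++)
            for (int k = 0; k < 4; k++)
                for (int j = 0; j < 4; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    /** I + Delta from Eq. (2), with normally distributed deviations (illustrative sigmas). */
    static double[][] identityPlusDelta(double sTrans, double sRot) {
        double dx = RNG.nextGaussian() * sTrans, dy = RNG.nextGaussian() * sTrans,
               dz = RNG.nextGaussian() * sTrans;
        double ax = RNG.nextGaussian() * sRot, ay = RNG.nextGaussian() * sRot,
               az = RNG.nextGaussian() * sRot;
        return new double[][] {
            {  1, -az,  ay, dx },
            {  az,  1, -ax, dy },
            { -ay,  ax,  1, dz },
            {  0,   0,  0,  1 } };
    }

    public static void main(String[] args) {
        // Nominal HTMs of an assumed two-link chain (pure translations along z).
        double[][][] M = {
            { { 1, 0, 0, 0 }, { 0, 1, 0, 0 }, { 0, 0, 1, 50 }, { 0, 0, 0, 1 } },
            { { 1, 0, 0, 0 }, { 0, 1, 0, 0 }, { 0, 0, 1, 30 }, { 0, 0, 0, 1 } } };
        double[] dHF = { 0, 0, 0, 1 };   // head feature position in homogeneous coordinates
        int samples = 10000;
        double sum = 0, sumSq = 0;

        for (int s = 0; s < samples; s++) {
            double[][] nominal = { { 1, 0, 0, 0 }, { 0, 1, 0, 0 }, { 0, 0, 1, 0 }, { 0, 0, 0, 1 } };
            double[][] actual  = { { 1, 0, 0, 0 }, { 0, 1, 0, 0 }, { 0, 0, 1, 0 }, { 0, 0, 0, 1 } };
            for (double[][] m : M) {
                nominal = mul(nominal, m);                                      // prod M(i)
                actual  = mul(actual, mul(m, identityPlusDelta(0.05, 0.001)));  // prod M(i)(I+Delta(i))
            }
            // z-component of eta = (prod M(i)(I+Delta(i)) - prod M(i)) * D_HF, Eq. (6).
            double eta = 0;
            for (int j = 0; j < 4; j++) eta += (actual[2][j] - nominal[2][j]) * dHF[j];
            sum += eta;
            sumSq += eta * eta;
        }
        double mean = sum / samples;
        double std  = Math.sqrt(sumSq / samples - mean * mean);
        System.out.printf("KC variation: mean = %.4f, std = %.4f%n", mean, std);
    }
}
```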



3 Implementation and Case Study

Based on the algorithm proposed above, a software module was developed using Pro/Toolkit functions on the Microsoft Visual Studio 2008 platform. Pro/Toolkit is a large library of C functions that enables external applications to access the Pro/ENGINEER database and retrieve information and data from the assembly geometric model [10]. The module is part of an Assembly Tool, an external menu-driven application for Pro/E, as shown in Figure 5; the interface of the variation analysis is shown in Figure 6.

Figure 5. Assembly tool menu

Figure 6. Interface of variation analysis



An engine crankshaft and connecting rod mechanism assembly, shown in Figure 7, is used to illustrate the application and effectiveness of the method. This assembly is composed of the piston, piston pin, connecting rod, bearing bush, crankshaft and cylinder. On the piston stroke, the positions of top dead center (TDC) and bottom dead center (BDC) directly affect the stability of the engine. Thus the distance between the upper surface of the piston and the spindle axis of the crankshaft is the key characteristic of this assembly; it is marked as 'DimKC' in Figure 7. This study only analyzes the DimKC variation in the situation where the piston is at the top dead center.

Figure 7. Assembly model

After choosing parts with tolerance in the DFC and setting the distribution of each
deviation, the assembly variation is calculated in the analysis module with the Monte-
Carlo simulation. The analysis result of DimKC distribution is shown in Figure 8.

Figure 8. Distribution of DimKC



According to the distribution obtained from 10,000 samples, the value of DimKC fits a normal distribution with a mean of 197.76 mm and a standard deviation of 0.174 mm. The nominal dimension of DimKC is 197.75 mm and the allowable deviation is ±0.5 mm. Thus the process capability indices (Cp/Cpk) [11] can be calculated.
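For reference, using the standard definitions of the indices (not stated explicitly in the paper) with μ = 197.76 mm, σ = 0.174 mm, USL = 198.25 mm and LSL = 197.25 mm, they evaluate approximately as:
\[
C_p = \frac{USL - LSL}{6\sigma} = \frac{1.0}{6 \times 0.174} \approx 0.96, \qquad
C_{pk} = \frac{\min(USL - \mu,\; \mu - LSL)}{3\sigma} = \frac{0.49}{3 \times 0.174} \approx 0.94
\]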

4 Conclusion

Variation analysis plays a crucial role in assembly design. This paper proposes a variation analysis method based on the deviation matrix. By using matrices to represent geometric tolerances, the computerization of assembly variation analysis is achieved. A variation analysis module embedded in Pro/E can significantly improve efficiency at the design stage. The current algorithm assumes that the tolerance requirement is the only source of variation; in actual assembly, variation is influenced by many other factors, such as assembly fixtures and part deformation. These should be addressed by exploring a consistent representation of deviation in future research.

References
[1] Requicha A A G, Chan S C. Representation of geometric features, tolerances, and attributes in
solid modelers based on constructive geometry[J]. Robotics and Automation, IEEE Journal of,
1986, 2(3): 156-166.
[2] Dantan J Y, Mathieu L, Ballu A, et al. Tolerance synthesis: quantifier notion and virtual
boundary[J]. Computer-Aided Design, 2005, 37(2): 231-240.
[3] Hu Jie. Study on theories and methods of geometric tolerance design based on variational
geometric constraints network[D]. Hangzhou, Zhejiang: Zhejiang University, 2001.
[4] Mao jian. Study on the modeling of tolerance based on mathematical definition and form errors
evaluation[D]. Hangzhou, Zhejiang: Zhejiang University, 2007.
[5] Mantripragada R, Whitney D E. The datum flow chain: a systematic approach to assembly
design and modeling[J]. Research in Engineering Design, 1998, 10(3): 150-165.
[6] Whitney D E. The role of key characteristics in the design of mechanical assemblies[J]. Assembly
Automation, 2006, 26(4): 315-322.
[7] Asante J N. A small displacement torsor model for tolerance analysis in a workpiece-fixture
assembly[J]. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of
Engineering Manufacture, 2009, 223(8): 1005-1020.
[8] Cai Zixing. Robotics[M]. Beijing: Tsinghua University Press, 2001, 29-36.
[9] Rubinstein R Y, Kroese D P. Simulation and the Monte Carlo method[M]. John Wiley & Sons,
2011.
[10] Srikumaran S, Sivaloganathan S. Proving manufacturability at the design stage using
commercial modeling Software: Through feature mapping and feature Accessibility [J].
Computer-Aided Design and Applications, 2005, 2(1-4): 507-516.
[11] Kotz S, Johnson N L. Delicate relations among the basic process capability indices Cp, Cpk and
Cpm, and their modifications[J]. Communications in Statistics-Simulation and Computation,
1999, 28(3): 849-866.
Yan XU*, Rui CHANG, Ya-fei WANG
Design and Realization of Undergraduate Teaching
Workload Calculation System Based on LabVIEW
Abstract: Undergraduate teaching workload calculation is a concern for both teachers and teaching managers. However, in NCWU (North China University of Water Resources and Electric Power), teachers calculate their teaching workload manually, which is time-consuming, labor-intensive and error-prone. In this paper, we have designed a calculation system based on LabVIEW (Laboratory Virtual Instrument Engineering Workbench). LabVIEW uses graphical programming, which suits this calculation system better than C and other high-level text-based languages. With this system, we can calculate the teaching workload of one or many teachers automatically, quickly and accurately, and the results can be exported to a spreadsheet at a specified path. In this way, teaching management efficiency is greatly improved. Furthermore, by generating an application file and a setup file, the calculation system can run on an ordinary computer without the LabVIEW development system, which is economical and reduces memory consumption.

Keywords: Teaching workload; LabVIEW; Front panel; Block diagram

1 Introduction

LabVIEW, short for Laboratory Virtual Instrument Engineering Workbench, is a powerful and flexible instrumentation and analysis software system. It is easy to use, has a friendly interface and is characterized by graphical programming [1-3]. Efficient application programs, such as data management and scientific calculation, can be developed in LabVIEW [4-7].
The undergraduate teaching workload in a university provides the basis for work plans and for determining teaching staff numbers and teacher allowances. The academic affairs office and each department's teaching secretaries are responsible for teaching workload management: the academic affairs office issues the overall teaching plan, while each department is in charge of operating the teaching work plan and calculating teaching workload. Until now, teaching workload has been calculated manually, in five steps:

*Corresponding author: Yan XU, Electric Power College, North China University of Water Resources
and Electric Power, NCWU, Zhengzhou, China, E-mail: dlxuyan@ncwu.edu.cn
Rui CHANG, Ya-fei WANG, Electric Power College, North China University of Water Resources and
Electric Power, NCWU, Zhengzhou, China

–– Each teacher fills in workload form term by term and hands it to teaching secretary
office;
–– The teaching secretary checks whether the workload calculation conforms with
the regulations;
–– The teaching secretary informs the teachers who have made mistakes to modify
their forms;
–– The teaching secretary puts the forms in order and inputs data in spreadsheet
one by one;
–– The teaching secretary prints a workload spreadsheet and finally submits it to the
academic affairs office.

In light of this, calculating teaching workload manually is time-consuming, labor-intensive and error-prone.
Automatic calculation of teaching workload by a program can solve these problems. A typical choice would be the C language, which is powerful for data processing. However, LabVIEW is more suitable for a teaching workload calculation system because of its graphical and simple programming model. With these characteristics, we can design the system with simpler programming and debugging than C and other high-level languages, and the operating interface is quite friendly.
In this paper, we build mathematical models according to the workload calculation method introduced in NCWU file No. [2000]232, "NCWU Undergraduate Teaching Workload Calculation Method". Programmed in LabVIEW, the calculation system can calculate the undergraduate teaching workload of one or many teachers. It is accurate, time-saving, labor-saving and simple to operate, all of which greatly improve teaching management efficiency.
The virtual instrument program’s operation is based on LabVIEW software
platform, which would occupy amounts of computer memory. To use the calculation
system on a computer without LabVIEW developing system, we need to generate an
application file and setup file [8,9].

2 Calculation Methods and Mathematical Models

Undergraduate teaching consists of course teaching, graduation project and thesis,


curriculum design, internship and experiment, etc.

2.1 Course Teaching

Course teaching workload is comprised of course explanation workload (W11) and


homework correction workload (W12).

The essential course information comprises the course name, theoretical class
hour and student number, etc. Without considering course repetition, course
explanation workload is related to theoretical class hour, symbolized by H, and
student number, symbolized by N1.
Suppose the integer part and the decimal part of N1/30 are denoted a and b respectively, and K represents the number of classes; the decimal part of K can only be 0.5. b1 is a function of b and K1 is a function of K:

b1 = 0, if b ≤ 0.3;  b1 = 0.5, if 0.3 < b ≤ 0.7;  b1 = 1, if b > 0.7 (1)

K = a + b1 (2)

K1 = 1 + 0.1 × K (3)

Course explanation workload W11 is equal to the product of H and K1, that is:

W11 = H × K1 (4)

Homework correction workload, denoted W12, is the product of the theoretical class hours H, the actual number of corrected classes K', and the correction coefficient K2.
When teaching a course, the teacher is obliged to correct the homework of one class. So if a special teacher is arranged to correct a course's homework, the number of corrected classes is the total class number K minus one. Conversely, if the homework is corrected by the same teacher who gives the course, that teacher gains only half of the specially arranged teacher's workload. So the value of K' is determined by whether a special homework-correction teacher is arranged.
The correction coefficient K2 depends on the course category, or homework category, as described in Table 1:

Table 1. Homework correction efficient K2

Category K2 Remarks

A 0.23 Mathematics, physics, cartography, rational mechanics, mechanics of


materials, structural mechanics, foreign language
B 0.13 Professional courses, Basic courses of professional technology

C 0.05 Political courses, optical courses

In summary, the homework correction workload W12 can be expressed as:

W12 = H × K' × K2 (5)

The homework correction subprogram thus has four inputs: the class number K, the theoretical class hours H, the homework category, and whether a special homework-correction teacher is arranged. In LabVIEW, K2, which expresses the homework category, can be implemented as a drop-down list, as shown in Figure 1, and so can 'whether the homework correction teacher is special'.
Course teaching workload, denoted W1, is the sum of the course explanation workload and the homework correction workload, that is:

W1 = W11 + W12 (6)

Figure 1. ‘Homework category’ drop-down list properties.
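Since a LabVIEW block diagram cannot be reproduced as text, the following Java sketch gives a text-based equivalent of the course-teaching workload logic of formulas (1)-(6); it is an illustration, not the actual VI, and the input values are hypothetical.

```java
public class CourseWorkload {
    /** Course-teaching workload W1 = W11 + W12, following formulas (1)-(6). */
    static double courseWorkload(double hours, int students, double k2, boolean specialCorrector) {
        int a = students / 30;                    // integer part of N1/30
        double b = students / 30.0 - a;           // decimal part of N1/30
        double b1 = (b <= 0.3) ? 0 : (b <= 0.7 ? 0.5 : 1);   // formula (1)
        double k = a + b1;                        // class number, formula (2)
        double k1 = 1 + 0.1 * k;                  // formula (3)
        double w11 = hours * k1;                  // course explanation workload, formula (4)
        // Homework correction: a special corrector corrects K-1 classes;
        // otherwise the lecturer gains half of that workload, formula (5).
        double kPrime = specialCorrector ? (k - 1) : (k - 1) / 2;
        double w12 = hours * kPrime * k2;
        return w11 + w12;                         // formula (6)
    }

    public static void main(String[] args) {
        // Illustrative inputs (not the example of Section 3.2): 64 class hours,
        // 95 students, professional course (K2 = 0.13), special correction teacher.
        System.out.println(courseWorkload(64, 95, 0.13, true));
    }
}
```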

2.2 Graduation Project and Thesis

Graduation project and thesis workload, denoted W2, is related to its credits C2, the student number N2 and a correction coefficient K3. It equals the product of C2, 16 class hours per credit, N2/(30 students per class) and the correction coefficient K3, whose value is given in Table 2.

Table 2. Graduation project and thesis correction coefficient K3

Graduation project and thesis category K3

Literature, finance, law 1.0


Science and engineering 1.1

In principle, one teacher can instruct no more than 14 students’ graduation design
and thesis, or the extra part’s workload will be halved.
If a teacher has not instructed any student, but has taken part in the oral defense,
he (she) can obtain two workloads per day.

W2 = C2 × 16 × K3 × N2/30, if N2 ≤ 14
W2 = C2 × 16 × K3 × 14/30 + (1/2) × C2 × 16 × K3 × (N2 − 14)/30, if N2 > 14 (7)

2.3 Curriculum Design and Internship

Curriculum design and internship workload is denoted W3; to distinguish the two, we use W31 and W32 respectively. This workload is gained by instructing curriculum design or internships. It is related to the credits C3, the class number N3 and a correction coefficient K4, as shown in Table 3:

Table 3. Curriculum design and internship correction coefficient K4

Category    Curriculum design    Internship (in school)    Internship (local)    Internship (nonlocal)
K4          0.5                  0.55                      0.6                   0.7

The value C2of 16


internship
 K  N 30correction
, N  14 coefficient varies with internship place.
3 2 2
Specially,
W2 
practice inC2hydropower
 16  K  14 30
3 station,
2 4 3

 C  16K Kis 0.75. 
if the practice is a surveying practice, geological practice or production
 N In14principle,
2
30 , N  14one teacher can instruct only one
2
standard class’s curriculum design or internship, otherwise, the extra workload will be
halved.
The expression of W3 is
C3  16  K 4  N3 30, N3  30 (8)
W3 
 
C3  16  K 4  C3  16  K 4  N3  30 30, N3  30
2.4 Experiment

Experiment workload, denoted W4, is gained by guiding experiments. It is related to the class number per session and the number of repetitions, denoted Np and R respectively. W4 equals the product of the experiment class hours H4, R, Np, the batch experiment correction coefficient K5 and the experimental-method correction coefficient K6, which are given in Table 4 and Table 5.

Table 4. Correction coefficient K5

Np 1/2 1 2 ≥3

K5 1.0 0.8 0.6 0.5

Table 5. Correction coefficient K6

Experiment method Conventional instruments Computer simulation

K6 1 0.7

The calculation formula of W4 is:

W4 = H4 × NP × R × K5 × K6 (9)

3 Results

3.1 Front Panel Design of the Main Program

The main program’s front panel is the main display panel, so it should be concise,
friendly and could provide enough important information for users. In this paper,
we design a system for undergraduate teaching workload calculation. The main
program’s front panel is shown in Figure 2:

Figure 2. Front panel of the main program

A teacher is allowed to take at most six courses and to instruct at most three experiments, three curriculum designs and three internships in a term. We therefore use 1×6, 1×3, 1×3 and 1×3 arrays as the inputs for courses, experiments, curriculum designs and internships, and the corresponding results are shown in 1×6, 1×3, 1×3 and 1×3 arrays respectively. Graduation project and thesis does not involve arrays. Both the input arrays and the display arrays should be shown on the front panel for users.
To keep the interface concise and clear, we divide it into several modules: the
teacher’s basic information, data export path choosing, course teaching, graduation
project and thesis, curriculum design, internship, experiment and total workload
display, etc. In this way, not only the total workload, but also detailed workload
information, such as workload gained by teaching a course, instructing graduation
project and thesis, curriculum design, internship and experiment can be inquired in
great detail.
To check the accuracy of each subprogram, we conduct experiments and show the results in the figures of each subprogram.

3.2 Course Teaching Workload

The front panel and calculation results of the course teaching subprogram are shown in Figure 3.

Figure 3. Front panel and results of ‘course teaching workload.vi’.

In this example, we suppose there is a professional course <LabVIEW>, student


number is 86, homework correction teacher is special, theoretical class hour is 80.
According to (1), (2), (3), for N1=86, so a=2,b=0.867, b1=1, K=a+b1=3, K1=1+0.1×K=1.3.
According to (4), we can calculate the course explanation workload as:
W11=H×K1=52.
Furthermore, according to Table 1, for professional course, K2=0.13; for special
homework correction teacher, K’=K-1=2. According to (5), homework correction
workload is W12=H×K’×K2=10.4.

3.3 Graduation project and thesis workload

The front panel and calculation results of the graduation project and thesis subprogram are shown in Figure 4.

Figure 4. Front panel and results of ‘graduation project and thesis workload.vi’.

In this example, we suppose graduation project and thesis credit C2=12, student
number N2=9 and the graduation project and thesis’s category is science and
engineering, so correction coefficient K3 = 1.1. According to (7), we can calculate
graduation project and thesis workload W2 = C2×16×K3×N2/30=63.36.

3.4 Curriculum design workload

The correction coefficient is 0.5 in curriculum design, whose front panel and
calculation results is shown in Figure 5. As shown, credit C3=1, class number K=2.
According to Table 3, curriculum design’s correction coefficient K4=0.5.

Figure 5. Front panel and results of ‘curriculum design workload.vi’.

Thus, according to (8), curriculum design workload’s calculation formula and result
is W31= C3×16×K4+ C3×16× K4×( K -1)/2=12.

3.5 Internship workload

The value of internship correction coefficient varies with internship place, the choice
of which could be realized by means of a drop-down list in LabVIEW. The front panel
and results of internship workload calculation subprogram is shown in Figure 6.

Figure 6. Front panel and results of ‘internship workload calculation.vi’.

3.6 Experiment workload

As above, the choice of class number per session and experiment type can also be realized by drop-down lists. Figure 7 shows the front panel and calculation results of the experiment workload subprogram.

Figure 7. Front panel and results of ‘experiment workload.vi’.

In this example, we suppose internship credit C3=1, class number N3=2, internship
place is a hydropower station, so K4=0.75.
According to (8), we can know: W32=1×16×0.75+1×16×(2-1)×0.75×1/2=18.
We suppose a teacher conducts three course experiments, so the experiment hours can be expressed as a 1×3 array, as can the class number per session Np, R, K5, K6 and the calculation results. In the block diagram, the experiment workload calculation subprogram is executed three times in a 'for loop', and the total experiment workload is obtained by summing the experiment workload array, as in the sketch below.
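A text-based equivalent of this 'for loop' (an illustrative Java sketch, not the actual block diagram; the input values are hypothetical) is:

```java
public class ExperimentWorkload {
    /** W4 = H4 * Np * R * K5 * K6, formula (9), applied element-wise to 1x3 arrays. */
    static double totalExperimentWorkload(double[] h4, double[] np, double[] r,
                                          double[] k5, double[] k6) {
        double total = 0;
        for (int i = 0; i < h4.length; i++) {     // mirrors the LabVIEW 'for loop'
            total += h4[i] * np[i] * r[i] * k5[i] * k6[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative values for three experiments using conventional instruments (K6 = 1).
        System.out.println(totalExperimentWorkload(
                new double[] {8, 6, 4}, new double[] {1, 1, 2},
                new double[] {2, 3, 1}, new double[] {0.8, 0.8, 0.6},
                new double[] {1, 1, 1}));
    }
}
```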

4 Application program and setup program generation

Usually, we want to run the calculation system on an ordinary computer without the LabVIEW development system. For this, we need to generate an application program and a setup program.
First, a project including all the subprograms is built.
Second, we generate the application program from the menu: choose "Tools", select "generate EXE file according to VI", and then follow the prompts.
Up to this point, the generated file can only run on computers with the LabVIEW software. To run a LabVIEW program on a computer without the LabVIEW development system, we can generate an independent application program. Also, to facilitate the user, we should generate a setup program [8].

5 Conclusions

In this paper, we have designed and realized an undergraduate teaching workload calculation system based on the LabVIEW software platform, following the North China University of Water Resources and Electric Power (NCWU) teaching file No. [2000]232, 'NCWU Workload Calculating Method'. The system can run on a general computer without the LabVIEW development system. With this system, we can calculate the workload of one teacher or more accurately through simple operations, and the calculation results can be automatically saved, item by item, to a spreadsheet at a specified path. Teaching management efficiency is clearly improved with this system.

References
[1] National Instruments, "LabVIEW Web Services," http://software.oit.pdx.edu/math/labview/
www/services.htm, 2002.
[2] National Instruments. Measurement and Automation Catalog. 2002.
[3] National Instruments. “Taking Your Measurements to the Web with LabVIEW,” http://www.
ni.com/LabVIEW.
[4] National Instruments Corp. LabVIEW User Manual. Austin, Texas USA, 1998:27-444. 
[5] Cor J. Kalkman, MD, PhD. LabVIEW: A software system for data acquisition, data analysis, and
instrument control. Journal of Clinical Monitoring.1995(1) :51-58.
[6] Orabi, I. I. (2002). Application of LabVIEW for undergraduate lab experiments on materials
testing. In: Proceedings of American society of engineering education annual conference &
exposition, pp. 52-55.
[7] Peter A. Blume. The LabVIEW Style Book[M]. London: Prentice Hall, 2003.
[8] XU Yan.  Research on Pulse Wave Signal Processing Methods and Clinical Experiments[D].
CHONGQING UNIVERSITY 2007.
[9] XU Yan, HE Wei, LUO Kailiang, CHEN Zhangrong, YAO Jiange, YU Chuanxiang. The Research and
Practice of double-cuff method used to calculate pulse wave velocity [J]. Journal of Medical
University Of Chongqing. 2007(09):940-945s
Yan SU, Yi-Shu ZHONG
Smooth Test for Multivariate Normality of
Innovations in the Vector Autoregressive Model
Abstract: This paper presents a new approach to testing the multivariate normal
distribution of innovations in the vector autoregressive (VAR) model. The test is based
on the smooth test for uniformity on the surface of a unit sphere. The asymptotic
null distribution of the transformed residuals from the VAR model is obtained. An
algorithm is given to estimate the critical values of the test statistic by Monte Carlo
simulation. Moreover, the components of the smooth test can be used to detect
specific departures from the null hypothesis.

Keywords: VAR model; innovation; Residuals; Multivariate normal distribution;


Smooth test

1 Introduction

The multivariate time series yt follows a VAR model of order p, VAR(p), if

yt = c + ϕ1 yt–1 + ϕ2 yt–2 + ...+ ϕp yt–p + εt, (1)

E(εt) = 0, Cov(εt) = Σ, (2)

where c is a d × 1 constant vector and ϕi are d × d matrices for i = 1,..., p, ϕp≠0, and
εt is a sequence of independent and identically distributed random vectors, and the
covariance matrix ∑ is a positive definite matrix. The disturbances εt are labeled the
innovations in the VAR (p) model.
Many data sets involve several related time series, and we are interested in the
dynamic relationship among the series. The most commonly used vector time series
model is the VAR (p) model, particularly in economics and business, and the VAR (p)
model often provides a suitable framework for conducting tests of theoretical interest.
Classical theory on the VAR (p) model assume the innovations εt in (1) are
normally distributed. To avoid wrong conclusions in multivariate time series analysis,
the distributional assumption on the innovations should be checked. Let F be the
unknown distribution of the innovations εt and let F0 be the Nd(0,∑) distribution. We
wish to test the hypothesis:

*Corresponding author: Yan SU, School of Mathematics and Physics, North China Electric Power
University, Baoding China, E-mail: suyanuf@163.com
Yi-Shu ZHONG, School of Mathematics and Physics, North China Electric Power University, Baoding
China
Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model    95

H0:F = F0 (3)

In Pearson’s χ2 test, it is not clear how the number of the classes should be determined.
Applying the probability integral transformation, any completely specified continuous
probability density function can be transformed to uniformity. Let U(0,1) denotes the
uniform distribution on the interval (0,1). Neyman constructed the smooth test to be
asymptotically optimal for testing U(0,1) distribution [1]. Neyman’s idea is that the
null hypothesis density is embedded in a parameter smooth alternative, such that
when the vector of parameter η = 0, the alternative is the same as the hypothesized
distribution. Then testing for uniformity is equivalent to testing

H0: η = 0 against H1: η ≠ 0.

Let ώd denote the surface of a unit sphere centered at the origin in Rd and let U(ώd)
denote the uniform distribution on ώd. The goodness-of-fit test for the multivariate
normal distribution can be translated into the goodness-of-fit test for the uniform
distribution on ώd. Based on the smooth test for uniformity on the surface of a unit
sphere [2], Su and Huang proposed the smooth test for multivariate normality [3],
power simulation showed that the smooth test has good power against a wide
variety of alternatives.

Let εˆt be the residuals of the VAR(p) model. The asymptotic null distribution
of the transformed residuals is U(ώd). Therefore, the goodness-of-fit test for the
multivariate normal distribution of the innovations εt in (1) can be translated into
the goodness-of-fit test for U(ώd). Based on the smooth test for U(ώd), We propose a
novel test for the multivariate normal distribution of the innovations εt in (1). The
transformation based on Cholesky decomposition leads to transformed residuals
whose joint distribution asymptotically does not depend on the unknown parameter
∑ of the Nd(0,∑) distribution. Thus, the critical values of the test statistic can be
estimated by Monte Carlo methods with ∑ = Id, where Id denotes the d × d identity
matrix.
The paper is organized as follows. In Section 2, we introduce the VAR(p) model
and some lemmas. In Section 3, the smooth test for multivariate normal innovations
is proposed. The asymptotic null distribution of the transformed residuals is
obtained. In Section 4, the algorithm to compute the test statistic and the algorithm
to estimate the critical values are given. Section 5 concludes and discusses further
research.

2 The VAR(p)model and some lemmas

Let yt, c, ϕt be defined in (1) and let

xt = (1 yTt −1 yTt −2  yTt − p )T , β T = (c φ1 φ2 φ p ) d ×( dp +1) . (4)


96   Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model

Then (1) can be written as

yt = β T xt + ε t , t = 1, , n ,
where β t denotes the transpose of
T
βt . Let
=Y ( y= T
1 ,  , yn ) , X x1 , , xn )T , ε (ε1 , , ε n )T .
(=
Then the VAR(p) model(1)-(2) takes the form

Y Xβ +ε
= , (5)

E[vec(ε )]= 0, Cov[vec(ε T )]= I n ⊗ Σ, (6)


T

where Y and ε are n×d random matrices, X is a known n×(dp+1) matrix, and β is
an unknown (dp+1)×d matrix. Here, the sign ⊗ denotes the Kronecker product of
matrices.

Lemma1[4] Let the VAR(p) model be defined by (5) and (6). Let ε t ⊔ � N d (0, Σ) and let
y− p +1 , y− p + 2 , , y0 be given. Then the conditional maximum likelihood estimators
of β and Σ are

βˆ = ( X T X ) −1 X T Y , (7)
1
Σˆ = εˆT ε , εˆ= (ε1 , ε n )T= Y − X βˆ ,. (8)
n
Lemma2[4] Let the VAR(p)model be defined by(5) and(6),let β̂ and Σ̂ be defined by
(7) and (8), respectively. Let

PX = X ( X T X ) −1 X T (9)
and let the solutions of the determinant equation
I d − φ1 B − φ2 B 2 −  − φ p B p =
0
are greater than 1 in absolute value. Then
εˆ (ε1 , , ε n )=
(a) =
T
( I n − PX )ε .
p p
(b) βˆ → β , Σˆ → Σ , as n → ∞.

Lemma3[5] (The Cholesky decomposition). If G is a d×d positive definite matrix then


there exists a unique d×d lower-triangular matrix L with positive diagonal elements
such that G = LLT.
Let s = ( s1 , s2 , , sd ) denotes a typical point in Rd. For α = (α1 , α 2 ,α d ) a
T T

multi-index, define
sα = s1α1 s2α 2  sdα d , α = α1 + α 2 +  + α d , Dα = D1α1 D2α 2  Ddα d , (10)
Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model    97

αj
where D j denotes the αjth partial derivative with respect to the jth coordinate
variable. The collection of all spherical harmonics of degree m will be denoted
by Hm(ώd). Let σ be the normalized surface-area measure on ώd (so that σ(ώd)=1).
Let the inner product on Hm(ώd) be defined by
∫Ωd
p (τ )q (τ )dσ (τ ) , p, q ∈ H m (Ω d ) .
α 2-d
Lemma4[6] If d>2 then the set {D = s : α m and α1 ≤ 1}
α
is a vector space basis of Hm(ώd), where D is defined in (10) and s denotes the
Euclidean norm of s.

(d)
Definition1[7] Let U ⊔ � U (Ω d ) . An d×1 random vector ζ is said to have a spherical
d
distribution if ζ has a stochastic representation ζ = γ ⋅U (d) for some random variable
d
γ≥0, which is independent of U(d). Here = signifies that the two sides have the same
distribution.

Lemma5[7] If a d×1 random vector ζ has a spherical distribution then ζ ζ ⊔


� U (Ω d ) ,
where ⋅ denotes the Euclidean norm.

Lemma6[2] Let
= N k ,d dim[ H k (Ω d )] . Let B=
k {Vk , j (u ) ∈ H k (Ω d ),=
j 1, 2, , N k ,d }
be a CONB for Hk(ώd). Let B = {Bk : k = 0,1,, m} . Then B is a set of orthonormal
functions.
Let A = B\B0 and let us denote N = #(A), we have

N= d + ∑ k = 2 N k ,d
m
, (11)

where # denotes cardinality. The elements of A are arranged with k = 1,...,m. The set A
can be written as A = {hi(u):i = 1,...,N} with

h1 (u ) = V1,1 (u ) ,  , hN (u ) = Vm , Nm ,d (u ). (12)
Let f (⋅) be a probability density function on Ωd and let ad denote the surface-area of
Ωd. Let
1
f 0=
(u ) , u ∈ Ω d , (13)
ad
Then f 0 is uniform on Ωd. Consider the null hypothesis
H 0 : f (u ) = f 0 (u ). (14)
A smooth alternative probability density function can be defined by [2]
N
g N (u ,η ) = C (η ) exp{∑ηi hi (u )} , (15)
i =1

where η = (η1 ,η 2 , ,η N ) and h1 , h2  , hN are defined by (12).


T
98   Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model

(d ) (d )
Lemma7[2] (Smooth test for U(Ωd)) Let U1 ,,U n be a random sample from g N (u ,η )
defined in (15). Then (a) The score statistic Ψ N for testing H 0 : η = 0, H1 : η ≠ 0
is N
1 n
= ΨN
=i 1 =j 1
∑ W
= i
2
, Wi
n

hi (U (j d ) ) , (16)

(b) Under H 0 : η = 0 , Ψ N is asymptotically distributed as χ N random variable, where


2

χ N2 represents chi-square distribution on N degrees of freedom.

Remark1 Theoretical analysis and power simulation showed that the smooth
tests for U(ώd) based on spherical harmonics of degree at most m = 2 are generally
powerful. Let ΨN(B1), ΨN(B2) and B11 ∪ BB2)2 )denote ΨN in (16) constructed by B1,
Ψ NΨ(N(B
B2Ψand
N ( B
B11 ∪ B
B22 )
in Lemma6, respectively. ΨN(B1) can be used to detect the center
of mass of U(ώd), ΨN(B2) can be used to detect the moment of inertia of U(ώd).
B11 ∪ BB2)2 )combines ΨN(B1) and ΨN(B2) [2].
Ψ NΨ(N(B

3 Smooth test for multivariate normal distribution of innovations

Let Σ and Σ̂ be defined in (6) and (8), respectively. Let the Cholesky decomposition of
Σ and Σ̂ be
Σ= [ L(Σ)][ L(Σ)]T , Σˆ= [ L(Σ)][ L(Σ)]T , (17)
−1
respectively. Let L be the inverse of L and let εˆt be defined in (8). Let
[ L( ˆ )] εˆt , t =
zt =Σ −1
1, , n, Z = ( z1 , , zn )T , (18)

ξt( d ) z=
= t zt (ξ1t , , ξ= T
dt ) , t 1, , n . (19)
Theorem1 Let the conditions of lemma2 hold. Let the n × d matrix Z and the d -
vectors ξt , t ≤ n be defined in (18) and (19), respectively. Then
(d )

(a) The asymptotic distribution of zt is N d (0, I d ) and z1 , , zn are asymptotically


independent. The distribution of Z asymptotically does not depend on Σ in (6).
(b) The asymptotic distribution of ξt is U (Ω d ) and ξ1 , , ξ n are asymptotically
(d) (d) (d)

independent.

Proof By Lemma2(b), we have


P P
εˆt → ε t ⊔� N d (0, Σ) , L(Σˆ ) → L(Σ), n → ∞ , (20)
where L(Σ) and L(Σ ˆ ) are defined by (17). Thus
P
z=
t [ L(Σˆ )]−1 εˆt → z=
t [ L(Σ)]−1 ε t , n → ∞ , (21)
Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model    99

where ε t is defined in (6). Since ε t ⊔ � N d (0, Σ) , by (20) and (21), we have


zt � N d (0, I d ) . Therefore, the asymptotic distribution of zt is N d (0, I d ) which

we write as
a
zt ⊔� N d (0, I d ) . (22)
Thus, the desired results of (a) is proved. By (21)- (22), and Lemma5, the desired result
of (b) is obtained.
Let ξt = (ξ1t , , ξ dt ) be defined in (19). Let h1 , , hN be defined in (15) and let
(d ) T

 (εˆ ) =∑ W 2 , W = 1 ∑ h (ξ ( d ) ) . (23)
N n
 =Ψ
Ψ N N i i i j
=i 1 = n j1

Remark2 Consider the null hypothesis (3), where F0 denotes the N (0, Σ) distribution
with the parameter Σ unknown. By Theorem1,a the goodness-of-fit test for F0 can be
translated into the goodness-of-fit test for ξt ⊔ � U (Ω d ) , i = 1, , n .
(d)

Remark3 Theorem1 indicates that ξ1 , , ξ n are asymptotically independent


(d) (d )

 (εˆ ) can be
U (Ω d ) random vectors. Hence, the critical values of the test statistic Ψ
estimated by Monte Carlo simulation with Σ = I d .

Remark4 The Ψ  (εˆ, B ) are called components of the entire statistic


 (εˆ, B ) and Ψ
1 2
Ψ (εˆ, B ∪ B ) , the components are asymptotically independent. One advantage of
1 2
Ψ (εˆ ) is that the orthonormal system B can be chosen to give good power against
k
particular alternatives.

4 The algorithm to implement the test statistic

The algorithm to compute Ψ  (εˆ ) in (23) consists of the following steps:


1. Compute the values of εˆt , t = 1, , n and Σ̂ in (8), respectively.
2. Compute the value of Z in (18).
3. Compute the values of ξ1 , , ξ n in (19).
(d) (d)


4. Compute the values of Wi , i = 1, , n in (23).
5. Compute the value of Ψ (εˆ ) in (23).

 (εˆ ) .
The multivariate normality is rejected for large value of Ψ
The algorithm to estimate the critical values of Ψ (εˆ ) consists of the following

steps:
1. Generate ε1 , ,ε n from the multivariate normal distribution N d (0, I d ).
∗ ∗

2. Let ε =(ε1 , ,ε n ) . By Lemma2(a), compute


∗ ∗ ∗ T
ˆ ∗ (ε1∗ , , ε n∗ )=
ε= T
( I n − PX )ε ∗ ,
where PX is defined in (9).
100   Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model

3. Compute [ L(Σˆ ∗ )]−1 εˆi∗ , i =


Σˆ ∗ =[εˆ ∗ ]T ε ∗ / n , zi∗ = 1, , n,
ˆ ∗=
where Σ

[ L(Σ )][ L(Σ )] . ∗ T

=
4. Compute ξi∗ (d) ∗
z=
i zi∗ (ξ1∗i , , ξ=
∗ T
di ) , i 1, , n .
1 n
5. Compute Wi∗ = ∑ hi (ξ j(d)∗ ) , i = 1, , N .
n j =1 N
6. Compute Ψ =Ψ (εˆ ∗ )= ∑ Wi∗2 .
 ∗ 
i =1

Steps 1-6 are then repeated M times to give a sample of replicates Ψ  ,Ψ  , , Ψ


∗  .
∗ ∗
1 2 M
  
Let Ψ (1) , Ψ (2) , , Ψ ( M ) be the order statistics and let α ∈ (0,1). The critical values
∗ ∗ ∗

(the (1 − α ) - percentiles) for Ψ  (εˆ ) can be estimated from Ψ  ∗ ,Ψ


(1)
 ∗ , , Ψ
(2)
∗ .
(M )

4 Conclusion

Suppose that the innovations εt of a stationary VAR(p) model follow a multivariate


normal distribution with mean zero and positive-definite covariance matrix Σ . Then
the normalized vectors ξt in (19) should be approximately uniformly distributed
(d)

on Ω d . Based on the smooth test for U (Ω d ) defined by Lemma7, the test statistic
 (εˆ ) in (23) is constructed which possesses nice properties. For the given sample
Ψ
size n, the critical values can be estimated by Monte Carlo methods.
The assumption of elliptic symmetry plays an important role in robustness
studies. Elliptical distributions include the multivariate normal, the multivariate t,
the multivariate Pearson Type II, and many other distributions. Elliptical distributions
are useful tools for modeling multivariate time series data since they provide an
alternative when the normality assumption fails. Su suggested the smooth test for
testing the hypothesis of elliptical symmetry[8]. The smooth test for the multivariate
normal distribution of the innovations in the VAR(p) model can be extended to testing
the elliptical distribution of the innovations in the VAR(p) model.

References
[1] J. Neyman, “ “Smooth test” for goodness of fit,” Skandinaviske Aktuarietidskrift, vol. 20, pp.
150-199, 1937.
[2] Y. Su, X. K. Wu, “Smooth test for uniformity on the surface of a unit sphere,” Proceedings of the
2011 International Conference on Machine Learning and Cybernetics, IEEE Press, pp. 867-872,
2011.
[3] Y. Su, Y. P. Huang, “Smooth test for multivariate normality, ” Advances in Intelligent Systems
Research, vol. 124, pp. 1690-1696, 2015.
[4] J. D. Hamilton, Time Series Analysis. Princeton, New Jersey: Princeton University Press, 1994.
Smooth Test for Multivariate Normality of Innovations in the Vector Autoregressive Model    101

[5] R. J. Muirhead, Aspects of Multivariate Statistical Theory, Hoboken, New Jersy: John Wiley &
Sons, Inc., 2005.
[6] S. Axler, P. Bourdon and W. Ramey, Harmonic Function Theory, New York : Springer-Verlag, New
York, Inc., 2001.
[7] K. T., Fang, S. Kotz and K. W. Ng, Symmetric Multivariate and Related Distributions, London,
New York: Chapman & Hall, 1990.
[8] Y. Su, “Smooth test for elliptical symmetry,” Proceedings of the 2012 International Conference
on Machine Learning and Cybernetics, IEEE Press, pp. 1279-1284, 2012.
Bi-kuan YANG*, Guang-ming LIU
Research on the Privacy-Preserved Mechanism of
Supercomputer Systems
Abstract: With the development of HPC technology, supercomputers are carrying
more and more applications. Because supercomputers usually adapt shared storage
systems to provide storage services, users’ core data isolation and confidentiality under
the Multi-users service mode has become a problem which needs to be solved. In this
paper, according to the privacy-conservation’s present situation and the requirement
of super-computers, we induce a mechanism suitable for super-computers which has
low performance overhead and secure users’ isolation – Dynamic Group Visible Zone
(DGVZ). DGVZ adapts the Chroot mechanism and the file path mapping mechanism
provided by the Linux kernel. According to the experimental evaluation, it can
effectively preserve users’ privacy information with merely 2% to 3% performance
cost to supercomputers.

Keywords: supercomputer systems; privacy preserving; computer systems

1 Introduction

HPC applications have obtained rapid development with the escalation and
popularization of High-performance Computing (HPC) technology. They play
an increasingly important role in meteorology, aerospace, petroleum, biological
medicine, and other fields. The computing capability of a country has become a
national comprehensive strength of science and technology. With the expansion of the
application field of HPC and the popularity of the Internet, supercomputing centers
provide high-performance computing services to all users through the Internet.
However, currently supercomputer developers focus more on improving system
performance, and there is a lack of high degree of attention on the user data privacy
problem. The Users login process of the supercomputers as shown in Figure 1. For users
who pass the VPN authentication to enter the supercomputers, they first complete
the manipulations associated with running the programs, then they submit their jobs
to the system resource controller, finally the system resource controller will allocate
system resources for users’ running jobs. Under the assurance of abundant resources
supplied by the supercomputers, users can therefore access the system resources
concurrently to finish their jobs. For the convenience of users’ running their jobs, the

*Corresponding author: Bi-kuan YANG, School of Computer Science, National University of Defense
Technology, Changsha, China, E-mail: yangbikuan@126.com
Guang-ming LIU, National Supercomputer Center in Tianjin, Tianjin, China
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   103

supercomputers allow users to store their data files required by the jobs. Because of the
fact that jobs running in the supercomputers are always computing jobs of important
industries requiring low computing time and large calculation, so their data files
contain a lot of sensitive information, and moreover the supercomputers can easily
become the target of hackers to obtain the privacy data of vital industries. For the
consideration of providing high-performance computing services, supercomputers
may often deploy distributed file systems to provide shared high-performance massive
storage services, and the Lustre file system is a representative of this category. It has
the characteristics of high scalability and high performance, and it can support large
amounts of clients’ concurrent access, moreover, it has impressively high bandwidth.
70% of TOP 10 supercomputers, 50 % of TOP 30 supercomputers, and 40 % of TOP 100
supercomputers have already deployed this distributed file system. Because of these
facts above, latter research will be established on the foundation of Lustre file system.

Figure 1. Users’ login process of the supercomputers

Because users have access to the root directory of the Lustre file system, so each
user can view the file system’s global architecture, thus each user can look up the
subdirectories’ metadata information through the root directory. Generally all users’
data files are stored in the users’ working directory under the shared directory, so
users can view the metadata information of other users’ working directories, which
can easily cause privacy leaks.
This paper mainly introduces a privacy preserving mechanism based on the
“namespace” mechanism of the operating systems-dynamic group visible zone
104   Research on the Privacy-Preserved Mechanism of Supercomputer Systems

(DGVZ). Through the grouped division of the system’s abstract resources such as
file system, it can determine the group visible zone (GVZ) inside which users can
only view the resources belonging to themselves and the shared library. Through
the DGVZ we can establish a virtual running environment. When users log in
supercomputers, the system will load the user’s login process into the virtual
running environment. After users complete their jobs, the system will disconnect
itself from users. In users’ entire operation cycle, users’ jobs are always in the
mutual isolation interference in operation environment, thus users’ data privacy
can be effectively ensured.

2 Related Works

At present there are mainly two kinds of isolation technologies: hardware


virtualization and operating-system virtualization. Hardware virtualization
is mainly implemented by constructing a virtual machine isolation running
environment.

2.1 Virtualization

Virtual machines were introduced on IBM mainframes in the 1970s [1] and then
reinvented on x86 by VMware [2] in the late 1990s. Xen [3] and KVM [4] brought
VMs to the open source world in the 2000s. The overhead of virtual machines
was initially high but has been steadily reduced over the years due to hardware
and software optimizations. Hardware virtualization provides high isolation by
abstracting hardware resources, and it has been widely used in cloud computing
platforms. However, virtual machines created by hardware virtualization may
cause loss to the system’s performance. Moreover, along with the migration of users’
task in the supercomputer, the creation cost and the management cost of virtual
machines may have pretty affect on the system performance of the supercomputer.
So, hardware virtualization is not popular with the supercomputers.

2.2 Operating system level virtualization

Operating system level virtualization also has a long history. In some sense
the purpose of an operating system is to virtualize hardware resources so they
may be shared, but Unix traditionally provides poor isolation due to global
namespaces for the filesystem, processes, and the network. apability-based OSes
provided container-like isolation by virtue of not having any global namespaces
to begin with, but they died out commercially in the 1980s. Plan 9 introduced per-
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   105

process filesystem namespaces [5] and bind mounts that inspired the namespace
mechanism that underpins Linux containers. The Unix chroot() feature has long
been used to implement rudimentary “jails” and the BSD jails feature extends
the concept. Solaris 10 introduced and heavily promoted Zones [6], a modern
implementation of containers. Linux containers have a long and winding history.
The Linux-VServer [7] project was an initial implementation of “virtual private
servers” in 2001 that was never merged into mainstream Linux but was used
successfully in PlanetLab. The commercial product Virtuozzo and its open-source
version OpenVZ [8] have been used extensively for Web hosting but were also not
merged into Linux. Linux finally added native containerization starting in 2007
in the form of kernel namespaces and the LXC userspace tool to manage them.
Platform as a service providers like Heroku introduced the idea of using containers
to efficiently and repeatably deploy applications [9]. Rather than viewing
a container as a virtual server, Heroku treats it more like a process with extra
isolation. The resulting application containers have very little overhead, giving
similar isolation as VMs but with resource sharing like normal processes. Google
also pervasively adopted application containers in their internal infrastructure
[10]. Heroku competitor DotCloud (now known as Docker Inc.) introduced Docker
[11] as a standard image format and management system for these application
containers. There has been extensive performance evaluation of hypervisors, but
mostly compared to other hypervisors or non-virtualized execution. [12,13, 4].
These operating-system virtualization tools above have realized a relatively good
compromise of running environment isolation and performance. But in these
tools, Docker serves as a new product of the development of container technology
in recent years, and it relies highly on the running environment. But in order
to ensure user running stability, supercomputers generally adopted an earlier
version of the system environment, which may cause Docker cannot function well.
Because of the compute nodes in supercomputers generally take the lite version
of the operating system kernel in order to ensure the users’ tasks performance, so
even earlier tools such as OpenVZ may have problems of the environment support
and the management of the users’ process migration. So these tools also failed to
be widely used in supercomputers.

3 Related Concepts

Linux namespace is the Linux operating system kernel built-in lightweight


operating system virtualization implementation. It can build several isolated
user running environment, also known as “namespace” virtual environment. It
provides the isolation mechanism of the host domain, IPC, file systems, processes,
network, and so on. Through this mechanism, system resources such as PID, IPC,
network are no longer global, but belong to a particular namespace. Resources
106   Research on the Privacy-Preserved Mechanism of Supercomputer Systems

under each namespace are invisible and transparent for other resources under
the other namespaces. Linux namespace creates isolated running environments
through file path mapping and chroot mechanism. Chroot sets the specific
directory of the current process as the root directory of the namespace. The
process is limited in this root directory, and it can’t see the changes outside this
root directory, which may ensure the process is isolated. Through the file path
mapping we can bind the directories outside the namespace with the directories
inside the namespace, by which we can establish a virtual running environment
for the process. When users access the files inside the namespace, what they get
is the information of the actual files outside the namespace that have been bound
with these files. Through these measurements we can ensure the normal services.

4 Dynamic Group Visible Zone Framework

Dynamic grouping visible domain framework consists of the following several


parts: DGVZ-based management module, login node group visible zone (LNGVZ)
and compute node group visible zone (CNGVZ). The framework is shown in
Figure 2.
Login nodes mainly provide the functions of logging in, programming,
compiling and job loading. Login node-based LNGVZ provides every user a virtual
running environment which is customized according to the groups’ membership.
LNGVZ-oriented Normal Service Assurance Module provides the function that
users can normally enjoy services in the LNGVZ, which let them feel like they are in
original environments. In this environment users can complete programs, edit and
compile processes. Compute nodes mainly provide high-performance computing
capability through the system’s internal high-speed network. Compute node-
based CNGVZ provides every user a virtual running environment that likes LNGVZ.
For users within the same user group, except LNGVZ and CNGVZ are different in
the runtime library support needed by the corresponding node’s normal services,
they keep the same to ensure the user’s normally function. Different groups’ DGVZ
differs in the privacy data file required to ensure that users of specific group can
properly finish their computing jobs.
A DGVZ-based management module mainly has the following functions:
first, it can listen to users’ connection request in specific ports; second, it can
authenticate users’ identification; third, it can establish the LNGVZ for users and
load the user’s login process into the LNGVZ; fourth, it can dynamically create
CNGVZ in compute nodes for users and load the user’s jobs into the CNGVZ.
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   107

Figure 2. DGVZ Framework

5 The realization of dynamic grouping visible domain

5.1 Establishment of Login Node Group Visible Zone (LNGVZ)

LNGVZ is based on the composition of the principle of namespace – the file path
mapping and chroot mechanism. Namespace provides users with a virtual running
environment of copy of the current system resources. However, as the proc file system
serves as a pseudo file system, it can only be accessed by the root user. We transform
the normal directories to the processes’ root directories, and we specify these root
directories through chroot, thus we can limit users to see the upper directories.
108   Research on the Privacy-Preserved Mechanism of Supercomputer Systems

Normal Service Assurance Module transform the runtime library required by the
Login Node users to the directories under the root directory to constitute a virtual
running environment, which makes users can still enjoy the services of the Login
Node in the original environment. Meanwhile, it transforms the user’s data directories
stored on the Lustre file system to this virtual running environment. The operations
above constitutes the user’s LNGVZ. Files access inside the LNGVZ is actually the
files being mapped. The virtual running environment doesn’t contain actual files, so
it doesn’t have space costs. Users can’t get out of their LNGVZ because of the root
directories’ restriction. Any access to the outside domain may be blocked. In the
LNGVZ only user’s data files can be viewed. A LNGVZ is transparent to each other
and doesn’t have connection to other LNGVZs. So user privacy data’s isolation can be
secured in the Login Node.

5.2 Establishment of Compute Node Group Visible Zone(CNGVZ)

CNGVZ is also based on the file path mapping and chroot mechanism. We transform
the normal directories to the processes’ root directories, and we specify these root
directories through chroot, thus we can limit users to see the upper directories. Normal
Service Assurance Module transform the runtime library required by the Compute
Node users to the directories under the root directory to constitute a virtual running
environment, which means users can still enjoy the services of the Compute Node in
the original environment. Meanwhile, it transforms the user’s data directories stored
on the Lustre file system to this virtual running environment. The operations above
constitutes the user’s CNGVZ.
Through the global file system view provided by the Lustre file system, the
consistency of users’ data directories shared by the Login Node and Compute Node is
still kept by the LNGVZ and CNGVZ. Users can’t get out of their CNGVZ because of the
root directories’ restriction. Any access to the outside domain may be blocked. In the
CNGVZ there are only user’s data files can be viewed. A CNGVZ is transparent to each
other and doesn’t have connection to other CNGVZs. So user privacy data’s isolation
can be secured in the Login Node.

5.3 DGVZ-based Management Module

DGVZ-based Management Module is an important part that connects the LNGVZ


with the CNGVZ and ensures the user’s jobs effectively execute in the isolation
environment of the Login Node and the Compute Node. It’s established based on the
Slurm, and it consists of these parts below: User Connection Request Listen Module;
User Identification Module; Compute Node Resource Audit and Scheduling Module;
DGVZ Establishment and Maintenance Module; DGVZ Load Module.
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   109

The user Connection Request Listen Module is responsible for listening the
specific port such as SSH port 22 to monitor the request through the global daemon.
The global daemon is a background process that is independent of the terminal and
periodically handles some specific affairs or waits for some specific events. It becomes
independence through creating a new session when it is created by the parent process,
and it changes the working directory from the parent process, after that it resets the
file permissions mask to prevent the default permissions inherited from the parent
process cannot access the required files, finally it closes the useless file descriptors
inherited from the parent process to avoid a waste of the system resources. After
doing this, the global daemon running in the background is created. While running
in the background, if it detects aconnection request, it will call the User Identification
Module to authenticate the user’ ID.
A user Identification Module is established on the Linux built-in PAM modules. It
reads the uid, gid and the shell environment expression from the file /usr/bin/passwd,
and then it verifies the password according to the uid, after that it uses the PAM’s
shared library to validate the user’s identification according to the PAM configuration
file. If the user can satisfy all the requirements, the user’s login process will be loaded
to the specific Shell environment. If this login is the user’s first login, then the DGVZ
Establishment and Maintenance Module will be called later, or else the DGVZ Load
Module will be called later.
The compute Node Resource Audit and Scheduling Module uses the Cgroups
mechanism to limit the resource occupation of user’s process. There are two kinds of
processes in the Compute Node, including the system process and the user process.
By default there is no resource limitation to the system process, and user process’s
resource limitation is determined by the system when the user submits the jobs and
the resource requirement. We can restrict the processes’ resource upper limit by
writing the specific parameters to the Cgroups Configuration Files.
DGVZ Establishment and Maintenance Module is responsible for the establishment
and maintenance of the LNGVZ and the CNGVZ through chroot and file path mapping
mechanism. What LNGVZ differs with CNGVZ is only the runtime library required by
the corresponding nodes. The data privacy files provided by the Lustre file system
keep the same in the LNGVZ and the CNGVZ. When the establishment is completed,
the DGVZ Establishment and Maintenance Module will write the root directories
information to the file /etc/passwd and the file /etc/shadow, thus DGVZ Load Module
can finish the latter tasks.
The DGVZ Load Module loads the user process to the corresponding virtual
running environment according to the user’s identification. In the Login Node, it
reads the user item from the file /etc/passwd and the file /etc/shadow, after that it
changes the current path to the virtual root directory’s path, and then it changes the
ownership of the terminal for the convenience of the user, finally it load the user
login process into the corresponding LNGVZ. In the Compute Node, it also reads the
user item from the file /etc/passwd and the file /etc/shadow, after that it changes the
110   Research on the Privacy-Preserved Mechanism of Supercomputer Systems

current path to the virtual root directory’s path, and then it changes the ownership of
the terminal for the convenience of the user, finally it load the user compute process
into the corresponding CNGVZ.
Within the framework of the DGVZ, user’s using process is as follows. In the
LNGVZ user sends the task requests through srun command, and then the central
management process ‘slurmctld’ of the slurm in the DGVZ-based Management Module
audits the resource requirement and allocates the resource to the process, finally the
node management process ‘slurmd’ of the slurm in the DGVZ-based Management
Module loads the user process into the corresponding CNGVZ. The working framework
is showed in Figure 3.

Figure 3. DGVZ working framework

6 Experiment Evaluation and Analysis

The experimental setup consists of four interconnected nodes of TianHe-1A


supercomputers, each with a 2.93 GHz Intel Xeon 5675 processor, 24GB of RAM, 1PB
shared memory of Lustre Storage System, interconnected by the TianHe high-speed
Internet Network.
Aiming to improve user privacy, we test the DGVZ’s privacy preserving in the file
system, process, IPC, and the user/group. The test results are as Table 1. In the file
system level, we use ls command to check the information in the /home directory in
the Lustre file system, and we find that by the contrast to the original environment in
DGVZ other users’ metadata, such as working directories’ attributes, has been filtered.
In the process space level, we use ps command and we find that by the contrast to the
original environment in DGVZ other users’ process information has been filtered. In
the IPC level, we use ipcmq command to check the IPC’S message queue, and we find
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   111

that by the contrast to the original environment in DGVZ other users’ message queue
information has been filtered. In the user/group level, we check the file /etc/passwd
and the file /etc/shadow, and we find that by the contrast to the original environment
in DGVZ other users’ information has been filtered. So, through the DGVZ, we can
effectively improve user data privacy isolation in supercomputers.

Table 1. Table type styles

Experiment Results
Objects
Actions Original Environment DGVZ

File System Check the metadata information of other users × √

Process Space Check other users’ processes × √

IPC Check other users’ message queues × √

User/Group Check other users’ attribute information × √


Notes: In this experiment, √ represents the corresponding privacy has been preserved: × represents
the corresponding privacy hasn’t been preserved

Figure 4 and Figure 5 show experimental results that the comparison of the
standard Benchmark’s running in the original environment and running in the DGVZ
environment, and the contrast of performance of compiling the programs. Figure 4
shows that the time of compiling several programs in the original environment and
the time of compiling several programs in DGVZ environment. According to the
result, these applications’ total compiling time in original environment and DGVZ
environment is equal, and the difference is less than two percent.

104
101.7
102
100
100

98
96.5
Compiling Time
96 95

94

92

90
gmp NPB

Original Environment DGVZ

Figure 4. Compiling Time of gmp and NPB


112   Research on the Privacy-Preserved Mechanism of Supercomputer Systems

100 93.11 91.17


90.25 87.95
90
80
67.25 65.57
70 62.32 60.97
60
Disk Throughput
50
(MB/s)
40
30
20
10
0
Read Random Read Write Random Write

Original Environment DGVZ

Figure 5. Disk throughput, in MB/s, of read, random read, write, and random write operations

As is shown in Figure 5, the performance of disk I/O throughput in DGVZ is up to


97% of the performance of disk I/O throughput in original environment, and the
performance loss is between 2% and 3%.

7 Conclusion

DGVZ is a privacy-preserving isolated running environment abstraction mechanism. It


can ensure the performance of the system’s services are not affected, and meanwhile
it transforms the traditional shared-running mode in the supercomputers into the
isolated-running mode. By the contrast with the traditional virtualization method,
it reduces overhead in performance and management resulting from abstraction of
system resources, and it provides low-overhead service assurance, thus users can
conveniently enjoy the original services.

References
[1] R. J. Creasy. The origin of the VM/370 time-sharing system.IBM Journal of Research and
Development, 25(5):483–490,Sep 1981. ISSN 0018-8646.
[2] M. Rosenblum and T. Garfinkel. Virtual machine monitors:current technology and future trends.
Coputer, 38(5):39–47,May 2005. ISSN 0018-9162.
[3] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho,R. Neugebauer, I. Pratt, and A.
Warfield. Xen and the art of virtualization. In Proceedings of the Nineteenth ACM Symposium on
Operating Systems Principles, SOSP ’03, pages 164–177, New York, NY, USA, 2003.. URL http://
doi.acm.org/10.1145/945445.945462.
 Research on the Privacy-Preserved Mechanism of Supercomputer Systems   113

[4] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori. KVM: the Linux virtual machine monitor. In
Proceedings of the Linux Symposium, volume 1, pages 225–230, Ottawa, Ontario, Canada, June
2007. URL http://linux-security.cn/ebooks/ols2007/
[5] OLS2007-Proceedings-V1.pdf.
[6] R. Pike, D. Presotto, K. Thompson, H. Trickey, and P. Winterbottom. The Use of Name Spaces
in Plan 9. In Proceedings of the 5th Workshop on ACM SIGOPS European Workshop:Models
and Paradigms for Distributed Systems Structuring,pages 1–5, 1992.. URL http://doi.acm.
org/10.1145/506378.506413.
[7] D. Price and A. Tucker. Solaris Zones: Operating system support for consolidating commercial
workloads. In LISA,volume 4, pages 241–254, 2004.
[8] S. Soltesz, H. Potzl, M. E. Fiuczynski, A. Bavier, and L. Peter-son. Container-based operating
system virtualization: A scalable, high-performance alternative to hypervisors. In Proceedings
of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, EuroSys 07,
pages 275–287,2007.
[9] URL http://doi.acm.org/10.1145/1272996.1273025.
[10] OpenVZ. http://openvz.org/.
[11] J. Lindenbaum. Deployment that just works.https://blog.heroku.com/archives/2009/3/3/
[12] deployment_that_just_works, Mar 2009.
[13] Eric Brewer. Robust containers. http://www.slideshare.net/dotCloud/
Jiu-chuan LIN, Yong-jian WANG, Rong-rong XI*, Lei CUI, Zhi-yu HAO
Reliability Evaluation Model for China’s Geolocation
Databases
Abstract: Geolocation databases are widely used for mapping Internet devices’
IP addresses to physical locations. However, many databases available on the
Internet may not provide as accurate location information as they claim. Thus, the
determination on whether a given database is reliable is required. To assess the
reliability of a geolocation database, this paper presents a new method based on
a dynamic trust model. In this method, the reliability of a geolocation database is
dynamically adjusted based on the interaction between geolocation databases. We
conduct a set of experiments, and the results based on ground truth dataset show
that the proposed method can objectively assess the reliability of China’ mainstream
geolocation databases at the provincial granularity. Moreover, it provides a solution
to resolve inconsistencies between geolocation databases.

Keywords: Geolocation Database; Reliability Assessment; Trust Model

1 Introduction

With the emergence of network applications based on geographic location, IP


geolocation techniques have become the cornerstone of many Internet services. IP
geolocation is the process of mapping Internet devices’ IP address to the physical
location. The mapping methods fall into two types: an active approach and passive
approach. The active IP geolocation techniques [1-3], are typically based on delay
measurements, which is more accurate than a passive approach. However, due to the
existence of loops in the network, we cannot directly depend on the linear relationship
between delay and geographic distance to locate the target. Moreover, many other
factors limit the ability of location. Such as delay jitter, measurement overhead
and asymmetric path, etc. The passive IP geolocation techniques [4-9], also called
database-driven geolocation, infers  the target geographic location  by mining the
host name information, which is queried from DNS servers or Whois databases. The
advantages of this approach include low measurement overhead, low computation

*Corresponding author: Rong-rong XI, Institute of Information Engineering, Chinese Academy of


Sciences, Beijing, China, E-mail: xirongrong@iie.ac.cn
Jiu-chuan LIN, Yong-jian WANG, Key Laboratory of Information Network Security of Ministry Of Public
Security, the Third Research Institute of Ministry Of Public Security, Shanghai, China
Lei CUI, Zhi-yu HAO, Institute of Information Engineering, Chinese Academy of Sciences, Beijing,
China
 Reliability Evaluation Model for China’s Geolocation Databases   115

overhead and high speed, etc. Unfortunately, this approach relying on geolocation
database may resolve the IP address into incorrect geographical location.
Incorrect location information limits the ability of data-driven geolocation
technology. The main reason is that the database acquired from database provider
is always out-of-date, and it is impossible to update the database in time, so that
the geographic information that exists in the database is antiquated and thus is
unreliable. As a result, determination on whether a given database is reliable or
not is necessary. To quantify the reliability of geolocation databases, we propose an
evaluation method of geolocation databases based on dynamic trust model. The key
idea is that we analyze the reliability of IP location attribute values based on dynamic
trust model. We conduct the evaluation on five geolocation databases in China, and
the results show that the proposed model is able to evaluate the reliability of different
geolocation database effectively.
The rest of the paper is organized as follows. The next section presents the previous
related work. Section III describes our approach with a dynamic trust model. The
experimental results are shown in Section IV. Finally we conclude our work in Section V.

2 Related work

There exist several works focusing on geolocation databases and assessing its accuracy.
Shavitt et al. [5] discuss the accuracy, the strength and weaknesses of geolocation
databases by grouping IP addresses into PoPs. Their evaluation shows that there exists
a strong correlation among all databases and the vast majority of location information
is accurate. Huffaker et al. [6] conduct a systematic quantitative comparison of currently
available geolocation service providers, and find that the providers generally prefer
the IP-address-to-country mappings. However, it is difficult to make a rigorous formal
comparison at the country level. Poese et al. [7] study the accuracy of geolocation
database using ground truth. They conduct a comparison of several current geolocation
databases, and find that geolocation databases often successfully geolocate IP
addresses at the country level, rather than at a city level. Siwpersad et al. [8] compare
the location estimates of databases with an active measurement method, their results
show that the geographic resolution of geolocation databases is far coarser than the
resolution provided by active measurements for individual IP addresses. Gueye et al. [9]
compare the location of blocks of IP addresses with IP address location estimated based
on active measurements. They find that the geographic span of a block of IP addresses
is itself fuzzy. Meanwhile, it is difficult to choose the location of a block. Although there
is work conducts research on geolocation databases, they mainly focus on resolving the
global IP address space. Moreover, they tend to resolve the IP address space in Europe
and the United States, but these approaches and results are not suitable for China’s
IP address space. To evaluate the reliability of the current mainstream IP geolocation
116   Reliability Evaluation Model for China’s Geolocation Databases

database of mainland China, this paper focuses on analyzing IP geolocation data based
on dynamic trust model.
There are several popular worldwide geolocation databases such as Maxmind,
IP2LOCATION, IPInfoDB and HostIP. They can be used to resolve the global IP address
space. However, the particularity of China’s Internet and the lack of ground truth
data makes the accuracy of these databases is relatively low for China’s Internet. In
mainland China, IP2Location [10], QQwry [11], IP138 [12], Sina [13] and Taobao [14],
etc. are the most widely used databases. In addition, they are all freely commercially
available databases with periodic updates. In this paper, we will analyze the reliability
of geolocation database based on China’s mainstream databases.

3 Dynamic Trust Model

3.1 Notations for Model

The reliability of a geolocation database is determined by its accuracy of evaluation.


Therefore, we propose a qualitative behavioral model to model the current and historic
assessment behavior of a geolocation database. To facilitate the model description,
we firstly present the definition of agents and associated environments.

Agent: It is the data source in the model, and represents one geolocation database in
the system. Each database ai can be used as an agent to assess the reliability of any
other database aj. The set of N agentsis referred to as A:

A = {a1, a2, …aN}

Reputation: It refers to the reputation of geolocation databases generated by the


cumulative assessment. It is expressed by:

Reputation: rij(c)∈ [0,1]

Where rij(c) represents the evaluation of agent ai to agent aj in the period c.

Event:It refers to a case that agent ai is evaluating aj’s reliability for being cooperative.
The set of events can be expressed as:

Event: e ∈E = {xij(k)} xij(k)∈{1,-1, 0}

Where xij(k) represents the kth encounter between agent ai and agent aj. The value
of xij(k) fall into three cases depending on the evaluation results of the two agents.
1) The evaluation results of agent ai and aj are identical, then the behavior is defined
as agree, and the associated value is set to 1.2) The agent ai and aj have different
evaluation results, then the behavior is defined as disagree, and its value is set to
-1.3) The evaluation result of one agent is empty, implying that there is no interaction
 Reliability Evaluation Model for China’s Geolocation Databases   117

between agent ai and aj. In this case, the assessment behavior is defined as invalid,
and its value is set to 0.

History: It represents the aggregation results of all the evaluations of agent ai and aj
within a specific context.

History: Hij(c) ={Event*}

Where Hij(c) represents a history of events that ai has encountered with aj within the
context c.

Trust: It is a subjective expectation that an agent owns the future behavior of another
agent based on the history of their encounters. It is determined by:

T=E(r (c)| H(c))

3.2 Model Rationale

The reliability of an agent can be quantified as the probability estimation of two agent
shaving the same assessment results. Given that two agents ai and aj, and assuming that
they care about each others’ actions within a specific context c. To estimate ai’s reliability
in the view of aj, we require to observe all encounters involving agents ai and aj.
Let a binary random variable xij(i) represent the ith encounter between ai and aj. The
history of encounters that ai encountered with aj within the context c can be described as:

History: Hij = { xij(1), xij(2),… xij(n)}

Where n represents the total number of encounters between ai and aj in the past. Let
p be the number of approvals of aj by ai in the n previous encounters. θ refers to the
true proportion of number of approvals for aj by ai, then the estimator for θ based on
all encounters between ai and aj is determined by:
p
θˆij = (1)
n
According to statistics, it is known that the proportion random variable can be modeled
as a Beta distribution. Assuming that each encounter’s cooperation probability is
independent of other encounters between ai and aj, the likelihood of p cooperation
and (n – p) defections can be modeled as:

(
L(H | θˆ) = θˆ p 1 - θˆ )
(n - p )
(2)

Combining the prior and the likelihood, the posterior estimate for θ̂ is calculated by:

P(θˆ | H ) = Beta(c1 + p , c2 + n - p ) (3)


Then, the first order statistical properties of the posterior are calculated as below for
the posterior estimate of θ̂ :
118   Reliability Evaluation Model for China’s Geolocation Databases

c1 + p
E(θˆ | H ) =
c1 + c2 + n (4)
Note that p(x ij (n + 1) = 1 | H )is the likelihood for x ij (n + 1) = 1 , given the estimated
parameters from n previous encounters. Substituting in the (normalized) likelihood:
c1 + p
rij (n + 1) = p(x ij (n + 1) = 1 | H ) = E(θˆ | H ) = (5)
c1 + c2 + n
When agents ai and aj meet in the first time, their estimate for each other’s reputation
is uniformly distributed across the reputation’s domain. For the Beta prior, values of
c1=1 and c2=1 yield such a uniform distribution. Therefore we set c1= c2=1.
To combine the parallel evidence about agent aj, measurements of “reliability”
are required to weight all the evidences. Let n be the total number of encounters
between agents ai and aj, then the reliability measure can be established as follows:
n

∑ rij (k ) (6)
T = rij = k=1

4 Experiment

4.1 Geolocation Database

We use five geolocation databases in this paper, IP2Location, QQwry, IP138, Sinaand
Taobao, mainly due to their popularity and their expected reliability. Moreover, they
are all free commercially available databases with periodic updates. Until April 17,
2015, the basic statistics of these databases are shown in Table 1.

Table 1. Geolocation database statistics

Database Blocks Address Province Coverage City Coverage

No. coverage No. coverage

IP2Location 2120421 3582194176 31+3 100% 598 90.5%

QQwry 2445092 4294967296 31+3 100% 426 64.4%

Taobao 1740889 3376679673 31+3 100% 328 49.6%

Sina 2063668 3066991479 31+3 100% 329 49.7%

IP138 1842027 3341584303 31+3 100% 419 63.4%

Table 1 shows the number of blocks and IPs in each database. A report shows that
totally 332,039,158 [15] IPs are allocated on China’s Internet until December 30, 2014,
which is approximately in agreement with the number of IPes in each database shown
in Table 1. This means that these five databases almost cover all IPs of Internet in
China. Among them, the number of IPs address in QQwry database is the largest, in
which there are a large number of reserved addresses.
 Reliability Evaluation Model for China’s Geolocation Databases   119

Table 1 also provides the number of provinces and cities retrieved from database
locations. From the number of provinces, we can infer that all China’s provinces are
covered, including Hong Kong, Macao. However, we notice that the coverage at the
city level is a bit lower. This is due to the ambiguous definition of a city. There exist
both regional cities and county-level cities in China. In this paper, the acquired cities
fall into 285 regional cities and 368 county-level cities, with four municipalities, two
special administrative regions and Taiwan, a total of 660 cities. If a city is defined
in the view of local administrative region, then there are 333 regional administrative
regions in China, with four municipalities, two special administrative regions and
Taiwan, a total of 340 cities [16]. The city coverage will increase substantially.

4.2 Ground Truth Datasets

Ground truth dataset is collected from the dominant Internet Service Provider (ISP)
in China. The data acquisition system is linked to ISP using bypass access method. It
collects data at the spare time. We collect the Ground truth dataset from Hebei (a province
in China) branch of China telecom. The data acquisition system starts collection from
21:11:25 on September 12, 2014 to 05:50:46 on October 13, 2014. Finally, it acquires about
65 million distinct raw records. From these records, we extract 1963018 IP addresses, in
which 1189504 IP addresses are reserved addresses (accounting for 60.6%), 773514 IP
addresses are valid (accounting for 39.4%). In the following experiments, we analyze
the accuracy of each database based on the valid addresses.

4.3 Reliability Evaluation of Geolocation Database

We apply the dynamic trust model to evaluate the reliability of the geolocation databases. Each geolocation database is treated as an independent agent, and its reliability is dynamically adjusted based on the consistency of its resolutions of the IP address blocks in the ground truth dataset. During each encounter of address resolution, geolocation database i either approves or disapproves of geolocation database j's resolution. There are multiple encounters between geolocation database i and another geolocation database j with respect to multiple IP addresses, and according to the consistency of these encounters we dynamically adjust the reliability of each geolocation database.
To assess the reliability of the five geolocation databases objectively, we obtain the address resolution data from Hurricane Electric, an Internet backbone and colocation provider, i.e., a global Internet service provider offering IPv4 and IPv6 services as well as data center and web hosting services. Resolution locations obtained from Hurricane Electric are regarded as the trust subject, and resolution locations obtained from the five geolocation databases are regarded as trust objects. If a trust object has the same resolution result as the trust subject, the behavior is defined as agreeing and the associated encounter value is set to 1. If a trust object has a different resolution result from the trust subject, the behavior is regarded as disagreeing and the encounter value is set to -1. If the resolution result of a trust object is empty, there is no encounter between the trust subject and the trust object; the assessment behavior is then defined as invalid and the corresponding value is set to 0.
To illustrate the dynamic trust model, we take 30 IP addresses as an example of the reliability evaluation process for these geolocation databases. The resolution results are quantified as binary random variables x_ij(k), as shown in Table 2.

Table 2. The quantified resolution results of 30 IP addresses

No.  IP                 IP138  IP2Location  Taobao  Sina  QQwry

1 101.84.128.56 1 1 1 1 -1
2 124.127.76.67 1 -1 1 1 1
3 219.142.154.160 1 -1 0 1 1
4 106.109.0.73 -1 1 1 1 1
5 106.124.0.207 -1 1 1 -1 1
6 106.125.0.11 1 1 -1 -1 -1
7 106.127.0.169 -1 1 1 -1 -1
8 61.187.50.6 1 -1 1 1 1
9 61.186.9.130 -1 1 1 1 1
10 106.41.192.238 1 0 1 -1 -1
11 106.45.0.101 1 -1 1 1 -1
12 112.67.163.163 -1 1 1 1 1

13 218.77.227.248 1 -1 0 1 1
14 113.25.158.210 1 -1 1 1 1
15 117.35.142.47 1 -1 0 1 1
16 117.36.211.89 -1 0 0 1 1
17 119.41.119.34 1 -1 1 1 1
18 124.127.222.40 1 -1 1 1 1
19 219.142.243.46 -1 0 1 1 1
20 124.31.191.88 -1 1 1 1 1
21 139.189.59.86 -1 1 1 -1 1
22 139.189.30.19 1 1 1 1 -1
23 140.240.17.197 -1 1 1 1 1
24 222.223.54.10 1 1 -1 1 1
25 27.157.220.14 -1 1 1 1 1
26 27.184.187.11 1 1 1 -1 1
27 36.97.1.233 1 1 -1 -1 1
28 36.96.241.80 1 1 1 -1 1
29 36.98.0.102 1 1 1 -1 1
30 61.186.11.153 1 -1 1 1 1

After the k-th interactive evaluation, we dynamically adjust the reliability of each geolocation database according to the consistency of the resolution results using formula (6), as shown in Figure 1.
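As an illustration, the minimal sketch below computes the running reliability of one database from the first ten encounter values of Table 2, assuming formula (6) is the running mean reconstructed above; the variable names and the choice of the IP138 column are illustrative only.

#include <stdio.h>

/* Sketch of formula (6): the reliability of one geolocation database is
 * taken as the running mean of its quantified encounter values
 * (1 = agree, -1 = disagree, 0 = invalid).  The values below are the
 * first ten entries of the IP138 column of Table 2. */
int main(void)
{
    int r_ip138[] = { 1, 1, 1, -1, -1, 1, -1, 1, -1, 1 };   /* r_ij(k), k = 1..10 */
    int n = sizeof(r_ip138) / sizeof(r_ip138[0]);
    double sum = 0.0;

    for (int k = 0; k < n; k++) {
        sum += r_ip138[k];
        /* running reliability estimate after the (k+1)-th encounter */
        printf("after %2d encounters: reliability = %+.3f\n", k + 1, sum / (k + 1));
    }
    return 0;
}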

Figure 1. The dynamic adjustment of the reliability of the location database

Figure 1 shows that the reliability fluctuates greatly in the initial stage, but as the number of interactions increases, the reliability tends to stabilize. It is worth noting that the reliability of the Taobao database is relatively high: it remains at around 0.85, and the maximum value reaches 0.9. Meanwhile, the reliability of IP2Location is relatively low, fluctuating in the range [0.4, 0.66]. Overall, for China's IP addresses, the reliability of Taobao, Sina and QQwry is relatively high (about 0.85), while the reliability of IP2Location and IP138 is relatively low (around 0.50). It should also be noted that the data used in this paper are IP addresses filtered for source conflicts; if the original unfiltered IP addresses were adopted as the data source, the reliability of these five geolocation databases would increase further.

5 Conclusion

This paper presents a reliability evaluation method for China's mainstream geolocation databases based on a dynamic trust model. The method treats each geolocation database as a trust agent and dynamically adjusts its reliability by observing the consistency of resolutions during the interactions. Experiments based on a ground truth dataset show that the proposed method can assess the reliability of current mainstream geolocation databases at province-level granularity. In the future, we plan to improve the reliability assessment model with indirect trust, to achieve more accurate results than the currently used direct trust method.

Acknowledgements: This work is supported by Key Lab of Information Network Security, Ministry of Public Security under Grant No. C16610.

References
[1] Katz-Bassett E, John J P, Krishnamurthy A, et al. Towards IP geolocation using delay and
topology measurements[C]//Proceedings of the 6th ACM SIGCOMM conference on Internet
measurement. ACM, 2006: 71-84.
[2] Eriksson B, Barford P, Sommers J, et al.A learning-based approach for IP geolocation[C]//
Passive and Active Measurement. Springer Berlin Heidelberg, 2010: 171-180.
[3] Dong Z, Perera R D W, Chandramouli R, et al. Network measurement based modeling and
optimization for IP geolocation[J]. Computer Networks, 2012, 56(1): 85-98.
[4] Guo C, Liu Y, Shen W, et al. Mining the web and the internet for accurate ip address
geolocations[C]//INFOCOM 2009, IEEE. IEEE, 2009: 2841-2845.
[5] Shavitt Y, Zilberman N. A geolocation databases study[J]. Selected Areas in Communications,
IEEE Journal on, 2011, 29(10): 2044-2056.
[6] Huffaker B, Fomenkov M, Claffy K. Geocompare: a comparison of public and commercial geolocation databases[J]. Proc. NMMC, 2011: 1-12.
[7] Poese I, Uhlig S, Kaafar M A, et al. IP geolocation databases: unreliable?[J]. ACM SIGCOMM Computer Communication Review, 2011, 41(2): 53-56.
[8] Siwpersad S S, Gueye B, Uhlig S. Assessing the geographic resolution of exhaustive tabulation
for geolocating internet hosts[M]//Passive and active network measurement. Springer Berlin
Heidelberg, 2008: 11-20.
[9] B. Gueye, S. Uhlig, and S. Fdida. Investigating the imprecision of ip block-based geolocation.
In PAM’07: Proceedings of the 8th international conference on Passive and active network
measurement, pages 237–240, 2007.
[10] “IP2Location”, http://www.ip2location.com/, 2015
[11] “IPcn,” http://www.ip.cn/, 2015
[12] “Taobao,” http://ip.taobao.com/, 2015
[13] Mui L, Mohtashemi M, Halberstadt A. A computational model of trust and reputation[C]//
System Sciences, 2002.HICSS.Proceedings of the 35th Annual Hawaii International Conference
on. IEEE, 2002: 2431-2439.
[14] China Internet Network Information Center, “Statistical Report on Internet Development in
China,” Jan. 2015.
[15] The latest administrative divisions at county and its code (2012), The National Bureau of
statistics of the People’s Republic of China, http://www.stats.gov.cn/tjsj/tjbz/xzqhdm/201201/
t20120105_38315.html
[16] Hurricane Electric’s BGP data, http://bgp.he.net/report, 2015
Jie TAN*, Jian-min PANG, Shuai-bing LU
Using Local Library Function in Binary Translation
Abstract: In order to improve the execution speed of a binary translation system, this paper proposes a method, called Jecket, to use locally available library function code. It uses the executable file's symbol tables and procedure linkage tables to substitute locally available code for library functions in the translation stage, thus eliminating translation overhead and greatly reducing the amount of generated code. Firstly, the method analyzes the argument lists and return values of the library functions to be localized, as well as the function names and load addresses of the library files on which the dynamically linked executable depends. Secondly, it directly translates every instruction that invokes a library function into a code block consisting of parameter parsing, a local invocation instruction and return value restoration. Finally, when the generated code executes, the local functions are called directly to complete the computation. Experiments based on QEMU and the nbench benchmarks show that, after applying the Jecket algorithm, the speedup is up to 20.9 times.

Keywords: binary translation; QEMU; library function

1 Introduction

Binary translation [1] has been widely used in software security analysis [2], software reverse engineering, system virtualization and other areas, and has become an essential technology for software migration. Dynamic binary translation is a just-in-time compilation technique [3-5] that generates the required code dynamically while running the target program, which makes code detection and location easier [6]. Several techniques improve the efficiency of dynamic binary translation, such as hot-path optimization [7], register mapping [8] and multi-thread optimization [5], but they cannot remedy the low efficiency significantly.
Because of program locality, 20% of the code can take up 80% of the execution time [9]. The quality of the code generated by binary translation is the most important factor in a binary translator's efficiency. Traditional optimization techniques focus on optimizing basic blocks from the intermediate code layer to the target instruction layer, while ignoring the code shared between different platforms [10]. In this paper, in order to take full advantage of the fact that current

*Corresponding author: Jie TAN, State Key Laboratory of Mathematical Engineering and Advanced
Computing, Zhengzhou, China, E-mail: jessie_tanjie@hotmail.com
Jian-min PANG, State Key Laboratory of Mathematical Engineering, and Advanced Computing,
Zhengzhou, China
Shuai-bing LU, National Key Laboratory of Science and Technology on Information System Security,
Beijing Institute of System Engineering, Beijing, China

executable files use dynamically linked libraries, we replace the shared dynamic-link library code with local code, thus eliminating translation overhead, reducing the amount of generated code and improving system efficiency.
Under Linux, applications make heavy use of dynamic libraries for ease of upgrade and maintenance. Because a dynamic library consists of position-independent code (PIC) that can be loaded at any address, the text segment of a dynamic library can be shared by multiple processes without relocation, which improves system performance. Localizing a dynamic library function means replacing the dynamic library function of the source machine with its counterpart on the target machine. On the one hand, because the locally optimized library functions are used directly without translating the source machine's library functions, this avoids the peculiarities of dynamic library function implementations on different systems; on the other hand, for frequently used applications it reduces the cost of translation.
The paper first presents the upper bound of code optimization using a formal method, then proposes the Jecket method to use local libraries and proves that, on translated code blocks, the method reaches this optimization upper bound in theory. Finally, it implements the algorithm in the popular dynamic binary translator QEMU and uses the nbench benchmarks to verify that the Jecket algorithm can improve execution efficiency significantly.

2 Related Work

Jens presented a formal model of dynamic binary translation [11]. He first presented a formal representation closer to real machines than the Turing machine, then gave the principle of binary translation, and finally presented a representation of the code optimization upper bound. Some important notions are as follows.
Let M(S, I, γ) be a machine, where
S denotes the set of states of the machine,
I denotes the set of machine instructions, and
γ : I × S → S is the interpretation function for machine instructions over a machine state.
Let Ms(S, I, γ) be the emulated or source machine and Mt(S, I, γ) be the host machine. Binary translation can then be expressed as searching for a map φ such that, whenever the source platform interprets an instruction i in the current state s, there is always an instruction i′ = φ(i) executed in a corresponding state s′ on the target platform. Finding the map φ is one of the core tasks in binary translation.
Since the map φ is not unique, in order to obtain an optimal form, the mapping function φ is simplified by various intermediate representation optimization methods. For the Yirr-Ma system, Jens has proven its optimal form of code optimization, including

machine state mapping and instruction mapping. Jens also illustrated that the cost can increase by 290% even under optimal conditions.
Jens focused on optimizing the mapping of machine state and the translation of instruction blocks, while ignoring simulation at the semantic layer, so the optimization cannot exploit semantic blocks at a higher semantic level. From the perspective of equivalent program transformation, this paper gives a measurable upper bound on binary translation optimization, to guide engineering applications in theory and practice.

3 Binary translation equivalence and optimal upper bound

Equivalent program transformation means transforming a program's structure, execution mode, memory mode or programming mode on the premise that the program's function remains unchanged. The purpose is to improve the program's execution efficiency, security, portability and so on. It includes transformations from low-level language to low-level language, such as binary translation and performance tuning of math libraries; transformations from high-level language to low-level language, such as typical high-level language compilers; and transformations from serial programs to parallel programs, such as parallel migration of software on high-performance platforms. Equivalence of a program transformation can only be determined with respect to the functional semantics of the program on a given system, and there is no uniform definition of equivalence at present. Binary translation is an instruction-level equivalent program transformation.

3.1 Definition of Binary translation equivalence

According to Jens's binary translation model, combined with program transformation theory, we give the equivalence definitions of binary translation at the instruction level and the semantic level.
The source platform Ms interprets the instruction i in the current state s, and correspondingly the target platform Mt executes i′ = φ(i) in a state s′.

Definition 1. Instruction-level equivalence.

When γ(s, i) = γ′(φ(s), φ(i)) = γ′(s′, φ(i)) is satisfied, the binary translation instruction mapping and state mapping are equivalent.

This definition describes the case of strong equivalence in instruction-level simulation, i.e., the strong equivalence mode of program equivalence. In some binary translators, interpreting the source platform's instructions one by one meets the requirement of instruction-level equivalence. However, this strong equivalence definition limits the level at which optimization can be applied: each instruction can only be optimized individually, and optimization across basic blocks or at the level of function semantics is not possible.

Definition 2. Semantic-level equivalence.

For any instruction sequence t = {i1, i2, i3, …}, when γ(s, t) = γ′(φ(s), φ(t)) = γ′(s′, t′) is satisfied, the binary translation instruction mapping and state mapping are equivalent.

Semantic-level equivalence requires that the target platform Mt, executing the translated instruction sequence φ(t) from the initial state, reaches the final state corresponding to the state obtained by executing t on the source platform. It does not require a corresponding intermediate state after each instruction is executed, so optimizations based on basic blocks, super blocks and function-level semantics can be applied.
The equivalence definition is similar to the mathematical concept of isomorphism and reflects the consistency in computability between the source and target platforms. Based on this definition of equivalence, we now give the upper bound of binary translation optimization.

3.2 The upper bound of binary translation optimization

Binary translation optimization simplifies the mapping function φ under the premise of program equivalence, so as to use as few target instructions as possible to represent the instructions of the source platform. Let |t| denote the number of instructions in an instruction sequence t and let ψ denote an optimization function; if |ψ(φ(t))| < |φ(t)|, then the optimization function is effective for the mapping function φ.
Theorem: Suppose that the source file B generates the executable file Es by compilation. Then the executable file Et, compiled from B for the target platform's instruction set, is the upper bound of binary translation optimization, namely |ψ(φ(Es))| ≥ |Et|.
Proof: Let δs and δt denote the compilation processes of platforms Ms and Mt respectively, with Es = δs(B) and Et = δt(B). The executable file Est generated by binary translation can then be expressed as Est = ψ(φ(Es)) = ψ(φ(δs(B))).
So the binary translation process is a transformation of the program from the source file B to an executable file for the Mt platform; similarly, δt is a transformation of B into Et, and ψ ∘ φ ∘ δs ⊂ δt.
Since the binary translation process is a special case of the compilation process, the upper bound of binary translation optimization is direct compilation, that is, |ψ(φ(Es))| ≥ |Et|.

In practice, due to the complexity of binary translation systems and the particularities of different platforms, the efficiency of binary translation is about 1/5 of that of native code, or even lower. Therefore, taking full advantage of local code can greatly improve the efficiency of binary translation.

4 The Jecket algorithm

This paper uses the Jecket algorithm to encapsulate local library functions and to simulate the source platform's parameter passing and return rules on the target platform, so that local library functions of the target platform are called at runtime. The method consists of the following three stages: library function identification, translation of the instructions that call library functions, and execution of local library function calls.

4.1 Library function identification

In order to support dynamic linking, ELF dynamic link libraries and executable programs that use dynamic link libraries have procedure linkage tables (PLT). The PLT adds one indirect memory access to a function call: the call first jumps to the PLT to obtain the entry address of the function and then jumps from the PLT to the real function entry.
A function call is implemented by a specific instruction. When a dynamic library is called, the branch target address of the call instruction is the function's PLT entry address, which is also the offset of the corresponding relocation table entry. Therefore, we traverse the relocation table to obtain the offsets of all relocation table entries, and then scan the dynamic symbol table and dynamic string table to obtain the function names.
Thus, when the binary translation system loads the binary code of the source machine, the code can be analyzed to obtain the names of the dynamic library functions and their PLT entry addresses. The functions to be replaced are then selected and a hash table indexed by PLT entry addresses is built. When translating function call instructions, the hash table is looked up using the branch target addresses of the instructions; a hit yields the name of a pre-selected function. This completes the identification of dynamic library functions.
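To make this lookup concrete, the minimal sketch below (not the paper's actual data structure; the table size, hash function and addresses are made-up assumptions) indexes pre-selected functions by their PLT entry address and queries the table with a call instruction's branch target.

#include <stdint.h>
#include <stdio.h>

#define TABLE_SIZE 64   /* illustrative size; assumed to be a power of two */

typedef struct {
    uint64_t plt_addr;   /* PLT entry address (key); 0 marks an empty slot */
    const char *name;    /* library function name */
} FuncEntry;

static FuncEntry func_table[TABLE_SIZE];

static unsigned slot_of(uint64_t addr) { return (unsigned)(addr >> 4) & (TABLE_SIZE - 1); }

/* Insert a pre-selected function, using linear probing on collisions. */
static void func_table_add(uint64_t plt_addr, const char *name)
{
    unsigned s = slot_of(plt_addr);
    while (func_table[s].plt_addr != 0)
        s = (s + 1) & (TABLE_SIZE - 1);
    func_table[s].plt_addr = plt_addr;
    func_table[s].name = name;
}

/* Look up a call instruction's branch target; NULL means the call is not
 * a pre-selected library function and is translated normally. */
static const char *func_table_find(uint64_t branch_target)
{
    unsigned s = slot_of(branch_target);
    while (func_table[s].plt_addr != 0) {
        if (func_table[s].plt_addr == branch_target)
            return func_table[s].name;
        s = (s + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}

int main(void)
{
    func_table_add(0x401030, "sin");   /* addresses are made-up examples */
    func_table_add(0x401040, "exp");
    printf("0x401040 -> %s\n", func_table_find(0x401040));
    printf("0x401050 -> %s\n", func_table_find(0x401050) ? "found" : "not found");
    return 0;
}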
According to the .sym and .plt tables of the executable file, the correspondence between the call addresses of instructions and the names of library functions can be analyzed. Figure 1 shows the algorithmic procedure for obtaining function information from the executable file's symbol tables.

function: load executable file, identify function information of dynamic link library
input: the symbol header of executable file (sym)
output: library function information list (func_array)
For each elf_sym in .sym
    extract sym's name, size, type
    if the type is STF_FUNC, the name contains libc, and the size is non-zero,
        add sym to the symbol array of library functions (func_array)
End for

Figure 1. Algorithm of extracting library function information

4.2 Instruction translation of library function call

Generally, in the translation stage, if a function calls dynamic-link library functions, we first extract all parameters according to the parameter passing rules of the source platform and the parameter lists of the library functions, then use the extracted parameters to call the library functions of the target platform. After the function returns, we obtain its return value and write it back to the corresponding registers or variables in accordance with the rules of the source platform.
Figure 2 shows the general instruction translation algorithm for calling library functions. When localizing library functions, the information of these functions needs to be obtained in advance. Since the library functions are open and their documentation is abundant, it is very convenient to obtain such information. The algorithm handles the case in which the source instruction is a function call instruction: when the target function called is in func_array, the parameter passing instructions and call instructions are generated according to the function parameters and return value recorded in func_array, and the instructions that process the function return value are generated finally.
Special library function calls require a more complex processing mechanism. When the called library function's parameters are structure pointers, the structure layouts of the source and target platforms are not exactly the same, for example the data types of fields may differ. In this case, structure space is allocated on the target platform before calling the target library function, and after the function returns, the values of the temporary structures are stored back into the positions corresponding to the source platform structures.

function: translate library function invocation instructions
input: codes of executable files (.text), library function list and current execution environment env
output: corresponding instruction block that invokes the target library function
For each insn in .text
    switch insn
        case: call insn
            if the destination address of the call instruction is in func_array
                generate the local parameter passing instructions
                generate the local library function invocation instructions
                generate the library function return value storage instructions
            break;
        case ...
        default:
            ...
            break
End for

Figure 2. Algorithm of translating function invocation instructions

4.3 Execute the local library calls

In order to simplify implementation and facilitate debugging, library functions are called during execution through the helper function mechanism of QEMU. The helper function extracts and computes the parameters and writes the results into the positions corresponding to the source platform CPUState. Figure 3 shows an example helper that calls the sin function.

void helper_sin(CPUX86State *env)
{
    double *doublep = (double *)(&env->xmm_regs[0]);
    *doublep = sin(*doublep);
}

Figure 3. A code example of a helper to call the sin library function

Because of the complexity and diversity of library functions, several issues need to be solved during library function encapsulation, such as obtaining parameters, obtaining return values, and analyzing format strings; library function encapsulation is therefore a challenge. When the source executable program calls library functions many times, localizing library functions can improve execution efficiency significantly.
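As an illustration of the encapsulation pattern of Figure 3, the following minimal, self-contained sketch wraps a two-argument math function; the GuestCPUState stand-in struct and the assumption that the two floating-point arguments and the return value travel through xmm_regs[0] and xmm_regs[1] are hypothetical, not QEMU's actual definitions or the paper's implementation.

#include <math.h>
#include <stdio.h>

/* Minimal stand-in for the guest CPU state (illustrative only). */
typedef struct {
    double xmm_regs[16];
} GuestCPUState;

/* Hypothetical helper: parse the parameters, invoke the native library
 * function, and restore the return value to the guest register. */
static void helper_pow(GuestCPUState *env)
{
    double base     = env->xmm_regs[0];            /* parameter parsing */
    double exponent = env->xmm_regs[1];
    env->xmm_regs[0] = pow(base, exponent);        /* local call + return value restoration */
}

int main(void)
{
    GuestCPUState env = { .xmm_regs = { 2.0, 10.0 } };
    helper_pow(&env);
    printf("pow(2, 10) = %.0f\n", env.xmm_regs[0]);   /* prints 1024 */
    return 0;
}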

5 Experiments

The performance of Jecket is evaluated below and compared with QEMU. The nbench suite is used to test the CPU and memory performance of qemu-1.7.2 before and after the Jecket algorithm is implemented.
Let TQEMU and TSQEMU be the execution times of an executable program under QEMU and SQEMU (QEMU with Jecket) respectively, and let S_speedup be the speedup of SQEMU versus QEMU:

S_speedup = (1/TSQEMU) / (1/TQEMU) = TQEMU / TSQEMU

5.1 Experimental environment

The purpose of the experiment is to verify the correctness and measure the efficiency of SQEMU; the nbench tests include a correctness verification module. Table 1 shows the experimental environment and Table 2 shows the test cases.

Table 1. Binary translation environments

           Ms                                             Mt
OS         Fedora 2.6.27.5-117.fc10.i686                  NeoKylin 3.8.0
CPU        Intel(R) Core(TM)2 Quad CPU Q9500 @ 2.83GHz    Domestic processor
Compiler   gcc-4.3.2                                      gcc-4.5.3

Table 2. Nbench-2.2.3 test cases

Test case          Task
NUMERIC_SORT       Sorts an array of long integers
STRING_SORT        Sorts an array of strings of arbitrary length
BITFIELD           Executes a variety of bit manipulation functions
FP EMULATION       A small software floating-point package
FOURIER            A numerical analysis routine for calculating series approximations of waveforms
ASSIGNMENT         A well-known task allocation algorithm
IDEA               A relatively new block cipher algorithm
HUFFMAN            A text and graphics compression algorithm
NEURAL NET         A small but functional back-propagation network simulator
LU DECOMPOSITION   A robust algorithm for solving linear equations

According to the statistics of the nbench test programs, the instruction count of the generated code is reduced by 27.73%. As Figure 4 shows, after using the Jecket algorithm the speedup on the nbench test programs reaches 60-70 times. NEURAL NET uses the library function exp frequently; after the Jecket encapsulation it calls the local library function directly, which eliminates a large amount of code generation and translation time and greatly improves execution efficiency. For the tests that do not use localized library functions, such as NUMERIC SORT, BITFIELD and FP EMULATION, function localization does not affect execution efficiency. The test results show that for applications containing library function calls, the Jecket algorithm yields a great speedup.

Figure 4. Speedup on nbench of Jecket to QEMU

6 Conclusion

This paper first gives a formal representation of binary translation as a form of program transformation and an upper bound of binary translation optimization, discusses solutions approaching this upper bound, and proposes the Jecket algorithm, which uses local library functions to replace the translation process and the executed code; it proves that using Jecket library functions can achieve the optimization upper bound. Finally, the paper implements the Jecket algorithm in the QEMU binary translation system and measures the acceleration on the nbench tests. The experiments show that for applications whose core areas contain library function calls, the algorithm can increase system efficiency significantly.

Acknowledgment: This work is supported by the National Natural Science Foundation of China under Grant No. 61472447, and also supported by the National High Technology Research and Development Program of China (863 Program) No. 2009AA012201.

References
[1] ALTMAN E, KAELI D, SHEFFER Y.:Welcome to the opportunities of binary translation [J]. IEEE
Computer, Vol, 33(3). pp: 40-45. 2000.
[2] SHAN Zheng, GUO Hao-ran, PANG Jian-min. BTMD: A framework of binary translation based
Malcode detector [C]// 2012 International Conference on Cyber-Enabled Distributed Computing
and Knowledge Discovery. IEEE, pp: 39-43.2012.
[3] CHERNOFF A, HOOKWAY R. DIGITAL FX!32: running 32-bit x86 applications on Alpha NT [C].
Proceedings of the USENIX Windows NT Workshop on the USENIX Windows NT Workshop.
USENIX, pp: 37-42.1997.
[4] CRISTINA C, VAN EMMERIK M. UQBT: adapTable binary translation at low cost. IEEE Computer.
Vol, 33(3). pp: 60-66.2000.
[5] LIAO Yin. Dynamic binary translation modeling and parallelization research. University of
science and technology of China, 2013.
[6] JIA Ning, YANG Chun, WANG Jing, et al. SPIRE: improving dynamic binary translation through
SPC-indexed indirect branch redirecting. ACM SIGPLAN Notices.ACM, Vol 48(7).pp1-12. 2013.
[7] JEFFERY A. Using the LLVM compiler infrastructure for optimized, asynchronous dynamic
translation in Qemu. South Australia 5005 Australia, University of Adelaide Honors Thesis.
2009.
[8] LIAO Yin, SUN Guang-zhong, JIANG Hai-tao, et al. All registers mapping method in dynamic
binary translation. Computer Applications and Software. Vol 28(11), pp21-48. 2011.
[9] HISER J D, WILLIAMS D, MARS J, et al. Evaluating indirect branch handling mechanisms in
software dynamic translation systems. In Intl. Symp. on Code Generation and Optimization,
California: San Jose. pp 61-73.2007.
[10] SUN Ting-tao, YANG Yin-dong, YANG Hong-bo, et al. Return instruction analysis and
optimization in dynamic binary translation. 4th International Conference on Frontier of
Computer Science and Technology, Shanghai, IEEE Computer Society. pp 435-440. 2009.
[11] Jens T. Specification-driven dynamic binary translation. Brisbane, Australia. Queensland
University of Technology.2004.
Kai CHENG, Fei SONG, Shiyin QIN*
Type Recognition of Small Size Aircrafts in Remote
Sensing Images based on Weight Optimization
of Feature Fusion and Voting Decision of Multiple
Classifiers
Abstract: Automatic small size aircraft recognition has many important applications
in both civil and military fields, such as monitoring air transportation and analyzing
enemy deployment. In this paper, a type recognition method of small size aircrafts in
remote sensing images is presented based on weight optimization of feature fusion
and voting decision of multiple classifiers. According to the characteristic analysis of
small size aircrafts, the active contour model and PCA method are employed to extract
contours of various aircraft targets. Moreover, a comparative analysis of multiple
invariant moments for every aircraft is carried out through effective evaluation metric
of moments so as to select stable features of moments for further weighted fusion
and achieve a satisfactory weighted fusion result through weight optimization. Finally, a
synthetic recognition scheme is designed and implemented by voting of recognition
results from multiple classifiers. Experiment results demonstrate the rapidity and
validity of our proposed type recognition method.

Keywords: type recognition of small size aircrafts, remote sensing image, weight
optimization, voting decision

1 Introduction

Automatic target recognition in remote sensing images has been a hot topic in the field
of computer vision and pattern recognition in recent years. Aircraft type recognition
is an important application of target recognition and has broad prospects in civil and
military fields [1]. The goal of aircraft type recognition is to identify target types in
images which contain detected aircrafts. With the rapid development of the earth
observation technology, lots of high resolution remote sensing satellites were put
into commercial operation successively. Along with the explosive growth of remote
sensing image data, aircraft type recognition relying solely on visual interpretation

*Corresponding author: Shiyin QIN, School of Automation Science and Electrical Engineering, Beihang
University, Beijing, China, 100191, E-mail: qsy@buaa.edu.cn
Kai CHENG, School of Automation Science and Electrical Engineering, Beihang University, Beijing,
China, 100191
Fei SONG, Chinese Institute of Electronics, Beijing, China, 100036

has been unable to meet demand [2]. How to recognize the aircraft type quickly and
accurately in the massive remote sensing image data is an urgent problem to be solved
nowadays [3-6].
Compared with large size targets, small size aircrafts usually suffer from blurred shape and limited feature information, which makes type recognition face many more difficulties and challenges. Therefore, the recognition accuracy and reliability for small size targets are typically lower than those for large size aircrafts.
According to these difficulties, many scholars have put forward methods to solve
this problem. Zhu and Ma applied an adaptive weight multi-classifier fusion method
to improve recognition accuracy [7]. Rough set theory and directed acyclic graph
support vector machines (DAGSVM) method were also used to select invariants
with better performance and speed up the recognition process [8]. Dimensionality
reduction with principle component analysis (PCA) was applied to extract and select
the feature subset to achieve a higher recognition rate [9]. A hierarchical recognition
scheme by incorporating suitable weights into features and classifying aircrafts at
different levels was proposed to solve the problem of lack of feature information [10].
Most of these methods are based on the perspective of feature selection or classifier
combination, which have been proved to be useful for type recognition tasks.
This paper studies the type recognition of small size aircrafts. By analyzing
characteristics of small size aircrafts, we propose a novel recognition approach using
weight optimization feature fusion and multiple classifiers combination method. The
advantage of our approach lies in that it could make full use of the feature information
to deal with blurred shape and background disturbance. Experiment results verify the
feasibility and validity of our proposed type recognition method.

2 Analysis of size characteristics for different types of aircrafts

In the process of remote sensing imaging, the pixel sizes of aircrafts change with the variation of focal distance. As the target occupies fewer pixels, the shape, texture and contour of the aircraft become less apparent, which has a significant impact on the type recognition results.
In practice, large size aircrafts provide more discriminative information and preserve clear details, such as texture, which makes them easier to recognize than small size targets. With the decrease in size, detailed information about the target is easily lost and some kinds of features are no longer suitable for the recognition task. From Figure 1 we can see that for small size aircrafts only an approximate contour can be obtained for type recognition. By analyzing experimental results, we find that it is difficult to extract clear and accurate contour information when the target occupies fewer than 1500 pixels. Therefore, aircrafts containing fewer than 1500 pixels are defined as small size targets, and the rest as large size targets. Aircrafts of different sizes are shown in Figure 1.

Figure 1. Comparative demonstration of aircraft with different size

Despite their common small size, aircrafts of different types present distinct shape characteristics owing to different sizes of fuselage, wingspan and so on. Table 1 shows the statistical shape characteristics of different small size aircrafts. It is apparent that the ratio of fuselage to wingspan is close to one for most aircrafts except the B2 type, and that the F22 type is the smallest. Analysis of the size characteristics of different aircraft types provides additional information for the subsequent feature selection step and improves the type recognition accuracy.

Table 1. Shape characteristics of small size aircrafts

Aircraft type   Average fuselage pixels   Average wingspan pixels   Ratio of fuselage to wingspan   Average contour pixels

C17 32 30 1.067 145


C130 26 29 0.896 108
F22 24 21 1.143 84
F111 35 33 1.061 161
B2 20 32 0.594 96
B52 31 36 0.861 112
E3 28 27 1.037 101

Due to the effect of illumination, the intensity of remote sensing images may change dramatically, which means color features are not suitable for recognition in such situations. Recognition features for aircrafts can generally be divided into two categories: contour features and shape features. A contour feature is more in accordance with human visual cognitive habits than other features, but it requires a high quality contour of the extracted target; moreover, a stable contour feature cannot be constructed when the target is too small. From Table 1 we can see that the number of contour pixels is too small to represent such features. Therefore, contour features are more suitable for large size aircrafts, with their clear edges and long contours. A shape feature describes the local properties of the target without the need for clear lines or contours, which makes it the most applicable feature for type recognition of small size aircrafts.
To verify these analysis results, we compare in Table 2 the description ability of shape features and contour features on aircrafts of different sizes, evaluated by accuracy rate. We select the shape context [11] as the contour feature and the invariant moments of Section 4 as the shape features. It can be seen that the contour feature represents large size targets better than the shape feature but cannot describe small size targets precisely. Hence, in this paper, shape features are used for type recognition, and pertinent methods are put forward according to the size characteristics of aircrafts in the following sections.

Table 2. Comparison of feature description capacity (accuracy rate)

Size                   Shape feature   Contour feature
Small size aircrafts   91.43%          31.43%
Large size aircrafts   77.14%          85.71%

3 Segmentation and orientation normalization of small size aircrafts

3.1 Aircraft segmentation using active contour model

Segmentation results often suffer from image blurring and poor contrast, especially for small target segmentation. It is therefore necessary to employ an appropriate preprocessing algorithm to reduce interference and enhance the robustness and adaptability of the segmentation results.
Aircrafts are usually parked on aprons, most of which are uneven and contain many distractors. In this paper, the bilateral filter is applied to smooth the aircraft images. Compared with the traditional Gaussian filter, it preserves the target edge information while filtering the background [12]. Figure 2b shows the results of the bilateral filter. Because of the influence of illumination, the intensity of pixels on the aircraft is sometimes the same as that of the background. By introducing a logarithmic transformation function, the nonlinear histogram transformation algorithm proposed in [13] is adopted to highlight aircrafts, which has the effect of contrast enhancement. Figure 2c shows the results of the nonlinear histogram transformation.
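To illustrate the idea of log-based contrast enhancement, the sketch below applies a plain logarithmic mapping to 8-bit grayscale values; this is only an assumed simplification of the idea, not necessarily the exact nonlinear histogram transformation of [13].

#include <math.h>
#include <stdio.h>

/* Log mapping for an 8-bit image: dark pixels are stretched while bright
 * apron pixels are compressed, enhancing contrast around the target. */
static unsigned char log_enhance(unsigned char in)
{
    double c = 255.0 / log(256.0);              /* scale so that 255 maps to 255 */
    return (unsigned char)(c * log(1.0 + in) + 0.5);
}

int main(void)
{
    unsigned char samples[] = { 0, 16, 64, 128, 255 };
    for (int i = 0; i < 5; i++)
        printf("%3u -> %3u\n", samples[i], log_enhance(samples[i]));
    return 0;
}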
From Figure 2 we can see that the aircraft in the preprocessed images has a clearer outline than in the original images, and the segmentation of a small size aircraft can be regarded as a single-class segmentation problem. According to the characteristics of the preprocessed images, the CV (Chan-Vese) active contour model is used in this paper for aircraft target segmentation [14]. The basic idea of the active contour model is to use a continuous curve to represent the edge of the target. The CV active contour model makes use of the global information of the internal and external regions of the contour to evolve the curve [14].

Figure 2. Results of bilateral filter and nonlinear histogram transformation

The zero level set function is used to represent the closed contour curve by introducing the Heaviside function H and the Dirac function δ:

H(φ) = 1 if φ ≥ 0, and H(φ) = 0 if φ < 0;   δ = dH/dφ        (1)
where φ is the level set function. Thus the energy function E can be expressed as:
E(C) = λ1 ∫Ω |f(x,y) − c1|² H(φ(x,y)) dx dy
       + λ2 ∫Ω |f(x,y) − c2|² (1 − H(φ(x,y))) dx dy
       + μ ∫Ω δ(φ(x,y)) |∇φ(x,y)| dx dy        (2)

where Ω is the image domain; f(x,y) is the pixel value at (x,y); λ1 and λ2 are weight coefficients; and c1 and c2 represent the average intensities of the internal and external regions of the contour respectively. The evolution equation of the level set function φ can be obtained via the variational method:

∂φ/∂t = δ(φ) [ μ div(∇φ/|∇φ|) − λ1 |f − c1|² + λ2 |f − c2|² ]        (3)

An initial contour circle with a radius of 1/3 of the image width is first placed in the vicinity of the image center, and the level set function is initialized as a signed distance function. Algorithm 1 gives the pseudo-code of the CV active contour model for segmentation of small size aircrafts.

Algorithm 1: CV Active Contour Model for Segmentation

Input: image Img, number of iterations iterNum, weight coefficients λ1, λ2, μ
Output: contour segmentation image
Initialization: length and width of the preprocessed image l, w
    set Row = l/2, Column = w/2, Radius = min(l, w)/3
    for i = 1 to l
        for j = 1 to w
            φ(i, j) = Radius − sqrt((i − Row)² + (j − Column)²)
        end
    end
for i = 1 to iterNum
    calculate the Heaviside function H by Eq. (1)
    c1 = sum(H .* Img) / sum(H)
    c2 = sum((1 − H) .* Img) / sum(1 − H)
    update the level set function by Eq. (3)
end
obtain the result by extracting the contour line corresponding to the zero level set

Figure 3 shows segmentation results using the CV active contour model. Targets whose boundaries are not necessarily defined by gradient, or that have very smooth boundaries, can be detected, which is suitable for segmentation of small size aircrafts. It is obvious that small size aircrafts can be segmented with high quality despite blurred edges and background disturbance. Further experimental results will show the robustness and adaptability of the CV active contour segmentation method.

Figure 3. Results of CV active contour model segmentation



3.2 Orientation normalization based on PCA method

In practice, aircrafts have different orientations in remote sensing images, which degrades the discriminative ability of the extracted features. In order to recognize aircrafts more accurately, it is necessary to normalize all aircrafts to a fixed direction, i.e. north. In this paper, orientation normalization based on principal component analysis (PCA) is proposed to find the principal axis of the aircraft. The image coordinates are treated as the data to be dimension-reduced, and the flow of orientation normalization is as follows.
1) Form a matrix by rows using aircraft contour pixel positions. With N contour pixels,
an N-by-2 matrix could be obtained as data samples.
2) Calculate the covariance matrix of data samples. The dimension of the covariance
matrix is 2×2, which contains the information of the contour pixel position of
aircraft.
3) Calculate eigenvalues and eigenvectors of covariance matrix. The meaning of
eigenvectors is the mapping relationship between the original coordinate axis
and new one. The eigenvalues represent the variance of data points in the new
coordinate axis.
4) The coordinates under new axis are mapped directly by multiplying the original
coordinates and the corresponding eigenvectors. And by analyzing the mean value
of the coordinate points, we could shift aircraft to the image center and find the
principal axis.

As aircrafts are generally symmetrical about their central axis, the input data have the largest variance along this axis or the perpendicular one, so the principal axis can be obtained through decomposition of the covariance matrix, as illustrated by the sketch below.
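The following minimal sketch (with toy contour data; not the paper's implementation) builds the 2×2 covariance matrix of contour pixel coordinates and uses the closed-form orientation of the dominant eigenvector of a symmetric 2×2 matrix.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Toy "contour" roughly stretched along a 30-degree direction. */
    double x[] = { 0, 2, 4, 6, 8, 1, 3, 5, 7 };
    double y[] = { 0, 1.2, 2.3, 3.5, 4.6, 0.5, 1.8, 2.9, 4.1 };
    int n = sizeof(x) / sizeof(x[0]);

    double mx = 0, my = 0;
    for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
    mx /= n; my /= n;

    /* Entries of the 2x2 covariance matrix of the contour coordinates. */
    double sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sxx += (x[i] - mx) * (x[i] - mx);
        syy += (y[i] - my) * (y[i] - my);
        sxy += (x[i] - mx) * (y[i] - my);
    }
    sxx /= n; syy /= n; sxy /= n;

    /* Angle of the principal axis with respect to the x axis. */
    double theta = 0.5 * atan2(2.0 * sxy, sxx - syy);
    printf("principal axis angle: %.1f degrees\n",
           theta * (180.0 / 3.14159265358979323846));

    /* Rotating every contour point by -theta aligns the principal axis with
     * the x axis; a further fixed rotation maps it to "north". */
    return 0;
}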
The results after orientation normalization are shown in Figure 4. We align each aircraft upright along its principal axis for type recognition. It can be seen that all of the orientation-normalized targets basically point north, which demonstrates the robustness of our method.

Figure 4. Results of PCA orientation normalization



4 Feature extraction, selection and weighted fusion

4.1 Feature extraction and selection

Feature extraction is a key step in aircraft type recognition. As mentioned in Section 2, shape features are the most applicable features for type recognition of small size aircrafts; they can be classified into two types, invariant moments and statistical features of aircraft targets [10]. The latter are designed and extracted based on properties of aircrafts such as symmetry and compactness. Although they have the advantages of reasonable geometric meaning and low computational cost, these features are easily affected by blurred shape and image interference and are therefore not suitable for small size targets.
Invariant moments have been proved simple and practical in target recognition, and there are many kinds of them. The Hu moments, proposed in 1962, were the first invariant moments [15]; they are invariant to scale, rotation and translation, but are sensitive to noise. Zernike moments are orthogonal, and arbitrarily high-order moments can be constructed to include more detailed information, but they do not have the affine invariance property [16]. The MSA moment is affine invariant, but its ability to describe target information is not as good as the others [17].
The three kinds of invariants above each have strengths and weaknesses. The more stable the invariant moments are, the more accurate the results will be; if we select the invariant moments with high stability, the fused feature will have better discriminative ability. Therefore, in this paper, the max relative change rate of the invariants under four different image transformation conditions is used to evaluate the stability of the features. The four image transformation conditions are the original images, half-size images, 90-degree clockwise rotated images, and combined transformed images containing scale, translation and rotation transformations. The max relative change rate R is defined as:

R = max_{i=2,3,4} |F_i − F_1| / F_1        (4)
where F_1 denotes the feature value calculated on the original images and F_2, F_3, F_4 denote the results under the last three conditions respectively. R represents the stability of an invariant under the transformations.
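As a worked example, the sketch below evaluates Eq. (4) for one feature using the H2 values listed in Table 3 below; the function and variable names are illustrative.

#include <math.h>
#include <stdio.h>

/* F[0] is the feature value on the original image, F[1..3] the values
 * under the three transformed conditions. */
static double max_relative_change_rate(const double F[4])
{
    double r = 0.0;
    for (int i = 1; i < 4; i++) {
        double rate = fabs(F[i] - F[0]) / F[0];
        if (rate > r)
            r = rate;
    }
    return r;
}

int main(void)
{
    double h2[4] = { 1.076e-06, 1.076e-06, 1.084e-06, 1.090e-06 };   /* H2 row of Table 3 */
    printf("R(H2) = %.2f%%\n", 100.0 * max_relative_change_rate(h2));  /* about 1.30 percent */
    return 0;
}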
Tables 3, 4 and 5 show the analysis results for the Hu moments, Zernike moments and MSA moment respectively. Columns H1 to H7 in Table 3 represent the seven Hu moments. Columns A1 to A6 in Table 4 represent six Zernike features from the 4th order to the 24th order with an order increment of 4. Row M1 in Table 5 represents the MSA feature. The data in each table are the mean values of the invariant moments under the different conditions.
According to Tables 3, 4 and 5, we can conclude that the first five Hu moments, the first four Zernike moments and the MSA moment are much more stable than the others, among which the maximum value of the max relative change rate is 8.45%. Therefore, we select the above 10-dimensional invariants as the type recognition features for small size aircrafts.

Table 3. Consensus analysis of Hu moments in different image conditions

Image source                H1        H2          H3          H4          H5          H6          H7

Original 0.0013 1.076e-06 1.209e-14 1.050e-14 9.063e-29 9.746e-18 7.623e-29

Half the size 0.0013 1.076e-06 1.213e-14 1.045e-14 8.982e-29 9.725e-18 7.605e-29

Rotate 90° 0.0013 1.084e-06 1.213e-14 1.045e-14 8.982e-29 9.725e-18 7.605e-29

Combined transformation 0.0013 1.090e-06 1.307e-14 9.863e-15 8.406e-29 6.737e-18 6.171e-29

Max relative change rate 0.14% 1.30% 8.11% 6.07% 7.25% 30.87% 19.05%

Table 4. Consensus analysis of Zernike moments in different image conditions

Image source                A1        A2        A3        A4        A5        A6

Original 0.0071 0.1236 0.0415 0.0249 0.0164 0.0066


Half the size 0.0075 0.1245 0.0430 0.0265 0.0189 0.0088
Rotate 90° 0.0075 0.1245 0.0430 0.0265 0.0189 0.0088
Combined transformation 0.0077 0.1167 0.0381 0.0281 0.0215 0.0121
Max relative change rate 8.45% 5.58% 8.19% 12.85% 31.10% 83.33%

Table 5. Consensus analysis of MSA moment in different image conditions

Feature   Original   Half the size   Rotate 90°   Combined transformation   Max relative change rate

M1 0.8012 0.8004 0.8007 0.7783 2.86%

4.2 Weighted feature fusion with weight optimization

Small size aircrafts usually suffer from blurred shape and limited feature information, so a single kind of feature cannot describe the target information comprehensively and obviously cannot meet the requirement on recognition accuracy. To make full use of the information of multiple features, a new weighted feature fusion method with weight optimization is proposed in this paper. By quantifying the recognition ability of different features and optimizing the weights, our method is better suited to type recognition of small size aircrafts and improves the accuracy rate significantly.

The determination of the weight coefficients is the key problem of the weighted fusion method. The traditional approach is to set fixed weights according to each feature's recognition performance on the training set, which is simple to implement but is not effective in complex conditions and has poor recognition stability. Therefore, adjusting the weights by a dynamic optimization strategy is an effective way to improve recognition accuracy.

Algorithm 2: Weight Optimization towards Feature Fusion Method

Input: training set, dimension of features feaDim, number of iterations iterNum
Output: optimized weights w
for i = 1 to feaDim
    w_i^(0) = 1 / feaDim
end
make feature fusion with initial weights w^(0) and get recognition rate q^(0)
for i = 1 to iterNum
    select 2/3 of the total examples randomly to construct sub training set Φ
    for j = 1 to feaDim
        extract the j-th feature f_j on Φ and get recognition rate R_j
    end
    determine the indices of the three largest recognition rates, l1, l2 and l3
    w_{l1}^(i) = w_{l1}^(i-1) + 0.05
    w_{l2}^(i) = w_{l2}^(i-1) + 0.03
    w_{l3}^(i) = w_{l3}^(i-1) + 0.01
    normalize all kinds of features to the range [-1, 1]
    normalize the fusion weights by w_j^(i) = w_j^(i) / (Σ_j w_j^(i)) for all j
    make feature fusion with the i-th weights w^(i) and get recognition rate q^(i)
end
select the weights corresponding to the highest recognition rate among the q^(i) as w

Our weight optimization algorithm is shown in Algorithm 2. Each feature is assigned an initial weight, which is then updated iteratively to promote adaptability. Weight optimization is applied on a random training subset, which takes advantage of the idea of the resampling method. Because of the randomness of the training samples in each iteration, the optimized weights have better adaptability and make sufficient use of the different features. Further experiments also verify the validity and effectiveness of our weight optimization fusion method.
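The minimal sketch below illustrates a single weight-update step of Algorithm 2 (increase the weights of the three best-ranked features and renormalize); the feature indices and the choice of which features rank best are illustrative assumptions.

#include <stdio.h>

#define FEA_DIM 10

/* One update step: raise the weights of the three best features by
 * 0.05, 0.03 and 0.01, then renormalize all weights to sum to one. */
static void update_weights(double w[FEA_DIM], int l1, int l2, int l3)
{
    w[l1] += 0.05;
    w[l2] += 0.03;
    w[l3] += 0.01;

    double sum = 0.0;
    for (int j = 0; j < FEA_DIM; j++) sum += w[j];
    for (int j = 0; j < FEA_DIM; j++) w[j] /= sum;   /* normalize fusion weights */
}

int main(void)
{
    double w[FEA_DIM];
    for (int j = 0; j < FEA_DIM; j++) w[j] = 1.0 / FEA_DIM;   /* initial equal weights */

    update_weights(w, 2, 5, 0);   /* pretend features 2, 5 and 0 ranked best this iteration */
    for (int j = 0; j < FEA_DIM; j++) printf("w[%d] = %.4f\n", j, w[j]);
    return 0;
}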

5 Type recognition approach to small size aircrafts

5.1 Analysis and design of three sorts of elementary classifiers

Because of the inherent shortcomings of individual classifiers as well as the limited feature information of small size aircrafts, the type recognition result will be unstable if only a single classifier is used. To improve recognition stability and accuracy, we utilize a combination method to synthesize the results of elementary classifiers.

Different from the general two-category classification problem, the type recognition of aircrafts can be regarded as a multi-classification task. At present, most commonly used multi-classification algorithms belong to statistical learning methods, such as softmax regression, artificial neural networks (ANN), support vector machines (SVM) and so on [9]. In this section, these classifiers are analyzed and designed for the type recognition of small size aircrafts.
Softmax regression is a classification method that generalizes logistic regression to multiclass problems. Similar to logistic regression, the loss function of softmax regression is the logarithmic loss, and it is minimized by the gradient descent algorithm [18]. We apply this method in type recognition to predict the probabilities of aircraft types, given the fused features of Section 4 as input. The model not only keeps high classification accuracy but also has low time complexity, which makes it suitable for type recognition of aircraft targets.
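The sketch below shows only the softmax step that converts per-class scores into probabilities for the seven aircraft types; the scores are illustrative, since the trained model parameters are not given in the paper.

#include <math.h>
#include <stdio.h>

#define NUM_TYPES 7

/* Convert linear class scores into a probability distribution. */
static void softmax(const double scores[NUM_TYPES], double probs[NUM_TYPES])
{
    double max = scores[0];
    for (int c = 1; c < NUM_TYPES; c++)
        if (scores[c] > max) max = scores[c];

    double sum = 0.0;
    for (int c = 0; c < NUM_TYPES; c++) {
        probs[c] = exp(scores[c] - max);   /* subtract max for numerical stability */
        sum += probs[c];
    }
    for (int c = 0; c < NUM_TYPES; c++)
        probs[c] /= sum;
}

int main(void)
{
    double scores[NUM_TYPES] = { 0.3, 2.1, -0.5, 0.8, 1.4, -1.0, 0.0 };
    double probs[NUM_TYPES];
    softmax(scores, probs);
    for (int c = 0; c < NUM_TYPES; c++)
        printf("class %d: %.3f\n", c, probs[c]);
    return 0;
}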
An artificial neural network is a mathematical model of information processing. With enough training data, it can approximate arbitrary classification functions, satisfying high accuracy requirements. When applied to the type recognition task, the numbers of input and output nodes equal the dimension of the features and the number of aircraft types respectively. The number of hidden layers and their nodes therefore have a great influence on the recognition results. Experimental results are shown in Table 6.
It can be seen that the performance of a neural network with two hidden layers is always better than that with one hidden layer. A model containing three or more hidden layers cannot improve the recognition accuracy further, while it increases the training time and may cause over-fitting. According to the results in Table 6, we design an artificial neural network classifier with two hidden layers; the numbers of nodes in each layer from input to output are 10, 6, 8 and 7 respectively.

Table 6. Result comparison of different neural network structures

Number of hidden layers   Number of nodes in first hidden layer   Number of nodes in second hidden layer   Accuracy rate

1 8 / 65.71%

1 12 / 68.57%

2 4 4 71.42%

2 6 8 75.71%

2 6 10 72.85%

A support vector machine solves small-sample learning problems effectively by using structural risk in place of empirical risk, which is suitable for type recognition of aircrafts. There are many ways to construct a multi-class SVM, among which the directed acyclic graph (DAG) has been proved to be the most efficient method for multi-classification; it combines the advantages of other methods and has good generalization ability [19]. Due to the small dimension of the recognition features, a linear kernel is employed in the DAG-SVM method. As a result, the speed of type recognition is greatly improved without reducing the accuracy.

5.2 Synthetic decision through voting of elementary classifiers

A combination of elementary classifiers can be organized in two ways: cascade mode and parallel mode. The latter is widely used because it can prevent misclassification caused by the error of a single classifier. In this paper, each classifier is trained and used to recognize aircraft types independently, and all classifiers are combined to obtain the final recognition results. Since all of the classifiers use the same fusion features as input, a weighted fusion of their outputs adds little and its optimal weights would be difficult to determine. Therefore, the elementary classifiers are combined by a majority voting decision.

Table 7. Results of accuracy rate of different classifiers

Classifier | Number of all targets | Number of correct recognitions | Accuracy rate
Softmax regression | 70 | 57 | 81.43%
ANN | 70 | 52 | 75.71%
SVM | 70 | 61 | 87.14%

Table 7 shows the accuracy rates of the different classifiers. Since the SVM method achieves the highest recognition accuracy, we take the recognition result of the SVM classifier in the case of a tie vote. Therefore, the synthetic combination decision is as follows:
1) If the results of two or more classifiers are identical, they are regarded as the final recognition result.
2) If the results of the three classifiers are all different, namely in the case of an equality of votes, the result of the SVM classifier is taken as the final recognition result.
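The decision rule above can be written down directly; the sketch below is a minimal implementation in which the function and label names are illustrative.

from collections import Counter

def synthetic_decision(pred_softmax, pred_ann, pred_svm):
    """Majority vote over the three elementary classifiers; the SVM result
    breaks a three-way tie because its single-classifier accuracy is highest."""
    votes = Counter([pred_softmax, pred_ann, pred_svm])
    label, count = votes.most_common(1)[0]
    return label if count >= 2 else pred_svm

# Example: softmax -> "C-17", ANN -> "C-130", SVM -> "B-52"  =>  "B-52"
print(synthetic_decision("C-17", "C-130", "B-52"))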

Compared with a weighted fusion method, a synthetic combination with a majority voting decision reduces complexity and improves the stability of recognition at the same time. Further experiments also demonstrate the validity of the synthetic combination method in improving the accuracy rate. The comprehensive scheme of the proposed type recognition method is shown in Figure 5.

Table 8. Examples of seven types of aircraft target (image samples of the C-17, C-130, F-22, F-111, B-2, B-52 and E-3 types; four samples per type)

Figure 5. Comprehensive scheme for type recognition

6 Comprehensive experiments and comparative analysis

In order to evaluate the proposed method, a database of seven types of military aircraft images was collected from Google Earth, including C-17, C-130, F-22, F-111, B-2, B-52, and E-3.
The database contains 210 aircraft images, namely 30 images for each type. Each aircraft occupies fewer than 1500 pixels, which corresponds to small size aircrafts. For each type, 20 images are selected as training samples and the remaining 10 images as test samples. Examples of these aircrafts are shown in Table 8; each row includes four samples of that type.
The process of the experiments is as follows:
Firstly, the original images are preprocessed by image graying, bilateral filtering and nonlinear histogram transformation. In the bilateral filter, the parameters are set so that similar pixel values have a much greater impact on the filtering results than similar spatial distances, and the parameter of the nonlinear histogram transformation is set to make the output histogram tend toward a uniform distribution. The comparison of accuracy rates with and without the preprocessing method is shown in Table 9, which verifies the effectiveness of the preprocessing.

Table 9. Comparison of accuracy rate with and without the preprocessing method

Images | Number of all targets | Number of correct recognitions | Accuracy rate
Original images | 70 | 47 | 67.14%
Preprocessed images | 70 | 63 | 91.43%
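A hedged OpenCV sketch of the three preprocessing steps (graying, bilateral filtering, histogram transformation); the filter parameters are illustrative, and histogram equalization stands in for the nonlinear transformation of [13].

import cv2
import numpy as np

def preprocess(img_bgr):
    # Image graying
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    # Bilateral filter: a small color sigma and a larger space sigma make
    # pixel-value similarity dominate over spatial distance (illustrative values)
    smoothed = cv2.bilateralFilter(gray, 9, 25, 75)
    # Stand-in for the nonlinear histogram transformation of [13]:
    # equalization pushes the output histogram toward a uniform distribution
    return cv2.equalizeHist(smoothed)

# Demo on a synthetic image
result = preprocess(np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8))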

Secondly, we apply the CV active contour model, described in detail in section 3, to segmentation; the iteration number is set to 1000 to minimize the energy function. Then the orientation normalization method using principal component analysis is employed to find the principal axis. We compare the CV active contour model with the Canny method and the structured forests method proposed in [20], which is one of the state-of-the-art contour extraction methods. It is apparent that the other two methods produce a lot of non-aircraft disturbance, while the segmentation based on the CV active contour model obtains the best results in Figure 6.
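The orientation normalization step can be sketched as follows: PCA on the segmented contour points gives the principal axis, and the points are rotated so that this axis becomes horizontal. This is a minimal sketch under those assumptions, not the authors' exact code.

import numpy as np

def normalize_orientation(contour_points):
    """Rotate contour points (N x 2) so the PCA principal axis becomes horizontal."""
    pts = contour_points - contour_points.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(pts.T))     # 2x2 covariance matrix
    principal = eigvecs[:, np.argmax(eigvals)]           # principal axis direction
    angle = np.arctan2(principal[1], principal[0])
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s], [s, c]])                    # rotate by -angle
    return pts @ rot.T

# Demo: points roughly along a slanted line are aligned with the x axis
print(normalize_orientation(np.array([[0., 0.], [2., 1.], [4., 2.], [6., 3.]])))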
Thirdly, invariant moments are extracted from the training sets, including Hu, Zernike and MSA moments. Then the weight optimization method stated in section 4 is applied to obtain the optimized fusion weights. We compare the accuracy rates of each single feature, the directly combined feature (with the weights of all features set to the same value) and the proposed weighted fusion feature. The results are shown in Table 10. It can be seen that the directly combined feature is not better than every single feature, but the accuracy rate of the weighted fusion feature is the highest among all kinds of features.

Figure 6. Comparison of results of different segmentation methods



Table 10. Recognition accuracy of different features

Feature | Number of all targets | Number of correct recognitions | Accuracy rate
Hu moment | 70 | 46 | 65.71%
Zernike moment | 70 | 37 | 52.85%
MSA moment | 70 | 17 | 24.29%
Direct combined | 70 | 56 | 80%
Weighted combined | 70 | 63 | 91.43%

Finally, the three kinds of classifiers are trained and the small size aircrafts are recognized based on the majority voting decision described in section 5. Table 11 shows the accuracy rate and average computation time of our method. It can be seen that the proposed method achieves high performance with a short computation time, and the accuracy rate of the synthetic combination method is higher than that of any single classifier, which reflects the rapidity and validity of our method.

Table 11. Statistical results of our proposed type recognition method

Number of all targets | Number of correct recognitions | Accuracy rate | Computation time (s)
70 | 63 | 91.43% | 3.124058

7 Conclusion

As aircraft are important strategic targets, the automatic recognition of small size aircraft targets matters in both civil and military applications. To address this problem, this paper provides a novel approach based on a synthetic combination of multiple classifiers. The contours of aircraft targets are first extracted by the active contour model and the PCA method. Then stable moment features are selected for weighted fusion through the weight optimization method. Finally, a synthetic recognition scheme is designed and implemented by voting on the recognition results of three sorts of elementary classifiers. Experimental results demonstrate that the proposed type recognition method achieves satisfactory performance in both recognition rate and recognition speed. In future work, further research will explore feature fusion and classifier combination methods to improve recognition accuracy and efficiency.

Acknowledgment: This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 61273350 and U1435220) and the Beijing Science and Technology Project of China (Grant No. D16110400130000-D161100001316001).

References
[1] Ge L, Xian S, Kun F, et al. Aircraft Recognition in High-Resolution Satellite Images Using Coarse-to-Fine Shape Prior [J]. Geoscience and Remote Sensing Letters, IEEE, 2013, 10: 573-7.
[2] Li H, Jin X, Yang N, et al. The recognition of landed aircrafts based on PCNN model and affine moment invariants [J]. Pattern Recognition Letters, 2015, 51: 23-9.
[3] Xing C, Li Y, Zhang K. Aircraft recognition based on convex-concave analysis [C].Proceedings of the Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on, 2012: 1391-5.
[4] Wang L, Xing C, Yan J. Aircraft recognition based on nonparametrical statistics [C].Proceedings of the Natural Computation (ICNC), 2011 Seventh International Conference on, 2011: 1602-6.
[5] Xu C, Duan H. Artificial bee colony (ABC) optimized edge potential function (EPF) approach to target recognition for low-altitude aircraft [J]. Pattern Recognition Letters, 2010, 31: 1759-72.
[6] Haibin D, Lu G. Elitist Chemical Reaction Optimization for Contour-Based Target Recognition in Aerial Images [J]. Geoscience and Remote Sensing, IEEE Transactions on, 2015, 53: 2845-59.
[7] Zhu X, Ma B, Guo G. An adaptive-weight regularization method for multi-classifier fusion
decision [C].Proceedings of the Mechatronics and Control (ICMC), 2014 International Conference
on, 2014: 343-6.
[8] Lei H, Ying-jun M, Lei G. A Rough Set-Based SVM Classifier for ATR on the Basis of Invariant
Moment [C].Proceedings of the Communications and Mobile Computing, 2009 CMC ‘09 WRI
International Conference on, 2009: 620-5.
[9] Donghe W, Xin H, Wei Z, et al. A method of aircraft image target recognition based on modified
PCA features and SVM [C].Proceedings of the Electronic Measurement & Instruments, 2009
ICEMI ‘09 9th International Conference on, 2009: 4-177-4-81.
[10] Hsieh J W, Chen J M, Chuang C H, et al. Aircraft type recognition in satellite images [J]. Vision, Image and Signal Processing, IEE Proceedings -, 2005, 152: 307-15.
[11] Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts [J].
Pattern Analysis and Machine Intelligence, IEEE Transactions on, 2002, 24(4): 509-22.
[12] Elad M. On the origin of the bilateral filter and ways to improve it [J]. IEEE Transactions on Image
Processing, 2002, 11(10): 1141-51.
[13] Ming G, Shiyin Q. Correction of contrast distortion image based on nonlinear transform of histogram [J]. Journal of Beijing University of Aeronautics and Astronautics, 2016, (03): 514-21.
[14] Chan T F, Vese L A. Active contours without edges [J]. IEEE Transactions on Image Processing,
2001, 10(2): 266-77.
[15] Ming-Kuei H. Visual pattern recognition by moment invariants [J]. IRE Transactions on
Information Theory, 1962, 8(2): 179-87.
[16] Khotanzad A, Hong Y H. Invariant image recognition by Zernike moments [J]. IEEE Transactions
on Pattern Analysis and Machine Intelligence, 1990, 12(5): 489-97.
[17] Heikkila J. Multi-scale auto-convolution for affine invariant pattern recognition [C].Proceedings
of the Pattern Recognition, 2002 Proceedings 16th International Conference on, 2002: 119-22
vol.1.
[18] Duan K, Keerthi S S, Chu W, et al. Multi-category Classification by Soft-Max Combination of
Binary Classifiers [C].Proceedings of the Multiple Classifier Systems, International Workshop,
Mcs 2003, Guilford, Uk, June 11-13, 2003, Proceedings, 2003: 125-34.
[19] Platt J C, Cristianini N, Shawe-Taylor J. Large Margin DAGs for Multiclass Classification [J].
Advances in Neural Information Processing Systems, 2000, 12(3): 547--53.
[20] Dollar P, Zitnick C L. Structured Forests for Fast Edge Detection [C].Proceedings of the Computer
Vision (ICCV), 2013 IEEE International Conference on, 2013: 1841-8
Bo HU, Yu-kun JIN, Wan-jiang GU, Jun LIU, Hua-qin QIN*, Chong
CHEN, Ying-yu WANG

Research of User Credit Rating Analysis Technology based on CART Algorithm
Abstract: To address the problem that existing user credit rating analysis technology cannot meet the needs of an integrated electricity payment channel analysis system, this paper puts forward a user credit rating analysis technology based on the CART algorithm, builds a user persona, and designs and realizes a user credit rating analysis model.

Keywords: Credit Rating; CART Algorithm; Decision Tree; Data Mining

1 Introduction

With the development of diverse payment channels and payment methods, the original "single mode" [1], in which power customers pay at the agency's business address, has been broken, and the integrated payment access management platform [2] has risen. As the number of users paying over the network grows, network security breaches happen frequently, and the credit problems of network users need to be solved. Designing a reasonable mechanism to avoid the risk of electricity fee recovery and to reduce the management risk of the power supply enterprise therefore has practical significance.
With the rapid development of society, information technologies such as computers, networks and communication are developing quickly, and the number of bills paid through the integrated payment access platform has greatly increased. This demand drives the application of a new technology, data mining, in the integrated payment platform [3]. Data mining research covers many fields and methods. The decision tree algorithm belongs to the category of machine learning; it is a technique for building classification models [4,5]. Because its model structure and generated rules are simple and it offers a high degree of automation, the decision tree has long been a popular classification technique. There are many decision tree algorithms [6], including the CART (Classification and Regression Trees) algorithm. The CART algorithm is a nonparametric statistical method mainly used for classification; it can handle both continuous and categorical variables.

*Corresponding author: Hua-qin QIN, Beijing Kedong Electric Power Control System Co. Ltd, Beijing
100085, China, E-mail: 252765032@qq.com
Bo HU, Yu-kun JIN, Wan-jiang GU, Jun LIU, State Grid Anshan Electric Power Supply Company,Anshan
114000, China
Chong CHEN, Ying-yu WANG, Beijing Kedong Electric Power Control System Co. Ltd, Beijing 100085,
China

Its primary goal is to construct an accurate classification model for forecasting, and to study how the explanatory variables cause the classification phenomenon and how the variables interact. By establishing the decision tree and its decision rules, the category of an unknown object can be forecast; that is, the type of an object can be judged from the values of a number of relevant variables.
This paper stratifies power users based on the CART algorithm; by applying different electricity fee recovery measures to users with different credit ratings, the risk of electricity fee recovery can be mitigated.

2 Decision Tree CART Algorithm

The decision tree can be imagined as a tree, as shown in Figure 1. Every node in the tree corresponds to a sample attribute; this attribute is chosen as the best split attribute according to some split criterion, and the original data set is divided into two parts according to the splitting attribute. Each branch corresponds to a split condition, and the node it connects to fully complies with that condition. Each leaf node carries the category label of the subset of data that reaches it from the root node through the intermediate splits. Each built decision tree corresponds to a number of rules, and each rule records, in turn, the splitting attributes and conditions from the root node downward. Once the decision tree is built, the corresponding rule set is uniquely determined, and this rule set can be used to predict the classification of data sets with the same set of attributes.

(Figure 1 shows a root node split by conditions 1 and 2 into a middle node and a leaf node, with the middle node further split by conditions 3 and 4 into leaf nodes.)
Figure 1. Decision tree models

CART was put forward by Leo Breiman as early as 1984 and is now widely applied in various fields. The difference between this algorithm and the C4.5 algorithm is that it uses the Gini index as the standard for choosing the best split attribute, and the tree is built recursively.
The Gini index is a measure of the purity of a sample set; a small value means the set is pure. For a known sample set T, the Gini index is

Gini(T) = 1 - Σ_{i=1}^{N} p_i²                                   (1)

where p_i is the probability of belonging to class i and N is the number of categories.
If the data set T is split on attribute A, dividing the original data set T into the two subsets T1 and T2, the Gini value of the split is

Gini_split(A)(T) = (|T1| / |T|) Gini(T1) + (|T2| / |T|) Gini(T2)   (2)

where Gini(T1) and Gini(T2) are calculated by expression (1).
In the process of constructing the decision tree, the best split attribute is the one whose split yields the largest decrease in impurity, computed as

ΔGini(A, T) = Gini(T) - Gini_split(A)(T)                          (3)

where Gini(T) and Gini_split(A)(T) are calculated by expressions (1) and (2).
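Expressions (1)-(3) can be computed directly; the short sketch below (with illustrative labels, not real payment data) shows the Gini index of a sample set and the impurity decrease of a candidate binary split.

import numpy as np

def gini(labels):
    """Gini index of a label array, expression (1)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_split(left, right):
    """Weighted Gini value of a binary split, expression (2)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Impurity decrease of a candidate split of sample set T, expression (3)
T = np.array([1, 1, 0, 0, 0, 1, 0, 0])
T1, T2 = T[:4], T[4:]
delta_gini = gini(T) - gini_split(T1, T2)
print(delta_gini)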
CART model construction flow chart is shown in Figure 2:

(Flow: Start → input training samples → select the optimal threshold → choose the best split attribute based on the Gini value → take the best split attribute as the root node → extract the "if-then" rule set → End.)
Figure 2. CART algorithm flow chart



3 Use CART Algorithm to Build User Credit Rating Analysis Model

3.1 Typical Behavior Analysis of Payment User Group

The typical behavior analysis of payment user group is based on the data which is
offered by the investigation research and power supply company, analyzes and
synthesizes the investigation research results and data, and to prepare for the
established data model of user group. To analysis the behavior, firstly need to use
feature weight optimization method to adjust the weights in the individual user
persona optimization, to get the best individual adjusted user persona, then clustering
and modeling the optimal individual user persona, finally get the group user persona
and data model.
The payment user group typical behavior analysis is mainly based on the payment
data supplied by questionnaire survey and the power supply company, including
analysis content:

Age | Name | Sex | Address | Payment Habits
Analyze users' ages and their influence on the manner of payment | Replaced by user id number (user property) | Replaced by 0/1 (user property) | Analyze the effect of the payment place | Analyze the user's payment methods

3.2 Building the User Persona

The user persona is an effective tool that sketches the target user and connects user demands with the design direction; it has been widely used in various fields. In actual operation, the simplest and most life-like words tend to be used to describe the user's attributes, behavior and expectations. As a virtual representative of actual users, the user persona is constructed from the product and the market [7]. The formed user roles need to represent the primary audience and target group of the product. A user persona needs to be based on real data. When there are several user personas, their priority needs to be considered, and the user persona is constantly modified.
The core work of building a user persona is labeling the users. The user labels have already been defined in the questionnaire. These labels are succinct and simple, which makes label extraction and cluster analysis convenient.
For users' electricity payment, the user persona can be divided into three levels: the first level is the investigation and analysis of user communities, the second level is the data analysis of individual descriptions, and the third level is the development and application of the abstracted data model (Figure 3).

(Figure 3 pyramid, from bottom to top: group quantitative statistical analysis; representational individual qualitative description; data modeling and application development.)

Figure 3. User persona levels

3.3 Build a Model

After the behavior analysis of the users' payments, a typical preference model of the user group needs to be built: the CART algorithm groups users with similar properties together as one class, users in different classes have different attributes, and a model is built for each type of user. In the tree growth stage, the CART model uses the Gini index as the branching criterion, and K-fold cross-validation is chosen as the validation method in the selection phase [8,9]. The other parameters are set as follows: variables are not weighted, that is, no variable is emphasized and the importance of a variable's value is not increased according to its frequency; the maximum tree depth is set to 5 layers; the prior class probabilities used in training follow the class distribution of the training data and are corrected with the cost matrix; the collected variable values have no missing values, so the missing-value handling parameters do not need to be set; tree growth stops when the number of objects in the parent node is less than 2% of the total number of objects, or the number of objects in a child node is less than 1% of the total number of objects; pruning is selected, so that when the branches of a node do not clearly improve the accuracy of the model, all branches under this node are automatically deleted. After the training set data are fed through the CART algorithm data flow, the CART decision tree model is built. This model can be applied to the prediction of future data, to the analysis of user preferences in software, and to decisions about the construction of payment channels.
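A hedged scikit-learn sketch of this configuration is given below: Gini splitting, depth limited to 5, the 2%/1% stopping rules expressed as sample fractions, and cost-complexity pruning as a stand-in for the pruning step. The data and the ccp_alpha value are placeholders, not the authors' settings.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Placeholder user factors and arrearage labels (synthetic, for demonstration)
rng = np.random.default_rng(0)
X = rng.random((500, 5))
y = (X[:, 0] > 0.5).astype(int)

cart = DecisionTreeClassifier(
    criterion="gini",
    max_depth=5,              # highest tree depth of 5 layers
    min_samples_split=0.02,   # stop when a parent node holds < 2% of all objects
    min_samples_leaf=0.01,    # stop when a child node would hold < 1% of all objects
    ccp_alpha=1e-3,           # cost-complexity pruning (illustrative value)
)
scores = cross_val_score(cart, X, y, cv=5)   # K-fold cross-validation
print(scores.mean())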

4 The Realization of The User’s Credit Rating Analysis Technology

4.1 Data Training

After the user persona sets up work, can get the attributes collection related to the
classification. With the preprocessing of quantification and normalization on these
user factors, then converting to digital information as the input vector of the network.
Through the mining system intelligent configuration, adopt two hidden layer, each
hidden layer has a network with 20 nodes [10,11]. The output of the network is the
user’s arrearage risk judgment corresponding to the input factor. In the training
guide, if the user has arrearage, the risk sets 1, otherwise 0. Network learning factor is
η = 0.3. The Initialized to inertia coefficient is α0 = 0.5, αmax = 0.9.

4.2 Comparing with other methods

The CART algorithm orders the values of continuous attributes. In order to show that the CART algorithm is preferable, it is compared with a rough set based method [12,13]. The primary difference between CART and the rough set based algorithm lies in the criterion used to select attributes, as shown in Figure 4 and Figure 5. As can be seen from the figures, the decision tree in Figure 4 has eight leaf nodes, which generate eight rules, while the decision tree in Figure 5 has seven leaf nodes, which generate seven rules. By comparison, the CART algorithm reduces the complexity of the generated decision tree and generates fewer rules, which effectively improves the classification effect; and because the number of nodes in the decision tree is reduced, it also reduces the storage space.

Figure 4. Approximation precision classification decision tree



Figure 5. The CART decision tree

4.3 Higher Consumption - Higher Credibility Mode

Through mining, the system found a "higher consumption, higher credibility" pattern among power users. Specifically, in public electricity consumption behavior, enterprise users whose electricity consumption is substantial and sustained over a long period tend to have a good reputation, and their risk of arrearage is relatively small. From the study and survey of user groups, the conclusion is as follows: users with increasing power consumption and high demand, whose expected meter readings differ greatly and jump, are probably carrying out work or production on a large scale; for them a stable electricity supply is very important, and therefore, for the sake of production efficiency, this kind of user tends not to default on their costs.

5 Conclusion

In the study of data mining, modeling with a mining algorithm is a very important step. This paper discusses the operating principle of the CART algorithm and the establishment and implementation of the user credit rating model, and obtains the "higher consumption, higher credibility" pattern. Thus, the user credit rating analysis technology based on the CART algorithm has practical value [14].

References
[1] Huang Guiping. Diversification options for payment of electricity [J].Power Management and
Sale, 2010, (01):8-9.
[2] Liu Tao, Zhao Xingyuan, Liu Bin. Research and Implementation of Integrated Pay Management
Platform Based on SOA[J].Network Security Technology and Application,2012,(5):63-65.
[3] Yao M. and Wang W. H., Research on Generalized Computing Model, Proceedings of the 1998
International Conference on Neural Networks and Brain Science, 1998.
[4] Wang Mengxue. Summary of Data Mining [J].Software Guide, 2013, 12(10):135-137.
[5] Quinlan J R. “Induction of Decision Trees”, Machine Learing, 1986.
[6] Leo Breiman, Jerome H. Friedman, Richard A. Olshen and Charles J. Stone, Classification and
Regression Trees, USA, Wadsworth, Inc. 1984.
[7] Zhu Ying. Study on power customer's credit assessment methods [D].Shanghai: School of
Electronics and Electrical Engineering of Shanghai Jiao Tong University, 2010:4-6.
[8] David Feldman. Mortgage Default: Classification Trees Analysis [J]. The Journal of Real Estate
Finance and Economics, 2005, 30(4), 369-396.
[9] Sergios Theodoridis, Konstantinos Koutroumbas et al. Pattern recognition [M]. China Machine Press, 2006.
[10] Krishnan R. Sivakumar G. and Bhattacharya P., “Extracting Decision Trees from Trained Neural
Networks”, Pattern Recognition, 1999.
[11] Etheridge H L, Sriram R S. A Comparison of the Relative Costs of Financial Distress Models:
Artificial Neural Networks, Logit and Multi-variate Discriminant Analysis [J].Intelligent Systems
in Accounting, Finance and Management, 1997, 6(3): 235-248.
[12] Zhao Weidong, Sheng Zhaohan, He Jianmin. Application of Rough Sets to the Designing
of Decision Trees [J]. Journal of Southeast University (Natural Science Edition), 2000, 30
(4):132-137.
[13] Shi Yaming, He Jianmin. Application analysis of credit evaluation based on rough set [J].Modern
Management Science, 2005, (5):12-15.
[14] Wu Qiu hua. Research on small business credit scoring based on rough set-CART model
[D].Heilongjiang: School of Economy and Management of Harbin Institute of Technology,
2013:13-42.
Cong-ping CHEN, Yan-hua RAN*, Jie-guang HUANG, Qiong HU,
Xiao-yun WANG

Research of the Influence of 3D Printing Speed on Printing Dimension
Abstract: In the 3D printing process, the forming dimension of the extruded fluid material directly affects the printing precision. For a given material, the printing speed is one of the important parameters affecting the forming dimension. Through numerical simulation, a dynamic three-dimensional model of the 3D printing process is established, and the spreading and forming dimensions of the gelatin material under different printing speeds are compared based on the volume of fluid (VOF) method, yielding the influence of the printing speed on the gelatin forming dimension. The results show that both the forming height and the forming width of the material decrease with increasing printing speed, and the impact of the printing speed on the forming width is more obvious.

Keywords: 3D printing; printing speed; forming dimension

1 Introduction

3D printing constructs an entity by superimposing material layer by layer; compared to subtractive manufacturing, it offers faster processing speed, higher forming precision and a higher degree of personalization. It is gradually being applied in machinery, electronics, aerospace, biotechnology, medical and other fields, and has been successfully used to manufacture a variety of entities [1]. According to the forming principle, 3D printing can be divided into three categories: laser molding, printing molding and extrusion molding. The extrusion molding method usually uses pressure pulses to extrude a high viscosity melt material from the nozzle onto the substrate, where the material deposits and cures. Owing to its wide material adaptability and controllability, this method has been widely used in 3D printing. The typical extrusion-based 3D printing principle is shown in Figure 1.

*Corresponding author:Yan-hua RAN, College of Mechanical & Power Engineering, China Three Gorges
University, Yichang, China, E-mail: mechencp@163.com
Cong-ping CHEN, Jie-guang HUANG, Qiong HU, Xiao-yun WANG, College of Mechanical & Power
Engineering, China Three Gorges University, Yichang, China

(Figure 1 components: pressure controller, compressed air, material in a micro-syringe, substrate, XY motion platform and Z motion platform.)

Figure 1. Schematic diagram of extrusion-based 3D printing

However, the quality of extrusion-based 3D printing is influenced by many factors, among which the precision of the printing dimension is one of the main factors affecting the printing quality. In recent years, a great deal of research has studied the factors that affect forming precision. For instance, Zhang [2] presented experimental investigations on the influence of important process parameters on the dimensional accuracy of FDM. The Taguchi method and the orthogonal test method are used in the experiments to obtain the optimum level of process parameters in the single-objective situation, and the grey Taguchi method is adopted to obtain the optimum level of process parameters for multiple responses by minimizing the percentage change, finally forming a complete FDM process based on parameter optimization, which has guiding significance for both process accuracy and the selection of relevant parameters. Gao [3] studied the dimensional accuracy of molded parts in the FDM melt accumulation process; among the many influencing factors, a robust design method is used to design the experimental parameters and to seek the combination of parameters that ensures the dimensional precision of the molded parts, providing a basis for later equipment and process parameter optimization. Chen et al. [4] studied several factors influencing filament width in the FDM molding process, analyzed in particular the relationship between extrusion speed and filament width, gave the relationship between output voltage and filament width through experiments, and then discussed a proper compensation method for the FDM prototype size. Yang et al. [5] proposed a slice parameter optimization design method based on orthogonal experiments in order to improve printing precision and molding quality, chose the optimal parameter ratio by orthogonal experiment and verified the effectiveness of the proposed method with examples.

This paper mainly studies the influence of the 3D printing speed on the printing size of the material. First, a geometric model of the material extrusion and spreading process is established; on this basis, using the VOF method and gelatin as the printing material, the deformation phenomena in the gelatin extrusion and spreading process under different printing speeds are compared, to obtain the influence of the printing speed on the spreading morphology.


2 Modeling of forming process

The material extruded in 3D printing takes a linear (filament) form, and the width and height of the material are the two indicators generally used to describe the forming size of the filament. Ignoring the effect of the gravity of the material itself, the outflow rate of the spout can be expressed as

Q = dV/dt                                       (1)

where V is the material deposition volume and t is the printing time. According to the characteristics of a viscous fluid flowing in a pipe, using the Poiseuille equation, the outflow rate can also be expressed as

Q = (π r⁴ / (8 μ)) (dp/dz)                      (2)

where μ is the dynamic viscosity and dp/dz is the total pressure drop. The cross-section of the fluid formed on the substrate can be approximately regarded as a semi-ellipse, so the volume V extruded in the unit time t is expressed as

V = (π / 4) a h l                               (3)

where a is the filament width, h is the filament height, and l is the filament length on the substrate. During the printing process, the substrate generally moves uniformly in the horizontal direction, so the horizontal velocity υ of the substrate can be expressed as

υ = l / t                                       (4)

According to (1) to (4), the filament width can be expressed as

a = (r⁴ / (2 μ υ h)) (dp/dz)                    (5)
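To make expression (5) concrete, the short sketch below evaluates the filament width for illustrative values; the pressure gradient dp/dz is a hypothetical number chosen only for demonstration, not a value reported in the paper.

def filament_width(r, dp_dz, mu, h, v):
    """Filament width from expression (5): a = r**4 * (dp/dz) / (2 * mu * v * h)."""
    return r**4 * dp_dz / (2.0 * mu * v * h)

# Illustrative inputs: channel radius 0.3 mm, gelatin viscosity 1.05 Pa*s,
# filament height 0.33 mm, substrate speed 0.02 m/s, assumed dp/dz of 8.5e5 Pa/m.
print(filament_width(r=0.3e-3, dp_dz=8.5e5, mu=1.05, h=0.33e-3, v=0.02))  # ~5e-4 m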

3 Geometric modeling and parameter setting

In the 3D printing process, the substrate (or the print nozzle) has to be moved to obtain the structure and shape we want to print. Taking the movement of the substrate as an example, the typical 3D printing process is set up as follows: the nozzle inlet drive pressure is 0.2 MPa, the flow channel radius is r = 0.3 mm, the substrate size is 4 mm × 4 mm × 1 mm, the distance between the nozzle outlet and the surface of the substrate is 0.6 mm, and the total horizontal displacement of the substrate is s = 3 mm.

The printing material is gelatin, and its related physical parameters are shown in Table 1.

Table 1. Physical parameters of the gelatin material

Parameter | Value
Density ρ (kg/m³) | 1206
Dynamic viscosity μ (Pa·s) | 1.05
Surface tension coefficient σ (N/m) | 0.07
Thermal conductivity kl [W/(m·K)] | 4.91
Thermal conductivity ks [W/(m·K)] | 132
Specific heat CPl [J/(kg·K)] | 1721
Specific heat CPs [J/(kg·K)] | 1458
Latent heat of solidification L (J/kg) | 110238
Melting point Tm (K) | 300
Contact angle α (°) | 90

4 Simulation results and analysis

Firstly, the typical 3D printing process is simulated for the example substrate movement speed υ = 0.02 m/s; the software is FLOW-3D 10.1. The results are shown in Figure 2. Under the drive pressure the material is extruded from the nozzle at 80 ms and contacts the substrate; meanwhile, the substrate starts to move horizontally at a constant speed of 0.02 m/s, and the material then spreads continuously on the substrate until 240 ms, when a 9 MPa negative pressure is applied and the substrate moves 2 mm downward at a uniform speed of 0.025 m/s so that the material is pulled off from the nozzle.

(Front view and plan view at 80 ms, 120 ms, 180 ms and 240 ms.)

Figure 2. Material extrusion and pull off process diagram in printing process

(Width and height of the extruded fluid at υ = 0.015 m/s, 0.02 m/s, 0.025 m/s and 0.03 m/s.)

Figure 3. Fluid extrusion morphology at different substrate speed

To study the influence of different substrate movement speeds on the printing results, we also simulate the cases where the movement speed of the substrate is 0.015 m/s, 0.025 m/s and 0.03 m/s; the final shapes for the four speeds (including 0.02 m/s) are shown in Figure 3. For comparison, the simulation results are processed with the post-processing software Ensight 10.0, and the width and height data of the formed material in each case are obtained and plotted; the results are shown in Figure 4. Combining Figure 3 and Figure 4, we can find: 1) at a given substrate motion speed, the width of the extruded filament is uneven, and the height and width at the beginning and end of the printing process are relatively larger, mainly because the horizontal movement of the substrate is relatively slow at the beginning and end, resulting in a relatively large accumulation of material. However, it must be pointed out that the length of an actual printed filament will be significantly greater than the length of the filament formed at the beginning and end of printing; in the example given here, the geometric parameters of the intermediate section of the filament should therefore be taken as the valid data; 2) at a lower speed (such as υ = 0.015 m/s), the filament width is significantly greater than the flow channel diameter; with increasing printing speed, the filament width gradually decreases and becomes less than the diameter of the flow channel. This is because the speed υ = 0.015 m/s is too small relative to the physical properties of the material and the applied pressure, so that too much material accumulates.
Further, in order to obtain the average height and width of the filament at different speeds, the mean of multiple height and width measurements in each stationary spreading stage is computed: when υ = 0.015 m/s, the mean height is H = 0.35 mm and the mean width is W = 0.59 mm; when υ = 0.02 m/s, H = 0.33 mm, W = 0.47 mm; when υ = 0.025 m/s, H = 0.32 mm, W = 0.44 mm; when υ = 0.03 m/s, H = 0.32 mm, W = 0.42 mm. This shows that the speed of the substrate influences the printed filament height and width, both of which decrease with increasing speed, while the impact on the width is more obvious. Therefore, in the actual 3D printing process, a reasonable substrate speed needs to be set to obtain a reasonable filament dimension.

(Height and width distributions at υ = 0.015 m/s, 0.02 m/s, 0.025 m/s and 0.03 m/s.)
Figure 4. Height and width distribution of the material along X direction at different substrate speed

5 Conclusion

In the 3D printing process, the printing speed has an impact on the material forming height and width; both dimensions decrease with increasing speed, but the influence of the substrate speed on the width is significantly greater than on the height. In the actual printing process, a reasonable substrate speed needs to be set to avoid an excessive spreading width or significant shrinkage, so as to obtain a filament with the ideal dimension and a uniform shape.

Acknowledgment: The research was supported by the National Natural Science Foundation of China (No. 51475266, 51005134).

References
[1] Long Deyang, Process Parameters Research Of The Desktop FDM, D. 2014.
[2] Zhang Yuan, Study on Process Precision Of Fused Deposition Modeling, D. 2009.
[3] GAO Shanping, Optimization Analysis of Dimensional Accuracy of FDM Shaped Parts Based on
Robust Design, J. Modern Manufacturing Technology and Equipment. 2016.0386
[4] CHEN Yaping, YE Chunsheng, HUANG Shuhuai, Factors Analysis of Extruded Filament Width in
FDM Molding Process, J. SPECIAL FORMING. 2007.02.028.
[5] Yang Jiquan, Xu Luzhao, Li Cheng, Wang Jingxuan, Yin Yanan, Research of Process Parameters
Optimization of Molding Quality Based on FDM, J. JOURNAL OF NANJING NORMAL UNIVERSITY.
2013, 13(2):1-6.
Fan ZHANG*, Fan YANG
Research on Docker-based Message Platform in IoT
Abstract: The Internet of Things (IoT) represents the next step towards the digitisation of our society and economy, where objects and people are interconnected through communication networks and report on their status or the surrounding environment. A potential obstacle to IoT technology lies in the capacity to handle a large diversity and very large volumes of connected devices with different protocols, and in the need for a general message platform with high availability that can scale quickly so that more kinds of devices can be plugged into the system. In this paper, based on Docker containerization technology, we propose a general message platform for IoT. By introducing clustering technologies in the business layer, the service layer and the data access layer, we secure the high availability of the platform. To decouple the business logic, we introduce RabbitMQ to deal with the diverse data collected from different kinds of devices. Considering the scaling out of our platform, we use Docker virtualization technology to achieve fast application deployment.

Keywords: Message Platform; Docker; High Availability

1 Introduction

A common scenario in IoT is that devices with unique identifiers transfer data [1] over a network to the cloud. The cloud based data center deals with the received data [4], and, given the specific business logic, sends specific command messages back to the devices. Traditionally, one socket server application is built to handle connections to the devices, and all the business logic is put into the server layer. As more devices with different protocols are plugged into the system, the business logic in the server layer becomes very complex and hard to maintain. Apart from decoupling the business logic in the application, we also need to secure the high availability, scalability and fast deployment of the platform.
In this paper, we propose a generalized message platform to communicate with client devices. All the devices connect to the platform via HTTP connections or long-lived TCP connections, which covers most scenarios in IoT systems. As shown in Figure 1, in the access layer we define two message handling modules. The msg-gate module focuses on content filtering: white lists are provided in this layer, and with the msg-gate module, spam messages from clients are rejected and sensitive messages

*Corresponding author: Fan ZHANG, Beijing Institute of Radio Metrology and Measurements, Beijing, China, E-mail: zhangfan1212@126.com
Fan YANG, Beijing Institute of Radio Metrology and Measurements, Beijing, China

delivered to clients are filtered. Common security strategies are added to the msg-gate module [3]. To decouple the concrete message handling business logic from the platform, a message queue is introduced. Messages passing through the msg-gate module are published to the message queue under different topics, and by subscribing to the corresponding topics, app server clusters consume the messages they are interested in. MQ [2] technology improves the platform design in terms of scalability. Apart from handling real-time uploaded messages, a generalized message platform should also meet the demand for real-time message delivery. To achieve this goal, RPC interfaces are defined in the msg-logic module; by invoking these RPC interfaces, app servers can send messages to online clients at any time. In the platform architecture, data storage is designed to cache the online user data and the sent messages that have not yet been acknowledged by clients.

(Figure 1: clients connect to the access layer, where the msg-gate and msg-logic modules are backed by Redis, a database and the MQ; app servers App-server0 ... App-serverN implement the business logic App0 ... AppN.)
Figure 1. Generalized Message Platform Architecture

2 Message Platform design

As shown in Figure 2, the generalized message platform is composed of four layers: the client layer, the reverse proxy layer, the application layer and the data access layer. To achieve high availability, a distributed architecture is designed in each layer. In the client layer, client devices first get a virtual IP address from a round-robin DNS server, and the client requests are then directed to the specific Nginx server with the corresponding IP. In the reverse proxy layer, Nginx is used to distribute the load among the application servers. As mentioned in section 1, the application layer filters the incoming messages, publishes them to the MQ cluster and offers RPC interfaces to the app servers.

(Figure 2: a client resolves a domain name to an IP address through DNS, two Nginx servers form the reverse proxy layer, the msg-gate and msg-logic modules form the application layer, and a database and Redis form the data access layer.)
Figure 2. Message Platform With High Availability Architecture

In most IoT scenarios, the message platform should be capable of handling many concurrent requests. To balance the server load, the system distributes requests to different nodes within the server cluster, with the goal of optimizing system performance; this results in higher availability and scalability. In the client layer, the round robin DNS method is introduced to support a growing number of device requests.
Round robin DNS [5] is a technique of load distribution or load balancing that assigns requests from clients according to an appropriate statistical model. In our implementation, round-robin DNS works by responding to DNS requests with one address out of a list of potential IP addresses corresponding to the several servers that host the Nginx services. The order in which IP addresses from the list are returned is the basis for the term round robin: with each DNS response, the IP address sequence in the list is permuted. Usually, basic clients attempt connections with the first address returned from a DNS query, so that on different connection attempts clients receive service from different Nginx servers, thus distributing the overall load among the Nginx servers.

(Figure 3: a client resolves www.cyberpipe.com through DNS, which returns 10.50.10.11 or 10.50.10.12 in turn, directing the request to one of the two Nginx servers in the reverse proxy layer.)
Figure 3. Round Robin DNS in Client Layer



In the reverse proxy layer, Nginx is used as the reverse proxy server. Nginx is typically used to distribute the load among several servers [6, 7]; in our case, it passes requests to the application servers for processing over the corresponding protocols. To balance the application server load with Nginx, the different upstream IP addresses are configured in the nginx.conf file; the multiple IP addresses represent the machines in the application server cluster.
Although Nginx can distribute traffic of several protocols across backend servers, in our case we usually only deal with TCP traffic and HTTP traffic: for most sensors we prefer reliable TCP connections, and for mobile devices we usually transfer the data over HTTP connections. Since HTTP is a "stateless" protocol, session persistence becomes a common challenge for any transaction that involves two or more requests. Although session persistence is supported and all of a user's requests could be directed to the same server, we design our applications to store state externally, considering system scaling in a distributed system. By storing the session object in Redis, session data can be shared in the cluster.
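An illustrative nginx.conf excerpt along these lines is shown below; the upstream addresses follow Table 1, while the port, the single server block and other directives are assumptions rather than the authors' actual configuration.

http {
    upstream msg_servers {
        # the two message servers of Table 1; the port is an assumption
        server 172.17.0.5:8080;
        server 172.17.0.6:8080;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://msg_servers;   # distribute requests over the upstream group
        }
    }
}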

(Figure 4: nginx.conf drives the Nginx reverse proxy layer, which forwards requests to the msg-gate and msg-logic modules in the application layer.)
Figure 4. Nginx Reverse Proxy in Reverse Proxy Layer

For a scalable generalized message platform, two main issues need to be addressed: one is dealing with the complex uploaded data, the other is delivering messages to clients reliably. To solve the first problem, we introduce the message queue technique to decouple the business logic, which allows fast scaling. By caching messages while delivering them, reliable message delivery is achieved. The message flow is shown in Figure 5. After the incoming messages are filtered by the msg-gate module, the msg-logic module publishes them to the message queue with the associated message topic. In the app server cluster, each app server consumes the messages whose topics it has subscribed to in advance. Whenever an app server needs to send a message to a client, it invokes the RPC interfaces defined in msg-logic. Once it receives a message sending request, the msg-logic module checks the device online status cached in Redis. If the target client is online, msg-logic caches the outgoing message first and then sends it to the client; cached messages are deleted only after the platform receives an acknowledgement message from the client. If the target client is offline, the cached messages will be delivered the next time the client logs in to the platform.
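A hedged Python sketch of the publish/subscribe part of this flow, using the pika client for RabbitMQ; the exchange, queue, routing-key and host names are illustrative assumptions, not the authors' configuration.

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="172.17.0.7"))
channel = connection.channel()
channel.exchange_declare(exchange="iot.messages", exchange_type="topic", durable=True)

# msg-logic publishes a filtered upload under a device-type routing key
channel.basic_publish(exchange="iot.messages",
                      routing_key="sensor.temperature",
                      body='{"device_id": "t-001", "value": 23.5}')

# an app server binds its queue to the topics it has subscribed to
channel.queue_declare(queue="temperature_app", durable=True)
channel.queue_bind(queue="temperature_app", exchange="iot.messages",
                   routing_key="sensor.*")
connection.close()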

(Figure 5: uploaded messages pass from msg-gate to msg-logic, are published to the MQ and consumed by App-server0 ... App-serverN; outgoing messages requested by the app servers are cached in the database/Redis by msg-logic before being delivered to the clients.)
Figure 5. Message Flow in Application Layer

3 Experimental Results

To set up a testing environment for our message platform, we choose the Docker virtualization container technique. Docker is an open platform for developers to build, ship, and run distributed applications, and it takes just a few seconds to create a new Docker container. We set up our experiment environment as shown in Table 1.

Table 1. Experiment Setup

Server Type | Count | IP | Version
DNS Server | 1 | 172.17.0.2 | Ubuntu 15.10
Nginx Server | 2 | 172.17.0.3/4 | Nginx 1.9.9
Message Server | 2 | 172.17.0.5/6 | Ubuntu 15.10
MQ Server | 2 | 172.17.0.7/8 | RabbitMQ 3.6.3
App Server | 2 | 172.17.0.9/10 | Ubuntu 15.10
Mysql Server | 2 | 172.17.0.11/12 | Mysql 5.6.17
Redis Server | 2 | 172.17.0.13/14 | Redis 2.8
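For illustration, containers of this kind can be started with commands such as the following; the image names, tags and options are assumptions and not the authors' deployment scripts.

docker run -d --name nginx1 nginx:1.9.9
docker run -d --name nginx2 nginx:1.9.9
docker run -d --name mq1 rabbitmq:3.6
docker run -d --name redis1 redis:2.8
docker run -d --name mysql1 -e MYSQL_ROOT_PASSWORD=secret mysql:5.6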



In the client layer, seven different sensors are selected, including a temperature sensor, a pressure sensor, a humidity sensor, an oxygen sensor, a water flow sensor and a noise sensor. These sensors periodically upload the collected data to the message platform. The platform sends back an acknowledgement message to the clients, and once a client receives the acknowledgement message, it sends a session-end message to the platform. To test the high availability of the platform, we stop some of the servers in the cluster of each layer. The experiment results, described in Table 2, show that the message platform can provide continued service when a particular server crashes.
To test the platform scalability, we add a new accelerometer to the platform. Without modifying the existing architecture, we simply add a new app server which subscribes to the accelerometer topic from the MQ.

Table 2. Experimental Results

Failure Test | Status | System Status
DNS Server Failure Test | 172.17.0.2 down | fail
Nginx Failure Test | 172.17.0.3 up, 172.17.0.4 down | success
Message Server Failure Test | 172.17.0.5 up, 172.17.0.6 down | success
MQ Server Failure Test | 172.17.0.7 up, 172.17.0.8 down | success
App Server Failure Test | 172.17.0.9 up, 172.17.0.10 down | success
Mysql Server Failure Test | 172.17.0.11 up, 172.17.0.12 down | success
Redis Server Failure Test | 172.17.0.13 up, 172.17.0.14 down | success

4 Conclusion

In this paper, a generalized message platform has been proposed. First, via the MQ technique, the message handling business logic is decoupled from the platform. Then a highly available and scalable architecture based on a distributed system is described. We use Docker virtualization containers to test the high availability and scalability. The experiment results show that, thanks to the HA design, the message platform can provide continued service when a particular server crashes. Despite the scalable design, there is still room for improvement in alleviating the system bottleneck: the heavily loaded DNS server in the client layer will be addressed in follow-up research.

References
[1] Reetu Gupta, Rahul Gupta, "ABC of Internet of Things: Advancements, benefits, challenges, enablers and facilities of IoT", 2016 Symposium on Colossal Data Analysis and Networking (CDAN), DOI: 10.1109/CDAN.2016.7570875
[2] Maciej Rostanski, Krzysztof Grochla, Aleksander Seman, “Evaluation of highly available
and fault-tolerant middleware clustered architectures using RabbitMQ”, Computer Science
and Information Systems (FedCSIS), 2014 Federated Conference on, pp. 879 - 884, DOI:
10.15439/2014F48
[3] Minh Thanh Chung, Nguyen Quang-Hung, Manh-Thin Nguyen, Nam Thoai, “Using Docker in high
performance computing applications”, 2016 IEEE Sixth International Conference on Communi-
cations and Electronics (ICCE), pp.52-57, DOI: 10.1109/CCE.2016.7562612
[4] Kun Wang, Yuhua Zhang, Yue Yu, Yan Li,” Design and optimization of socket mechanism for
services in Internet of Things”, 2013 22nd Wireless and Optical Communication Conference, pp.
327 - 332, DOI: 10.1109/WOCC.2013.6676387
[5] Chi-Chung Cheung, Man-Ching Yuen, A. C. H. Yip, “Dynamic DNS for load balancing”,
Distributed Computing Systems Workshops, 2003. Proceedings. 23rd International Conference
on, pp. 962-965, DOI: 10.1109/ICDCSW.2003.1203676
[6] Marijana Vujović, Milan Savić, Dejan Stefanović, Ištvan Pap, “USAGE OF NGINX and websocket
in IoT”, Telecommunications Forum Telfor (TELFOR), 2015 23rd, pp. 289 - 292, DOI: 10.1109/
TELFOR.2015.7377467
[7] Xiaoni Chi, Bichuan Liu, Qi Niu, Qiuxuan Wu, “Web Load Balance and Cache Optimization
Design Based Nginx under High-Concurrency Environment”, Digital Manufacturing and
Automation (ICDMA), 2012 Third International Conference on, pp. 1029-1032, DOI: 10.1109/
ICDMA.2012.241
Yan-xia YANG
Research on the Recommendation of Micro-
blog Network Advertisement based on Hybrid
Recommendation Algorithm
Abstract: This paper discusses the marketing of enterprise brands and products and, combining it with a personalized recommendation algorithm, proposes and implements a layered hybrid network advertisement recommendation algorithm based on the micro-blog platform. In this algorithm, two kinds of recommendation algorithms are combined in a stacked way: on the basis of the results of a classification-based recommendation algorithm, recommendation is further conducted with an algorithm based on user clustering. Experiments prove that the algorithm can alleviate the data sparsity problem and optimize the recommendation results.

Keywords: Micro-blog Marketing; Recommendation Algorithm; Collaborative Filtering; Naive Bayesian Classification Algorithm

1 Introduction

With the rapid expansion of micro-blogs, the huge commercial value hidden behind them has successfully drawn the attention of many entrepreneurs and scholars, and the emergence of more and more successful micro-blog network marketing cases confirms the value and significance of micro-blogs for marketing [1]. The key to micro-blog marketing is accuracy and efficiency. The precision of recommendation in the early stage of micro-blog marketing is of great importance: a personalized recommendation is required to accurately predict the target audience and to recommend to users the products that match their interest preferences. A truly good personalized recommendation not only recommends products to users but also, while satisfying them, virtually establishes some kind of close connection with them, making users rely on this tailored personalized service and enhancing the user experience [2]. In personalized recommendation, choosing an appropriate and effective recommendation algorithm is of great concern for the recommendation effect. Both in the e-commerce field and in academia, recommendation algorithm research has seen long-term development and application [3].

*Corresponding author: Yan-xia YANG, City College, Wuhan University of Science and Technology,
Hubei, 430083, E-mail: yxy_job@163.com

2 Hybrid Recommendation Algorithm

2.1 Recommendation Algorithm Based on Classification

In this paper, the naive Bayesian classification algorithm is used to classify micro-blog advertisements. The basic method is to look for the classification feature words of a micro-blog and, given that a feature item appears, to calculate the probability of each category according to the feature item, so as to realize the classification [4]. Bayes' theorem is the basis of Bayesian classification; it is a theorem about the conditional probability and marginal probability of random events A and B [5]. Bayes' theorem is shown in (1).

P(A | B) = P(B | A) P(A) / P(B)                    (1)

Conditional independence is the basic assumption of naive Bayes, whose formalized expression is shown in (2):

P(B | A) = P(b1 | A) * P(b2 | A) * ... * P(bn | A)                    (2)

Bayesian classification can be divided into three stages.

The first stage is the preparation stage. The work in this stage is an indispensable condition for naive Bayesian classification. Its main task is to preprocess the data, mainly including data acquisition, Chinese word segmentation, denoising and feature extraction, and then labeling the class attribute of the training set. In this stage, all the samples to be classified are taken as input, and the classification attributes and the training set are produced as output. In the whole process of naive Bayesian classification, human intervention is required only in this first stage, and it has an important impact on the classification results [6].

The second stage is the classifier training stage. This is the process of generating the classifier. It mainly calculates the probability of each class in the training samples, that is, the ratio of the number of occurrences of each category to the total number of samples, and it also calculates the conditional probability of each feature within each class.

The third stage is the application stage. The naive Bayesian algorithm is applied in practice: the data set is classified and the accuracy is estimated.

In this paper, a naive Bayesian classifier based on document frequency is adopted to classify micro-blog network advertising and users. The specific steps are as follows:
Step 1: Text data feature extraction. A web crawler and the micro-blog API are combined to collect micro-blog data; the collected micro-blog text is segmented and tagged with part of speech, and finally the segmented data is denoised.

Step 2: Calculating log P(a|y), the log probability of each feature word in every category after word segmentation; the principle is the ratio of the number of times that feature word a appears in class y to the total number of feature words in class y.
Step 3: Calculating the probability of each category, namely the ratio of the number of times each category appears in all the samples to the total number of samples, and then taking the log, that is, log P(y).
Step 4: After the trainer is completed, finding the category with the largest probability max{P(y1|x), P(y2|x), ..., P(yn|x)} under the feature vector X.
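
To make these steps concrete, the following is a minimal Python sketch (not the authors' implementation) of a document-frequency naive Bayes trainer and classifier; the training data format and the Laplace smoothing constant are assumptions added for illustration.

import math
from collections import defaultdict

def train_naive_bayes(samples):
    """samples: list of (feature_words, category) pairs after segmentation and denoising."""
    word_count = defaultdict(lambda: defaultdict(int))  # word_count[y][a]: times word a appears in class y
    total_words = defaultdict(int)                      # total feature words in class y
    class_count = defaultdict(int)                      # number of samples of class y
    vocab = set()
    for words, y in samples:
        class_count[y] += 1
        for a in words:
            word_count[y][a] += 1
            total_words[y] += 1
            vocab.add(a)
    n = sum(class_count.values())
    log_prior = {y: math.log(class_count[y] / n) for y in class_count}              # log P(y)
    log_cond = {y: {a: math.log((word_count[y][a] + 1) / (total_words[y] + len(vocab)))
                    for a in vocab} for y in class_count}                           # log P(a|y), Laplace smoothed
    return log_prior, log_cond, vocab

def classify(words, log_prior, log_cond, vocab):
    """Return the category with the largest posterior, i.e. max P(y|x) from Step 4."""
    def score(y):
        return log_prior[y] + sum(log_cond[y][a] for a in words if a in vocab)
    return max(log_prior, key=score)

The smoothing is added here only to avoid zero probabilities for unseen words; the paper itself does not specify a smoothing scheme.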

A recommendation algorithm based on classification can be regarded as a semi-personalized recommendation. Although it does not deviate from user interest and can recommend product advertising that users may be interested in, its precision is limited: it can only recommend a certain class or a few categories of product advertising to users, which leads to the same kind of network advertising appearing too frequently in front of users and causing antipathy, so the loss outweighs the gain. Thus, to obtain a more precise recommendation result, further improvement of the algorithm is essential.

2.2 User-Item Rating Matrix

After the rough classification with the Bayesian classification algorithm, the result is optimized by further adopting a collaborative filtering algorithm, so that the recommendation becomes more accurate. The first step is usually to build a user-item rating matrix, compute the similarity among users, then predict the ratings of the target users for unknown items and produce the final recommendation matrix. From the point of view of the user's score, retweeting indicates that the user is interested in the advertised product, and the rating is three; thumbing up indicates a rating of two; a comment cannot determine whether the user's interest is positive, so the rating is set to one; if the user does not react to the micro-blog at all, the value is treated as missing and the rating is 0. Thus the user interest model based on the user-item rating can be obtained, with users U={U1,U2,…,Um} and items I={I1,I2,…,In}, where the value range of Ri,j is [0, 3].
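
A minimal sketch of this mapping from feedback behaviour to the rating matrix is given below; the feedback record format is an assumption made for illustration.

# Rating rule from the text: retweet -> 3, thumb up -> 2, comment -> 1, no reaction -> 0.
SCORES = {"retweet": 3, "thumb_up": 2, "comment": 1}

def build_rating_matrix(users, items, feedback):
    """feedback: iterable of (user, item, behaviour) triples; keep the strongest reaction seen."""
    R = {u: {i: 0 for i in items} for u in users}
    for user, item, behaviour in feedback:
        R[user][item] = max(R[user][item], SCORES.get(behaviour, 0))
    return R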

2.3 Recommendation Algorithm Based on User Clustering

In previous micro-blog recommendation systems, recommending micro-blog content to users required traversing all the micro-blogs in the system, which has become particularly difficult now that the numbers of users and micro-blogs are increasing sharply, so the real-time performance and efficiency that a recommendation system should have are challenged. Therefore, users are first clustered, and content is then recommended to users within the cluster. This paper looks for users' nearest neighbors on the basis of the users' interest model, thus forming the cluster. The specific steps of the algorithm are as follows:

Step 1: Calculating the similarity of score vectors

According to the user-item score matrix obtained in the last section, the similarity between users is calculated by using the cosine similarity method shown in (3).

$sim(i,j) = \frac{\sum_{c \in I_{i,j}} (R_{i,c} - \bar{R}_i)(R_{j,c} - \bar{R}_j)}{\sqrt{\sum_{c \in I_i} (R_{i,c} - \bar{R}_i)^2}\, \sqrt{\sum_{c \in I_j} (R_{j,c} - \bar{R}_j)^2}}$  (3)

Here i and j respectively indicate different micro-blog network advertisements, and $R_{i,c}$ signifies the user's score for the micro-blog content i, which is derived from the user's behavior.

Step 2: Constructing a similar user group

The entire user set is scanned, and the similarity between the target user and each other user is computed. The other users are sorted by similarity value, and the users with the largest similarity values are taken to form a similar user group.

Step 3: Generating recommended items

According to the neighbor users in the user group, recommended items are generated for the target user. The rating of the user's unknown item can be predicted through the weighted ratings of the neighboring users, as given in formula (4):

$P(U_y, I_i) = \frac{\sum_{U_x \in N(U_y)} r_{x,i} \cdot sim(U_x, U_y)}{\sum_{U_x \in N(U_y)} sim(U_x, U_y)}$  (4)

$N(U_y)$ is the set of similar neighbors of the target user.
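
A minimal sketch of Steps 1 to 3 follows, reusing the rating matrix format from the earlier sketch; treating a rating of 0 as missing and the neighborhood size k are assumptions made for illustration.

import math

def mean_rating(R, u):
    rated = [v for v in R[u].values() if v > 0]
    return sum(rated) / len(rated) if rated else 0.0

def similarity(R, i, j):
    """Mean-centred cosine similarity between the rating vectors of users i and j, as in (3)."""
    mi, mj = mean_rating(R, i), mean_rating(R, j)
    common = [c for c in R[i] if R[i][c] > 0 and R[j][c] > 0]
    num = sum((R[i][c] - mi) * (R[j][c] - mj) for c in common)
    di = math.sqrt(sum((R[i][c] - mi) ** 2 for c in R[i] if R[i][c] > 0))
    dj = math.sqrt(sum((R[j][c] - mj) ** 2 for c in R[j] if R[j][c] > 0))
    return num / (di * dj) if di and dj and common else 0.0

def predict(R, target, item, k=5):
    """Predict the target user's rating for an unrated item from the nearest neighbors, as in (4)."""
    sims = {u: similarity(R, target, u) for u in R if u != target}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    rated = [u for u in neighbors if R[u][item] > 0]
    den = sum(sims[u] for u in rated)
    return sum(R[u][item] * sims[u] for u in rated) / den if den else 0.0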

2.4 Hybrid Recommendation Algorithm

On the results of the rough classification matching obtained with the naive Bayesian classification algorithm, the recommendation is optimized by further employing the collaborative filtering algorithm based on user clustering, which yields a hybrid recommendation algorithm of higher accuracy. Firstly, according to the existing categories of micro-blog network advertising, the advertising text and the micro-blogs posted by users are analysed and processed with the naive Bayesian classification algorithm; whether to recommend an advertisement to a user is then determined by whether they belong to the same category. The initial recommendation results obtained through this classification matching are fed back to users, who express their preferences for the recommendation through retweeting, commenting and thumbing up, and the system builds the user-item matrix according to this feedback. The recommendation steps are as follows.
Step 1: According to the user-item score matrix, the similarity between users is calculated with the cosine similarity formula, and a set of similar neighbors of the target user is produced.
Step 2: The ratings of unknown items are predicted from the neighboring users in the similar neighbor set.
Step 3: The predicted items are sorted, and the top N items are recommended to users.

3 Implementation of the Hybrid Recommendation Algorithm in Micro-Blog Network Advertising

To implement the hybrid recommendation algorithm for micro-blog advertising proposed in this paper, the collected micro-blog data is first preprocessed, mainly by Chinese word segmentation, data denoising and feature extraction, which yields the training samples. Bayesian classification is then used to classify the training set. On the basis of the classification matching results, the user-item rating matrix is established according to the users' feedback, so that the final recommendation results are obtained by applying the collaborative filtering algorithm based on user clustering to the scoring matrix.

3.1 Micro-blog Data Crawl

In order to analyze users' interest preferences, the requisite data first needs to be crawled from micro-blog. The crawl mainly collects the micro-blog network advertising text and data on users' behavior such as posting, retweeting, thumbing up and commenting on micro-blogs.

3.2 Chinese Word Segmentation Technology

The micro-blogs and other information collected in the micro-blog system cannot be directly used in the implementation of the algorithm; these pure-text data need to be preprocessed in different ways according to the methods applied. ICTCLAS is used to conduct Chinese word segmentation and part-of-speech tagging on the collected micro-blog network advertising and users' micro-blog texts, and stop words such as "no", "what", "yes" and so on are removed.

3.3 Data Feature Extraction

The document frequency method is adopted to select feature words: words whose frequency is less than 3 or whose emergence rate is more than 95% are removed, in order to get rid of words that are too rare or too common, and the remaining words are taken as feature words. The extracted features are divided into nine categories: fitness, fashion, examination, entertainment, finance, life, science, technology, and tourism. The extraction results of users' micro-blog texts are shown in Figure 1.

Figure 1. Word frequency statistics of feature word categories
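
A minimal sketch of this document-frequency filter is shown below; the thresholds follow the text, while the corpus format is an assumption made for illustration.

def select_feature_words(documents, min_count=3, max_rate=0.95):
    """documents: list of segmented word lists. Keep words that are neither too rare nor too common."""
    n_docs = len(documents)
    counts, doc_freq = {}, {}
    for words in documents:
        for w in words:
            counts[w] = counts.get(w, 0) + 1
        for w in set(words):
            doc_freq[w] = doc_freq.get(w, 0) + 1
    return {w for w in counts
            if counts[w] >= min_count and doc_freq[w] / n_docs <= max_rate}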

3.4 Bayesian Classification

After the training samples are obtained, the probability of each category is calculated according to the feature items, so as to achieve classification. Naive Bayes is used to achieve a rough classification matching of micro-blog Internet advertising and users. The result is shown in Figure 2.

Figure 2. The operation result of naive Bayesian classification

3.5 Establishment of the User-Item Matrix

According to the results of the naive Bayesian classification, the micro-blog network advertising that passes the rough classification matching is recommended to the corresponding users; the users may thumb up, comment on, or retweet the recommended micro-blog advertising, and according to this behavioral feedback a corresponding user-item evaluation matrix is set up. Taking the behavioral feedback of five ordinary users to ten recommended micro-blog advertisements as an example, the rating matrix is shown in Table 1.

Table 1. Sample of user-item rating matrix

U \ I   I1   I2   …   Ii   …   I10
U1      3    2    …   0    …   1
U2      0    1    …   2    …   2
U3      2    3    …   3    …   1
U4      1    2    …   3    …   3
U5      3    0    …   2    …   0

3.6 Recommendation Algorithm Based on User Clustering

This algorithm is conducted on the basis of user clustering, and it needs to find each user's nearest neighbors, thus forming a cluster. The top N advertising micro-blogs that a user may be interested in are then recommended; here we set N to 8.
The main process of the function calculating the similarity between users is shown in Figure 3.
The main flow of the function ranking the users' similarity matrix is shown in Figure 4.

Figure 3. Flow chart of similarity function   Figure 4. Flow chart of similarity matrix ranking

The main process of the function predicting user i's degree of interest in item j is shown in Figure 5.
The operation result of the collaborative filtering algorithm based on user clustering is shown in Figure 6.

Figure 5. Flow chart of predicting the degree of user’s interest

Figure 6. Collaborative filtering based on user clustering

4 Conclusion

The precision of recommendation in the early stages of micro-blog marketing is of great importance; it requires a personalized recommendation to accurately predict the audience and to recommend to users the products that meet their interest preferences. Firstly, this paper discusses enterprise brand and product marketing issues and, combined with personalized recommendation algorithms, proposes a marketing recommendation pattern based on the micro-blog platform. This pattern is committed to tailoring personalized recommendations and services for different users and, through the analysis and prediction of each user's interest, recommending more accurately the people or goods that users might be interested in. The naive Bayesian classification algorithm is then used to roughly classify micro-blog network advertising and micro-blog users, so that advertising close to a user's preferences is recommended. Next, on the basis of the rough classification matching results, this paper proposes a collaborative filtering recommendation algorithm based on user clustering: according to the users' feedback on the recommended results, the user-item evaluation matrix is established, unknown item ratings are predicted, and the recommendation matrix is obtained. Finally, the naive Bayesian classification algorithm and the collaborative filtering algorithm are combined into a hybrid recommendation algorithm, which makes the results more accurate. This hybrid strategy can effectively reduce the computational load of the collaborative filtering algorithm, improve the overall efficiency of the algorithm and, to a certain extent, alleviate the problems of data sparsity and cold start. The hybrid recommendation algorithm can help enterprises find their target users more accurately, making the recommendation results more precise and enhancing the user experience.

Acknowledgment: The work in this paper is partially supported by the Scientific Research Plan Project of the Education Department of Hubei, China, under Grant No. B2015360, and partially supported by the Humanities and Social Science Research Plan Project of the Education Department of Hubei, China, under Grant No. 16G250.

References
[1] S. Huang, J. Sun, X. Wang, H. Zeng and Z. Chen. Subjectivity Categorization in Weblog Space using Part-Of-Speech based Smoothing. In Proceedings of the 6th International Conference on Data Mining.
[2] R. Kumar, J. Novak, P. Raghavan, and A. Tomkins. Structure and Evolution of Blogspace. Commun. ACM, 47(12):35-39, 2010.
[3] J.D. Lasica, Weblogs: A New Source of Information. In We've got blog: How weblogs are changing our culture, John Rodzvilla (ed). Perseus Publishing, Cambridge, MA, 2012.
[4] F. Sebastiani, "Machine Learning in Automated Text Categorization", ACM Computing Surveys, Vol. 34, No. 1, pp. 1-47, March 2012.
[5] J. Bar-Ilan. An Outsider's View on "Topic-oriented" Blogging. In Proceedings of the Alt. Papers Track of the 13th International Conference on World Wide Web, papers 28-34, May 2013.
[6] K.T. Durant and M.D. Smith. Mining Sentiment Classification from Political Web Logs. In Proceedings of the Workshop on Web Mining and Web Usage Analysis of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (WebKDD-2012). August 2012.
Zhong GUO*, Nan LI
Fuzz Testing based on Sulley Framework
Abstract: Currently, fuzz testing is a security testing method that enforces automated exploration of software security vulnerabilities through a fuzzifier. A fuzz testing framework can provide the developers of fuzzifiers with a fast, flexible and reusable development environment; at the same time, this environment is isomorphic. In this article, the authors analyze the structure, characteristics and advantages of the fuzz testing framework Sulley, introduce the testing process based on the Sulley framework and illustrate it with a case analysis, which provides a reference for security testing.

Keywords: Software Testing; Fuzz Testing; Sulley; Security Testing

1 Concept of fuzz testing framework

Fuzz testing is a kind of automated software testing technology based on defect injection [1]. Random input, sometimes selected purposefully, is offered to the target programs through fuzzifier utilities, and problems are then exposed by observing the subsequent responses. Nowadays, a number of usable, specialized fuzz testing utilities already exist, which serve a large number of common open protocols as well as file formats. These fuzzifiers can undoubtedly test an appointed protocol thoroughly, and they can also be extended to stress testing of different programs supporting the specific protocol. Although these fuzzifiers are effective for widely used targets, in practice we still usually need to implement more specialized and thorough fuzz testing for protocols that are private and have never been tested before. At this point, a fuzz testing framework becomes very useful.
Some of the available fuzz testing frameworks are written in C, while others are written in Python or Ruby. Some frameworks are written in an existing development language, while others are realized through a custom language. Certain frameworks abstract the generation of data, but others do not. What is more, a few frameworks are object-oriented and open, while the great majority can only be applied by their developers in most cases. However, all fuzz testing frameworks aim to offer a fast, flexible, reusable and isomorphic development environment to fuzzifier developers, and the final goal is completely the same [2].

*Corresponding author: Zhong GUO, No.95899 Unit, PLAAF, E-mail: 18803806@qq.com


Nan LI, No.95899 Unit, PLAAF, Beijing, China

2 Architecture of sulley framework

As shown in Figure 1, the Sulley framework consists of four parts: data generation, session management/driver, agents and utilities. Besides, the request library is also an indispensable part of this framework; it is a collection of requests defined on the basis of the testing requirements.

Figure 1. Architecture of Sulley Framework

2.1 Data Generation

Data generation is composed of Primitives, Blocks, Legos and Requests. A Request is mainly responsible for the interactive request message exchanged with the target during fuzz testing, and it can be described in detail through the Primitive types offered by Sulley; s_static, s_random, s_binary, s_char, s_string and s_delim are the common data types. For better use and reuse, some Primitives can be grouped together and named for convenient reference later. In this way a Block is formed, and the whole process is just like stacking blocks. These blocks finally constitute the Request.
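
As a brief illustration (a minimal sketch, not taken from the paper's experiment), a Request for a hypothetical "login" command could be assembled from these primitives like this:

from sulley import *

# A hypothetical "login" request: the keyword stays fixed, the user name is fuzzed.
s_initialize("login")
if s_block_start("login"):
    s_static("LOGIN")          # keyword that is never mutated
    s_delim(" ")               # delimiter, mutated with delimiter-specific test cases
    s_string("alice")          # string primitive, replaced by Sulley's fuzz strings
    s_static("\r\n")
s_block_end()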

2.2 Session Management/ Driver

Defining all the Requests, describing their relations and drawing the Request diagram in advance are essential in protocol fuzz testing. Sulley then traverses all routes from Root to End in accordance with the Request diagram.

2.3 Agents

Since a target-program judgment error usually occurs after a crash during fuzz testing, the subsequent testing process would be interrupted. For more automated fuzz testing, Sulley offers three Agents to control a VMware virtual machine, monitor the process and capture the network traffic respectively.

2.4 Utilities

Sulley offers a series of utilities, such as crashbin_explorer, ida_fuzz_library_extender, pcap_cleaner and so on. All these utilities are helpful for analyzing the causes of errors after a crash [3].

3 Case study of sulley

Generally speaking, the main process of fuzz testing based on Sulley can be summarized as in Figure 2, namely: analyzing the target protocol and understanding the session process; constructing the protocol packages according to the session process of the target protocol; connecting the testing target and the Agent offered by Sulley; writing the Python scripts and constructing the Session; and starting the fuzzing and checking the result.

Figure 2. Process of fuzz testing based on Sulley

Next, an example is given to fully introduce the usage of Sulley as well as its basic functions. In this experiment, we assume that the testing target is an SMTP server whose port is bound to TCP 25. The readme of Sulley offers a complete fuzz testing walk-through, including a full analysis of the core dump after a program crash.
For this example, the SMTP session can be simplified as in Figure 3, which means that EHLO, MAIL FROM, RCPT TO, DATA, ……, QUIT interact with the server in command order.

Figure 3. Session of SMTP

Fuzz testing the server's SMTP session implementation is the aim of the whole testing process, so interaction between the mail server and all kinds of data generated randomly for each command is an indispensable step. We usually call the data package generated for each command a Request. In Sulley, a Request is constructed from blocks, and each block is composed of Primitives, which are basic data types such as integers, characters, random numbers/strings and so on.
These Requests need to be organized into one Session in a reasonable order after they are all completed. Only in this way can Sulley interact with the mail server according to the definition of the Session. In short, each Request can be imagined as a node, and the Session defines the directed connections among nodes. For example, "A→B" means that Request B follows only when Request A is done and the related response has been received. It is worth noting that registering a callback function is helpful, as it guarantees appropriate treatment before a request is sent or after a response is received.
The design above is displayed in Figure 4. Among the nodes, "ehlo" and "helo" can be regarded as starting commands, and the blue nodes are callback functions.
After defining the Session, we need to write the Python script code. The specific code is illustrated in Figure 5. The whole testing process can be checked through the console after running the script program (as shown in Figure 6). It can also be seen through the built-in web interface of Sulley, whose port number is 26000 (as shown in Figure 7).
and the port number was 26000 (as shown in Figure 7).

Figure 4. Definition of Session


# import all of Sulley's functionality.
from sulley import *

# does nothing but save greeting message
def get_greeting_msg(sock):
    greet_message = sock.recv(10000)
    session.log("Greeting Message -->%s" % greet_message, 2)

def callback(session, node, edge, sock):
    session.log("Date sent -->%s" % node.render(), 2)

s_initialize("helo")
if s_block_start("helo"):
    s_static("helo ")
    s_delim(" ")
    s_static("test.com")
    s_static("\r\n")
s_block_end()

s_initialize("ehlo")
if s_block_start("ehlo"):
    s_static("ehlo ")
    s_delim(" ")
    s_random("xxx.com", 5, 10)
    s_static("\r\n")
s_block_end()

s_initialize("mail from")
if s_block_start("mail from"):
    s_static("mail from: ")
    s_delim(" ")
    s_delim("<")
    s_static("haha@ims.com")
    s_delim(">")
    s_static("\r\n")
    #s_random('\x09\x78', min_length=101124, max_length=102219)
s_block_end()

s_initialize("rcpt to")
if s_block_start("rcpt to"):
    s_static("RCPT TO")
    s_delim(":")
    s_static("alice@test.com")
    s_static("\r\n")
s_block_end()

s_initialize("pre_data")
if s_block_start("pre_data"):
    s_static("DATA\r\n")
s_block_end()

s_initialize("data_content")
if s_block_start("data_content"):
    s_static("Received:")
    s_string("Whatever")
    s_static("\r\n")
    s_static("Subject:")
    s_string("GOGOGOA"*2)
    s_static("\r\n")
    s_static("\r\n")
    s_string("haha")
    s_static("\r\n.\r\n")
s_block_end()

sess = sessions.session(log_level=100)
target = sessions.target("127.0.0.1", 25)
sess.add_target(target)
sess.connect(sess.root, s_get("helo"), callback)
sess.connect(sess.root, s_get("ehlo"), callback)
sess.connect(s_get("helo"), s_get("mail from"), callback)
sess.connect(s_get("ehlo"), s_get("mail from"), callback)
sess.connect(s_get("mail from"), s_get("rcpt to"), callback)
sess.connect(s_get("rcpt to"), s_get("pre_data"), callback)
sess.connect(s_get("pre_data"), s_get("data_content"), callback)
sess.fuzz()

Figure 5. Python Script Code



Figure 6. Testing Process Shown on Console

Figure 7. Testing Process Shown through the Web Port of Sulley

When the mail server exits abnormally during the testing process, the script stops because of a request timeout. ProcMon and VMControl are helpful if the testing needs to be continued, because they help to restart the target program or roll back the virtual machine. NetMon is helpful if the traffic during testing needs to be recorded for post-mortem analysis.

4 Conclusion

A fuzz testing framework offers a flexible, reusable and isomorphic development environment for error testers and the QA team. The Sulley framework is one of the newest members of the expanding fuzz testing framework family, and it surpasses other frameworks in functionality. Generating data, monitoring the network communication, maintaining the related records systematically, monitoring the status of the target application and performing recovery are its common functions. Besides, the fuzz testing framework can test, track and classify the discovered errors and run further fuzz testing on them. To sum up, the authors have made a detailed analysis of the fuzz testing framework Sulley in this article and have introduced the security testing process based on it.

References
[1] GORBUNOV S, ROSENBLOOM A. AutoFuzz: automated network protocol fuzzing framework[J]. International Journal of Computer Science and Network Security, 2010, 10(8): 239-245.
[2] SHUANG Kai, WANG Si-yuan, ZHANG Bo. IMS security analysis using multi-attribute model[J]. Journal of Networks, 2011, 6(2): 263-271.
[3] TAKANEN A, DeMOTT J, MILLER C. Fuzzing for Software Security Testing and Quality Assurance[M]. Norwood, MA: Artech House, 2008: 22-32.
Jun ZHANG, Shuang ZHANG, Jian LIANG, Bei TIAN, Zan HOU*,
Bao-zhu LIU

A Risk Assessment Strategy of Distribution Network


Based on Random Set Theory
Abstract: In view of the complexity and uncertainty of fault information in
distribution network risk assessment, a representation and modeling method of
multisource information based on random set theory is proposed. In this method,
random variables describing parameters are converted to their random set form, and
the belief function and plausibility function of random set are used to obtain the upper
and lower cumulative probability distributions of risk indices. Thus, the probability
range of risk can be derived. The analysis about a typical radial distribution network
shows that the proposed method is reasonable and effective.

Keywords: distribution network; multisource information; random set; upper and


lower probability; risk assessment

1 Introduction

The purpose of risk assessment for a distribution network is to obtain quantitative indices of uncertainty and security according to the operational state of the power grid within a short time in the future, which depends on the probability characteristics of its behavior [1,2]. The probability and the consequences of outages are both integrated in risk assessment, so the security level of power grid operation can be evaluated more comprehensively. In the short-term risk assessment process, the risk indices are affected by factors such as operational conditions and equipment parameters, so a suitable stochastic process model is needed to describe the random behavior of components over the future evaluation period.
In the current evaluation theory, the component reliability parameters are typically estimated as average values from historical data [3], which reflect the long-term operation of the equipment, so a system risk assessment affected by both historical and future operational conditions cannot be described.
In the risk assessment of distribution networks, there are many influencing factors, for example the complexity of the devices themselves and the uncertainty of the operational

*Corresponding author: Zan HOU, School of Electrical and Electronic Engineering, North China Electric
Power University, Beijing, China, E-mail: zan_hou@163.com
Jun ZHANG, Shuang ZHANG, Jian LIANG, Bei TIAN, Electric Power Research Institute, State Grid Ning-
xia Electric Power Company, Yinchuan, China
Bao-zhu LIU, School of Electrical and Electronic Engineering, North China Electric Power University,
Beijing, China

environment, and how to realize the unified representation and amalgamation of multisource information is a current research hotspot. Random set theory [4] is expected to solve this problem. Random set theory is an important new branch of mathematics which combines traditional probability and set theory, and it is an extension of random variables; its research objects are sets of elements among the possible results. In [5], the fault sample model and the information to be detected are expressed as random sets, and the belief function and plausibility function described by the random set can be obtained after matching, followed by decision making and diagnosis. Remote sensing images, rainfall and other multivariate information are analyzed and integrated through random set calculations in [6]; the cumulative probability distribution function of a single risk level can then be obtained, the flood risk level of a certain area can be evaluated according to a certain judgment criterion, and the probability of the risk can be derived more precisely.
In this paper, the random set form of random variables and the probability model of risk assessment based on random set theory are used to produce a simple and flexible method for judging the risk probability. A double distribution line is taken as an example to illustrate the effectiveness of the method.

2 Random Set Theory

2.1 Concept of Random Set

Suppose $(\Omega, \mathcal{F}, P)$ is a probability space and $(\Theta, \mathcal{B}_\Theta)$ is a measurable space; then the set-valued mapping [7] can be defined as $X : \Omega \to 2^{\Theta}$. The symbol $\mathcal{F}$ is a $\sigma$-field of $\Omega$, and $\mathcal{B}_\Theta$ is a $\sigma$-field of $\Theta$. When $T \in \mathcal{B}_\Theta$, the upper inverse, lower inverse and inverse can be expressed as follows.

$X^{*}(T) = \{\omega \in \Omega : X(\omega) \cap T \neq \emptyset\}$  (1)

$X_{*}(T) = \{\omega \in \Omega : X(\omega) \subseteq T\}$  (2)

$X^{-1}(T) = \{\omega \in \Omega : X(\omega) = T\}$  (3)

Define an operator $j : 2^{\Theta} \to 2^{\Omega}$ such that $\forall T \in \mathcal{B}_\Theta$, $j(T) = X^{-1}(T) \subseteq \Omega$. If $j(T) \neq \emptyset$, then $T$ is called a focus set of $j$, and the focus sets satisfy the following properties.

$\bigcup \{j(T) : T \subseteq \Theta\} = \Omega$

$T_1, T_2 \in \mathcal{B}_\Theta,\ T_1 \neq T_2 \Rightarrow j(T_1) \cap j(T_2) = \emptyset$

If the set-valued mapping $X$ given above is strongly measurable, which means that $\forall T \in \mathcal{B}_\Theta$, $X^{*}(T) \in \mathcal{F}$, then $X$ is a random set. $\forall T \in \mathcal{B}_\Theta$, the upper and lower probability can be denoted as follows.

$P^{*}(T) = P(T^{*}) / P(\Theta^{*})$  (4)

$P_{*}(T) = P(T_{*}) / P(\Theta_{*})$  (5)

If the random set $X$ is $\mathcal{F}$–$\mathcal{B}_\Theta$ measurable and $P$ is the probability measure on $\mathcal{F}$, then the probability measure of $X$ on $\mathcal{B}_\Theta$ can be obtained as follows.

$P_X(T) = P X^{-1}(T) = P(\omega \in \Omega : X(\omega) = T)$  (6)

$P(\omega : X(\omega) = \emptyset) = 0, \quad \emptyset \in \mathcal{B}_\Theta$  (7)

Let $X$ and $Y$ be two random sets defined on the same space. With respect to any $\omega \in \Omega$, the operations of intersection, union and complement can be expressed as follows.

$(X \cap Y)(\omega) = X(\omega) \cap Y(\omega)$

$(X \cup Y)(\omega) = X(\omega) \cup Y(\omega)$

$(X^{c})(\omega) = [X(\omega)]^{c}$

2.2 Confidence Representation and Random Relation

Suppose $(F, m)$ denotes a random set defined on $\Theta$, where the domain $\Theta$ is a nonempty finite set. The symbol $F$ is a collection of sets constructed from the nonempty subsets of $\Theta$. Define $m$ as a mapping $m : F \to [0,1]$ with $\sum_{A \in F} m(A) = 1$. Then $F$ is the support of the random set $(F, m)$, and $m$ is a mass function, which is also called a basic probability assignment (BPA). Any $A \in F$ with $m(A) > 0$ is called a focus element of $(F, m)$.
Suppose the random set $(F, m)$ corresponds to the belief function Bel in DS evidence theory [8]; the relation between them is given in (8), and the plausibility function is defined in (9).

$Bel(A) = \sum_{B \subseteq A} m(B)$  (8)

$Pl(A) = 1 - Bel(A^{c}) = \sum_{A \cap B \neq \emptyset} m(B)$  (9)

The relation $\Theta = \Theta_1 \times \cdots \times \Theta_n$ is defined as an n-dimensional Cartesian product, and the random set $(F, m)$ is a random relation defined on $\Theta$. Suppose $(F_k, m_k)$ is the projection of the random relation $(F, m)$ on $\Theta_k$, which is called the marginal random set, where $k$ ranges from 1 to $n$.
When $\forall A \in F$, $A = C_1 \times \cdots \times C_n$ and $m(A) = m_1(C_1) \times \cdots \times m_n(C_n)$, $(F, m)$ is called a decomposable Cartesian product random relation [9]. Moreover, the marginal random sets $(F_1, m_1), \ldots, (F_n, m_n)$ are independent of each other.
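
To illustrate (8) and (9), the following is a minimal Python sketch (not part of the paper) that computes Bel and Pl over a finite frame, with focus elements represented as frozensets and the mass function as a dictionary:

def bel(A, mass):
    """Belief of set A: total mass of focus elements contained in A, as in (8)."""
    A = frozenset(A)
    return sum(m for B, m in mass.items() if B <= A)

def pl(A, mass):
    """Plausibility of set A: total mass of focus elements intersecting A, as in (9)."""
    A = frozenset(A)
    return sum(m for B, m in mass.items() if B & A)

# Example: frame {a, b, c} with three focus elements.
mass = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frozenset("abc"): 0.2}
print(bel({"a", "b"}, mass), pl({"a", "b"}, mass))   # 0.8 1.0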

2.3 Extension Principles

Suppose $\xi = (\xi_1, \ldots, \xi_n)$ is a variable defined on $\Theta = \Theta_1 \times \cdots \times \Theta_n$, and let $\zeta = f(\xi)$, $\zeta \in V$, $f : \Theta \to V$. The purpose of the extension principles obtained from [10] is to achieve a transformation from the random relation $(F, m)$ of $\xi$ to the random set $(R, \rho)$ of $\zeta$. The transformation can be denoted as follows.

$R = \{R_j = f(A_i) \mid A_i \in F\}$

$\rho(R_j) = \sum \{m(A_i) \mid R_j = f(A_i)\}$

$f(A_i) = \{f(u) \mid u \in A_i\}, \quad i = 1, \ldots, M$

The symbol $M$ is the number of elements in $F$. The measure $m(A_i)$ of any nonempty subset $A_i$ of $\Theta$ is delivered to the measure $\rho(R_j)$ of the subset $R_j$ of $V$ through the mapping $f$, and it is obvious that multiple focus elements $A_i$ may correspond to the same image $R_j$.

3 Random Set Method for Risk Assessment

The random set method for risk assessment of a distribution network can be derived from the definitions of random set theory given above. The detailed steps are as follows.

3.1 Definition of the Random Relation

In the risk assessment model, the symbol $\xi_k$ belongs to the interval $I_k$, where $k$ ranges from 1 to $n$. If $\Theta = I_1 \times \cdots \times I_n$, then $\xi = (\xi_1, \ldots, \xi_n) \in \Theta$. Suppose $F = \Theta$ and $m = p_\xi$, which means that the focus elements of $F$ are single values and $p_\xi$ is a joint probability mass function of $\xi$. Suppose $F' = \{A_i, i = 1, \ldots, M\}$ is a partition of $\Theta$, which means that each $I_k$ is divided into $d_k$ subintervals. Then the symbol $A_i$ can be expressed as

$A_i = C_1^{i} \times C_2^{i} \times \cdots \times C_n^{i}$.

The symbol $C_k^{i}$ is a subinterval of $I_k$, and the number of elements in $F'$ is $M = \prod_{k=1}^{n} d_k$. Suppose $m'(A_i) = \sum_{u \in A_i} p_\xi(u)$. If $\xi$ is a continuous random variable, then $p_\xi(u)$ is a joint probability density function and $m'(A_i)$ is defined in (10).

$m'(A_i) = \int_{A_i} p_\xi(u)\, du$  (10)

3.2 Calculation of the Image

Using the extension principles of random sets, after constructing $(F', m')$ the image can be expressed as follows.

$R' = \{R_j = f(A_i) \mid A_i \in F'\}$

$f(A_i) = \{f(u) \mid u \in A_i\}$

$\rho'(R_j) = \sum \{m'(A_i) \mid R_j = f(A_i)\}$

3.3 Construction of the Upper and Lower Probability

For $\forall R \in R'$, the belief function $Bel'$ and plausibility function $Pl'$ of $(R', \rho')$, derived from the confidence representation of random sets, can be defined as in (11) and (12).

$Bel'(R) = \sum_{Q \subseteq R} \rho'(Q) = \sum_{Q \subseteq R} \sum_{Q = f(A_i)} m'(A_i)$  (11)

$Pl'(R) = \sum_{Q \cap R \neq \emptyset} \rho'(Q) = \sum_{Q \cap R \neq \emptyset} \sum_{Q = f(A_i)} m'(A_i)$  (12)

If the probability of $\zeta \in R$ is $P(R)$, then $Bel'(R)$ and $Pl'(R)$ are the lower and upper boundaries of $P(R)$, that is, $Bel'(R) \le P(R) \le Pl'(R)$. The upper and lower cumulative probability distributions [11] of $\zeta$ are given in (13) and (14).

$F^{*}(y) = Pl((-\infty, y]) = \sum \{\rho(f(A_i)) \mid y \ge \inf(f(A_i))\}$  (13)

$F_{*}(y) = Bel((-\infty, y]) = \sum \{\rho(f(A_i)) \mid y \ge \sup(f(A_i))\}$  (14)

The symbols inf and sup are the lower and upper bound operators respectively. The band between the upper and lower probability distribution curves therefore contains the true probability distribution curve, which can be expressed as

$F_{*}(y) \le F(y) \le F^{*}(y)$.

3.4 Analysis of the Probability of Risk Indices

Suppose $D = [y_A, y_B]$ is the value range of a risk index; then the probability of $\zeta \in D$ can be bounded as follows.

$P(D) = F(y_B) - F(y_A)$

$P_{*}(D) = F_{*}(y_B) - F^{*}(y_A)$

$P^{*}(D) = F^{*}(y_B) - F_{*}(y_A)$

$P_{*}(D) \le P(D) \le P^{*}(D)$

Thus, the risk index level of the distribution network can be judged by the formulations given above.
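
The following is a minimal sketch (an illustration, not the authors' code) of Sections 3.2 to 3.4 for the case where every image f(A_i) is an interval, as in the case study below; each focus element is kept as a (lower, upper, mass) triple.

def upper_cdf(y, images):
    """F*(y) as in (13): total mass of images whose infimum does not exceed y."""
    return sum(m for lo, hi, m in images if y >= lo)

def lower_cdf(y, images):
    """F_*(y) as in (14): total mass of images whose supremum does not exceed y."""
    return sum(m for lo, hi, m in images if y >= hi)

def probability_bounds(y_a, y_b, images):
    """Lower and upper probability that the risk index falls in D = [y_a, y_b]."""
    return (lower_cdf(y_b, images) - upper_cdf(y_a, images),
            upper_cdf(y_b, images) - lower_cdf(y_a, images))

Here `images` would hold one (inf, sup, rho') triple per focus element, for example the 144 intervals of Table 5 in the case study.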

4 Case Study

The graph of the double distribution line is shown in Figure 1. The beginning bus is connected with a large-capacity generator, so the voltage $U_1$ is assumed to be constant. The impedance of line 1 is supposed to be constant, while the impedance of line 2 is assumed to change depending on the fault type, such as short circuit or open circuit. The end bus is connected with an uncertain load. The constant parameters are $U_1 = 1.05$ and $R_1 + jX_1 = 0.1 + j0.2$. The value ranges of the parameters of line 2 and the output power are $R_2 + jX_2 = (0.1 + j0.2) \pm 10\%$ and $P_2 + jQ_2 = (1 + j0.5) \pm 10\%$. Suppose the parameters $R_2$, $X_2$, $P_2$ and $Q_2$ follow normal distributions and are independent of each other. We want to observe the variation law of the voltage $U_2$ under the condition of variable load and fault occurrence. It should be noted that the parameters given above are not necessarily subject to normal distributions and can also follow other distributions, depending on the specific situation. All the parameters given above are per-unit values, and the transverse component of the voltage drop is not considered in the calculation.

U1 U 2

G R1 + jX1

P2 + jQ2

R2 + jX 2

Figure 1. Distribution network with double lines

The detailed steps of the proposed method are listed as follows.

4.1 Determine the Function of the System

According to the calculation of the voltage drop, the formulation of the terminal voltage $U_2$ is given in (15).

$U_1 = U_2 + \frac{P_2 R_\Sigma + Q_2 X_\Sigma}{U_2} \;\Rightarrow\; U_2 = \frac{U_1}{2} + \sqrt{\frac{U_1^2}{4} - (P_2 R_\Sigma + Q_2 X_\Sigma)}$  (15)

Suppose $\xi_1$, $\xi_2$, $\xi_3$ and $\xi_4$ are defined as the following four random variables to describe the stochastic process, where $\mu$ and $\sigma$ are the mean value and the standard deviation of the variables.

$\xi_1 = P_2,\ \mu_{\xi_1} = 1,\ \sigma_{\xi_1} = 0.1/3$
$\xi_2 = Q_2,\ \mu_{\xi_2} = 0.5,\ \sigma_{\xi_2} = 0.05/3$
$\xi_3 = R_2,\ \mu_{\xi_3} = 0.1,\ \sigma_{\xi_3} = 0.01/3$
$\xi_4 = X_2,\ \mu_{\xi_4} = 0.2,\ \sigma_{\xi_4} = 0.02/3$
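
As a quick check of (15), the following sketch (added for illustration, assuming the two lines of Figure 1 are connected in parallel so that $R_\Sigma$ and $X_\Sigma$ are the separate parallel combinations of the resistances and reactances) evaluates the nominal operating point:

import math

def u2(p2, q2, r2, x2, u1=1.05, r1=0.1, x1=0.2):
    # Parallel combination of the two lines, resistance and reactance treated separately
    # because the transverse component of the voltage drop is neglected.
    r_sum = r1 * r2 / (r1 + r2) if (r1 + r2) else 0.0
    x_sum = x1 * x2 / (x1 + x2) if (x1 + x2) else 0.0
    drop = p2 * r_sum + q2 * x_sum
    return u1 / 2 + math.sqrt(u1 ** 2 / 4 - drop)    # equation (15)

print(round(u2(1.0, 0.5, 0.1, 0.2), 4))   # nominal values -> 0.9441

With one of the line-2 parameters at its short-circuit value (0), the drop vanishes and $U_2$ reaches $U_1 = 1.05$, which matches the upper endpoints listed in Table 5.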

4.2 Determine the Random Relation

Suppose the symbols $I_1$, $I_2$, $I_3$ and $I_4$ are the value intervals of the variables defined above: $I_1 = [0.9, 1.1]$, $I_2 = [0.45, 0.55]$, $I_3 = [0, 1000]$, $I_4 = [0, 1000]$. The number 0 denotes the short-circuit condition of the resistance $R_2$ or reactance $X_2$, and the number 1000 is used to denote the open-circuit condition. The four intervals are divided evenly into 4 or 3 subintervals respectively; the detailed divisions are expressed as follows.

$I_{1,j} = [u_{1,j}, u_{1,j+1})$, $j = 1,2,3,4$;  $I_{2,k} = [u_{2,k}, u_{2,k+1})$, $k = 1,2,3,4$;
$I_{3,q} = [u_{3,q}, u_{3,q+1})$, $q = 1,2,3$;  $I_{4,l} = [u_{4,l}, u_{4,l+1})$, $l = 1,2,3$.

Thus, the domain $\Theta = I_1 \times I_2 \times I_3 \times I_4$ is divided into 144 focus elements, and the set can be defined as

$F' = \{I_{1,j} \times I_{2,k} \times I_{3,q} \times I_{4,l} \mid j = 1,\ldots,4;\ k = 1,\ldots,4;\ q = 1,2,3;\ l = 1,2,3\}$.

Because the variables $\xi_1$, $\xi_2$, $\xi_3$ and $\xi_4$ are independent of each other, the joint basic probability assignment can be expressed as in (16).

$m'(I_{1,j} \times I_{2,k} \times I_{3,q} \times I_{4,l}) = m_1'(I_{1,j})\, m_2'(I_{2,k})\, m_3'(I_{3,q})\, m_4'(I_{4,l})$  (16)

The symbols $m_1'(I_{1,j})$, $m_2'(I_{2,k})$, $m_3'(I_{3,q})$ and $m_4'(I_{4,l})$ denote the basic probability assignments of the subintervals, and they are listed in Tables 1 to 4.

Table 1. The Basic Probability Assignment of $\xi_1$

I1,j   [0.9, 0.95)   [0.95, 1.0)   [1.0, 1.05)   [1.05, 1.1)
m1'    0.0655        0.4332        0.4332        0.0655

Table 2. The Basic Probability Assignment of $\xi_2$

I2,k   [0.45, 0.475)   [0.475, 0.5)   [0.5, 0.525)   [0.525, 0.55)
m2'    0.0655          0.4332         0.4332         0.0655

Table 3. The Basic Probability Assignment of $\xi_3$

I3,q   [0, 0.09)   [0.09, 0.11)   [0.11, 1000)
m3'    0.0013      0.9974         0.0013

Table 4. The Basic Probability Assignment of $\xi_4$

I4,l   [0, 0.18)   [0.18, 0.22)   [0.22, 1000)
m4'    0.0013      0.9974         0.0013
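
These assignments are simply the normal-distribution probability mass falling in each subinterval. A small sketch reproducing Table 1 (added for illustration) is:

from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def interval_bpa(edges, mu, sigma):
    """Probability mass of each subinterval [edges[i], edges[i+1]) for a normal variable."""
    return [normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
            for a, b in zip(edges, edges[1:])]

# xi_1 = P2 with mu = 1 and sigma = 0.1/3, divided as in Table 1.
print([round(p, 4) for p in interval_bpa([0.9, 0.95, 1.0, 1.05, 1.1], 1.0, 0.1 / 3)])
# -> [0.0655, 0.4332, 0.4332, 0.0655]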

4.3 Calculate the Image

It is obvious that the output $U_2$ and its partial derivatives are continuous on the domain $\Theta$, and $U_2$ is monotonically decreasing with respect to $P_2$, $Q_2$, $R_\Sigma$ and $X_\Sigma$ respectively. The focus elements of the image $(R', \rho')$ and the corresponding probability assignments can be derived according to the extension principles; the result is given in Table 5. The images of the different focus elements $A_i$ in $(F', m')$ are all different, so the result is

$\rho'(f(I_{1,j} \times I_{2,k} \times I_{3,q} \times I_{4,l})) = m'(I_{1,j} \times I_{2,k} \times I_{3,q} \times I_{4,l})$.

The curves of the upper and lower cumulative probability distributions $F_{up}(U_2)$ and $F_{low}(U_2)$ obtained from the random set method are shown in Figure 2. According to random set theory, these two curves contain the curve of the true probability distribution.

Table 5. Results of the Image $(R', \rho')$

j,k,q,l    Ri                   ρ'
1,1,1,1    (0.9558, 1.0500]     7.2505e-09
1,1,1,2    (0.9497, 0.9850]     5.5628e-06
1,1,1,3    (0.8177, 0.9748]     7.2505e-09
1,1,2,1    (0.9497, 0.9758]     5.5628e-06
1,1,2,2    (0.9446, 0.9613]     0.0043

(There are 144 focus elements in R'. Due to space limitations, only five are listed in this paper.)

Figure 2. Curves for upper and lower probability distribution of U2.

4.4 Assess the Risk Level

When the range of $U_2$ is taken as $D = [0.935, 0.955]$, the corresponding probabilities obtained from the random set method can be calculated as follows.

$F^{*}(0.935) = \sum \{\rho(f(A_i)) \mid 0.935 \ge \inf(f(A_i))\} = 0.3066$

$F_{*}(0.935) = \sum \{\rho(f(A_i)) \mid 0.935 \ge \sup(f(A_i))\} = 7.2505\mathrm{e}{-09}$

$F^{*}(0.955) = \sum \{\rho(f(A_i)) \mid 0.955 \ge \inf(f(A_i))\} = 0.9948$

$F_{*}(0.955) = \sum \{\rho(f(A_i)) \mid 0.955 \ge \sup(f(A_i))\} = 0.6883$

Then the upper and lower probabilities of the interval, and the relation between them, are

$P_{*}(D) = F_{*}(0.955) - F^{*}(0.935) = 0.3817$

$P^{*}(D) = F^{*}(0.955) - F_{*}(0.935) = 0.9948$

$P_{*}(D) = 0.3817 \le P(D) \le P^{*}(D) = 0.9948$.

The risk probabilities of low voltage or high voltage can also be evaluated when the range of $U_2$ is taken as the corresponding voltage interval. The more detailed the division of $\Theta = I_1 \times I_2 \times I_3 \times I_4$, the higher the accuracy. When the four intervals are each divided evenly into 20 subintervals, smoother curves are obtained, as shown in Figure 3.

Figure 3. Curves for probability distribution of U2 with more divisions

5 Conclusion

Based on random set theory, a simple and flexible risk assessment method for distribution networks is proposed in this paper. In this method, the random variables describing the parameters are converted to their random set form, and the belief function and plausibility function of the random set are used to obtain the upper and lower cumulative probability distributions of the risk indices. A double distribution line is taken as an example to illustrate that good evaluation results can be obtained with a small amount of calculation.
In addition, considering the uncertainty of information, some parameters in the system are random variables while the distributions of other parameters are unknown. The unified representation of all such information, for example random and fuzzy information, using random sets needs further study; this will be the next research direction.

References
[1] Li Wenyuan, Risk Assessment of Power Systems: Models, Methods and Applications. Beijing:
Science Press, 2005: 11-36(in Chinese).
[2] Feng Yongqing, Wu Wenchuan, and Zhang Boming, “Power System Operation Risk Assessment
Using Credibility Theory,” IEEE Trans on Power Systems, 2008, 23(3): 1309-1318.
[3] BILLINTON R and ALLAN R N, Reliability Evaluation of Engineering Systems: Concepts and Techniques. New York, NY, USA: Plenum Press, 1983.
[4] Xu Xiaobin, Wen Chenglin, and Liu Rongli, “The Unified Method of Describing and Modeling
Multisource Information Based on Random Set Theory,” Acta Electronica Sinica, 2008, 26(6):
1-7.
[5] Miao Rui, Chen Guochu, Li Yue, Xu Yufa, and Yu Jinshou, “A Wind Turbine Fault Diagnosis
Method Based on Vague Evidence of Random Set,” Automation of Electric Power Systems, 2012,
07: 22-26.

[6] Xie Yajuan, “Research on Multi-source Information Fusion and Uncertainty Modeling of Flood
Risk Assessment,” Huazhong University of Science and Technology, 2012.
[7] Han Chongzhao, Zhu Hongyan, and Duan Zhansheng. Multisource Information Fusion. Beijing:
Tsinghua University Press, 2006(in Chinese).
[8] Shafer G. A Mathematical Theory of Evidence. Princeton: Princeton University Press, 1976.
[9] Kohlas J, “Modeling Uncertainty with Belief Functions in Numerical Models,” Rep. No. 141,
Institute for Automation and Operations Research, University of Fribourg, Switzerland, 1987.
[10] Dubois D, and Prade H, “Random Sets and Fuzzy Interval Analysis,” Fuzzy Sets and Systems,
1991, 42: 87-101.
[11] Tonon F, and Bemardini A, “A Random Set Approach to the Optimization of Uncertain
Structures,” Computers and Structures, 1998, 68(6): 583-600.
Rui LIU*, Bo-wen SUN, Bin TIAN, Qi LI
A Software Homology Detection based on BP Neural
Network
Abstract: Software plagiarism is nowadays a widespread and uncurbed problem, which has led to the development of software homology detection. However, detecting software homology accurately is hard, as software has complex structures and various coding styles. To tackle this problem, in this paper we propose a homology detection method based on programming style which uses a BP neural network for training and classification. We selected the features of programming style elaborately, based on prior research and experience, so that the features are highly representative. The BP neural network we designed has sufficient ability to learn fast and classify with high accuracy. An experiment confirms that the method is effective for detecting homologous software and can achieve a satisfactory outcome. It demonstrates that the method is not only easy to operate but also relatively accurate, exceeding some other work.

Keywords: homology detection, feature, programming style, BP neural network

1 Introduction

The software engineering field has played a significant part in the development of the computer industry. Due to the enormous economic benefit created by software, people have begun to pay more and more attention to the value of software [1]. However, in spite of the rapid development of software, code plagiarism has always been a widespread and serious problem, and it matters not only for software engineering but also for other computer fields and for academia. How to protect the intellectual property rights of software and how to safeguard the rights and interests of software engineers are therefore urgent questions.
So far, many useful technologies have been proposed to prevent and detect source code plagiarism, such as software birthmarks, anti-reverse engineering and software homology detection [2]. A common problem among them is that the detection methods are difficult to implement and do not have good universality across programming languages. Mohamed and

*Corresponding author:Rui LIU, School of CyberSpace Security, Beijing University of Posts and
Telecommunications, Beijing, China, E-mail: 281097756@qq. com
Bo-wen SUN, School of CyberSpace Security, Beijing University of Posts and Telecommunications,
Beijing, China
Bin TIAN, China Information Technology Security Evaluation Center, Beijing, China
Qi LI, Beijing University of Posts and Telecommunications, Beijing, China

Naliah introduced a method that combines attribute-based and structure-based comparison to detect the similarity of Java programming assignments [3]. Suying Yang proposed a method to identify similar C code based on a weighted attribute eigenvector [4]. Although they all obtain good results for software homology detection, these methods have significant limitations which prevent them from being used widely. Seeking an innovative and unconventional approach, Seo-Young Noh introduced an XML-based model to detect similarities among programs that arise under plagiarism. It has some eye-catching highlights, but obtaining a result requires a wide range of knowledge and is very complicated to implement [5].
Homology detection technology for source code mainly includes three kinds of methods. One is text-based, which ignores the semantic characteristics of the source code itself, and this shortcoming is easy to exploit for plagiarism [6]. The next is based on the abstract syntax tree; it can better embody the structure of the programming logic, but it is complicated to carry out and may produce erroneous detections in theory [7,8]. The last one is based on programming style. It can determine well whether source codes are written by the same person, but it needs to be more automated. The two vital components of this method are feature selection and extraction, and the technique used for categorization [9]. Baayen proposed syntactic features including punctuation, and Chaski proposed idiosyncratic features including spelling errors and other usage mistakes, but these are too one-sided to provide enough features [10,11]. As for the classification techniques, many useful and effective methods have been introduced, such as support vector machines and principal component analysis, but it has not yet been proved which one is the best.
In this paper, we incorporate an abundant set of features of programming habits, consisting of syntactic features, structural features and so on [12]. These features apply to most programming languages and are reasonably representative. To solve the non-automation of feature extraction, we wrote a program to extract the different features from different codes, so no manual statistics are required. On this basis, we propose a BP neural network to help identify the author of code based on programming style. We use the BP neural network in supervised training to learn the samples we provide; after the sample training, a complete network is obtained. The BP neural network can then be tested with randomly selected test source codes.
The remainder of this paper is organized as follows. Section two describes the feature selection process. Section three describes training and learning with the BP neural network. Section four presents the experimental results of the homology detection and compares the experiment with expectations.
detection and makes comparison between the experiment and expectation.
 A Software Homology Detection based on BP Neural Network   201

2 Feature selection

After reading a large number of source codes in different languages, we summarized and analyzed the different styles of these codes and combined them with daily coding habits. Finally, we identified 15 features to distinguish different authors' programming habits. In order to make better statistics and simplify the classification introduced later, we put the 15 features into a multi-dimensional vector. Each dimension of this vector represents a different feature and takes the value 0 or 1, as shown in Table 1.

Table 1. The value vectors and descriptions of the 15 features

F1   Variable names using/not using underline                                              1/0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
F2   Variable names using/not using a camel type                                           0 1/0 0 0 0 0 0 0 0 0 0 0 0 0 0
F3   Using/not using a form of single-line comment to write an annotation                  0 0 1/0 0 0 0 0 0 0 0 0 0 0 0 0
F4   Using/not using a form of multi-line comment to write an annotation                   0 0 0 1/0 0 0 0 0 0 0 0 0 0 0 0
F5   Annotation is/is not an exclusive line                                                 0 0 0 0 1/0 0 0 0 0 0 0 0 0 0 0
F6   Coding using the tab/space indents                                                    0 0 0 0 0 1/0 0 0 0 0 0 0 0 0 0
F7   Operators having/not having a space on both sides                                     0 0 0 0 0 0 1/0 0 0 0 0 0 0 0 0
F8   Strings using single quotation marks/no strings                                       0 0 0 0 0 0 0 1/0 0 0 0 0 0 0 0
F9   Strings using double quotation marks/no strings                                       0 0 0 0 0 0 0 0 1/0 0 0 0 0 0 0
F10  Only/not only one statement in each line                                              0 0 0 0 0 0 0 0 0 1/0 0 0 0 0 0
F11  Coding having/not having a blank line to set off sections of logically related code   0 0 0 0 0 0 0 0 0 0 1/0 0 0 0 0
F12  Coding having/not having a space between parentheses and variable or function names   0 0 0 0 0 0 0 0 0 0 0 1/0 0 0 0
F13  Coding with nested relations all using correct indentation                            0 0 0 0 0 0 0 0 0 0 0 0 1/0 0 0
F14  After a comma having/not having a space                                               0 0 0 0 0 0 0 0 0 0 0 0 0 1/0 0
F15  Coding using/not using an abbreviated format like i+=1                                0 0 0 0 0 0 0 0 0 0 0 0 0 0 1/0

For statistical purposes, we extracted features and obtained a multi-dimensional vector from every three lines. Because the statistical work is too large to do by hand, we wrote a program to execute it. For better understanding, we give an instance: we took the source code in Figure 1 at random, and the multi-dimensional vectors of each group of lines are as follows:
Feature_vector_line8-10 = (1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0)
Feature_vector_line11-13 = (0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0)
Feature_vector_line14-16 = (0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0)
Feature_vector_line17-19 = (0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1)
Feature_vector_line20-22 = (0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0)

Figure 1. The code snippets which were selected randomly
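
A minimal sketch of such an extractor is shown below (added for illustration; it covers only a few of the 15 features, and the regular expressions are simplifications of the real checks):

import re

def window_features(lines):
    """Binary style features for one window of three source lines (subset of Table 1)."""
    text = "\n".join(lines)
    return {
        "F1_underscore_names":   int(bool(re.search(r"\b[a-z]+_[a-z_]+\s*=", text))),
        "F2_camel_case_names":   int(bool(re.search(r"\b[a-z]+[A-Z][A-Za-z]*\s*=", text))),
        "F6_tab_indent":         int(any(line.startswith("\t") for line in lines)),
        "F7_spaced_operators":   int(bool(re.search(r"\S [+\-*/=<>]=? \S", text))),
        "F14_space_after_comma": int(", " in text),
    }

def extract_vectors(source_lines, window=3):
    return [window_features(source_lines[i:i + window])
            for i in range(0, len(source_lines), window)]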


We define the feature vector as X = (x1, x2, ..., x15), and Vi as the number of three-line groups in which Fi = 1. The values of each dimension of the feature vector are then standardized accordingly.
3 BP neural network

An artificial neural network (ANN) is a nonlinear system which consists of a large number of simple, widely connected processing units and is used to simulate the human brain's neural system. It has the abilities of nonlinear mapping, adaptive learning and parallel information processing. Artificial neural networks take many different forms, of which the BP neural network is the most widely used. A BP neural network is a multi-layer perceptron which adopts the error back-propagation algorithm. It is mainly used for regression prediction and for classification and identification, and on this basis it can solve most of the problems faced by neural networks.
The BP neural network designed in this paper is composed of an input layer, a hidden layer and an output layer, with the nodes of each layer connected in a specific way. According to the Kolmogorov theorem, a three-layer network with one hidden layer, as long as the hidden layer has enough nodes, can approximate any nonlinear continuous function with arbitrary precision on a closed set. Therefore, the number of layers in the BP neural network designed in this paper is sufficient, and the network topology is shown in Figure 2.

Figure 2. The structure of BP neural network

The input vector of the input layer I is X = (x1, x2, ..., x15); the elements of the vector are the fifteen features extracted from the codes written by one author. The output vector of the hidden layer J is Y = (y1, y2, ..., yL), with

yj = f1(Σi wij·xi − θj).

The output vector of the output layer K is O = (o1, o2, ..., om), with

ok = f2(Σj wjk·yj − θk').

m is the number of authors who participate in the classification. In these expressions, wij and wjk are the weights between two layers, and θj and θk' are the threshold values of the individual neurons. The functions f1 and f2 are the neurons' activation functions. In our model, we assumed that we had P training samples, and the vector x(i) represents the input of sample i. Because a multilayer neural network based on linear activation functions is in essence a superposition of several linear functions, and the result is still a linear function, we did not choose linear activations. We chose the bipolar sigmoid function f(x) = (1 − e^(−x)) / (1 + e^(−x)) as f2, because we expected its output values to better distinguish between different categories.
To obtain a clear result, we defined the expected output vector as d(i), whose elements are 0 or 1. When d(i)k = 1, it indicates that the codes were written by author k. So we have the following relation among the expected outputs:

d(i)1 + d(i)2 + ... + d(i)m = 1

Next we describe the specific derivation of the BP algorithm used to obtain a complete network. The process uses batch training and was designed from a summary of previous work together with our own innovations.
First, we assumed that the error function of one sample is

E(i) = (1/2) Σk (d(i)k − ok(i))²

E(i) is the error function of one sample; in our paper, one sample consists of the features of an author's coding habits together with the identification (label) of the feature vector. Summing over all P samples, we obtained the overall error function:

E = Σi E(i),  i = 1, ..., P

Second, as an example, we calculated the weights between the output layer and the hidden layer and the threshold values between the hidden layer and the output layer. According to the gradient-descent algorithm, the BP algorithm updates the weights and biases in every iteration by the following rule, in which n represents the iteration count and σ is a learning rate whose value ranges from 0 to 1:

wjk(n+1) = wjk(n) − σ·∂E/∂wjk,   θk'(n+1) = θk'(n) − σ·∂E/∂θk'

Then we carried out the computation of the gradients:

Similarly,

To simplify these formulas, we defined:



So,

Third, we calculated the weights between the input layer and the hidden layer and the threshold values between the input layer and the hidden layer:

Similarly,

So,

Now the BP neural network has completed one forward propagation and one backward adjustment. This process is called a learning step, or an iteration. The BP algorithm needs many iterations to make the error converge to a preset accuracy, and the preset accuracy is chosen according to the actual learning situation. To prevent the case where the preset accuracy cannot be reached even after very many iterations, we defined an upper limit on the number of iterations.
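To make the batch update above concrete, the following is a minimal NumPy sketch of one iteration (forward propagation plus back-adjustment) under the squared-error function, assuming for illustration that the bipolar sigmoid is used in both layers; the array shapes, function names and learning rate are assumptions rather than the authors' implementation. The derivative of the bipolar sigmoid f is (1 − f²)/2, which is what the delta terms use.

import numpy as np

def bipolar_sigmoid(z):
    return (1.0 - np.exp(-z)) / (1.0 + np.exp(-z))

def train_step(X, D, w_ij, theta_j, w_jk, theta_k, sigma=0.1):
    # X: (P, 15) batch of feature vectors, D: (P, m) expected one-hot outputs,
    # sigma: learning rate in (0, 1)
    Y = bipolar_sigmoid(X @ w_ij - theta_j)               # hidden layer outputs, (P, L)
    O = bipolar_sigmoid(Y @ w_jk - theta_k)               # output layer outputs, (P, m)
    delta_k = (O - D) * (1.0 - O ** 2) / 2.0              # output-layer local gradients
    delta_j = (delta_k @ w_jk.T) * (1.0 - Y ** 2) / 2.0   # hidden-layer local gradients
    w_jk -= sigma * (Y.T @ delta_k)                       # gradient-descent weight updates
    theta_k += sigma * delta_k.sum(axis=0)                # threshold updates
    w_ij -= sigma * (X.T @ delta_j)
    theta_j += sigma * delta_j.sum(axis=0)
    return 0.5 * ((O - D) ** 2).sum()                     # total batch error E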
The contents above are only one part of the network. In order to design a complete BP neural network, there was other work to do. First, we needed to determine the number of nodes in the input layer and the output layer. In our model, the number of input layer nodes is fifteen, which is determined by the number of features, and the number of output layer nodes is determined by the number of authors. Then we needed to decide the number of hidden layer nodes. If this value is too small, the network does not have enough capacity to obtain information from the samples, and it cannot be trained to a good result. On the contrary, if the number is too large, the learning time may be too long and the final error may not be the best. So we should determine a proper number to build the best network.
First, empirical formulas for the optimum number of hidden layer nodes have been summarized from a large number of experiments. In our model, we adopt the following formula:

L = log2n

In this formula, L represents the number of hidden layer nodes and n is the number of input layer nodes. In order to achieve a better calculation result, we took 2·log2 n as the initial value of the number of hidden nodes.
Next, we introduce a method, based on cosine similarity, that can reasonably reduce the number of hidden nodes. We define yip as the output of node i in the hidden layer for training sample p, and yjp as the output of node j in the hidden layer for training sample p. After training on all samples, the correlation coefficient between yi and yj is

Rij = Σp (yip · yjp) / sqrt( (Σp yip²) · (Σp yjp²) )

If Rij approaches one, the correlation between yi and yj is high and the functions of the two nodes overlap to some extent, so we can merge the two nodes. This work is repeated until no nodes can be merged; at that point we obtain a proper value for the number of nodes in the hidden layer.
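A minimal sketch of this pruning step is given below; the 0.95 merging threshold and the function name are assumptions made purely for illustration, since the paper does not fix a threshold.

import numpy as np

def merge_redundant_hidden_nodes(Y, threshold=0.95):
    # Y: (P, L) matrix of hidden-node outputs over P training samples.
    # Repeatedly drop one node of any pair whose outputs are highly correlated
    # (cosine similarity above the threshold), and return the surviving indices.
    keep = list(range(Y.shape[1]))
    merged = True
    while merged:
        merged = False
        for a in range(len(keep)):
            for b in range(a + 1, len(keep)):
                yi, yj = Y[:, keep[a]], Y[:, keep[b]]
                r = yi @ yj / (np.linalg.norm(yi) * np.linalg.norm(yj) + 1e-12)
                if r > threshold:
                    keep.pop(b)
                    merged = True
                    break
            if merged:
                break
    return keep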

Through the work described above, the complete BP neural network can be obtained; the detailed procedure is shown in Figure 3.

Figure 3. The entire process of calculating a complete BP neural network

4 Experiment

For the purpose of acquiring an accurate result, we chose codes from the Internet at random and also selected codes written by the students around us and by ourselves. In the end, we obtained 330 kinds of codes from thirty authors, covering C, C++, Java, Python, C# and other programming languages. In view of the quantity of features and codes, we wrote a small program to extract the features of this large body of code and reduce the workload. Then, for each author, we obtained a feature vector representing their programming style through the feature extraction process. After that, we input the feature vectors of the different authors into the BP neural network for training, using the method introduced in the section on the BP neural network. Finally, we obtained a BP neural network in which the input layer has fifteen nodes, the hidden layer has nineteen nodes and the output layer has thirty nodes. Besides, the weights and the threshold values of the BP neural network were also calculated.

In order to test the accuracy of the classification by the BP neural network, we randomly conducted five tests, with each test including fifty kinds of codes. First, we extracted the features. Then we input the feature vectors into the BP neural network that had just been constructed. Finally, we compared the experimental results with the expected results; the comparison is shown in Table 2.

Table 2. The results of an experiment

       INPUT                                ACCURACY

TEST1  8 kinds of codes from author 3       83.67%
       7 kinds of codes from author 10
       11 kinds of codes from author 15
       6 kinds of codes from author 17
       9 kinds of codes from author 22
       9 kinds of codes from author 26

TEST2  5 kinds of codes from author 7       81.29%
       9 kinds of codes from author 9
       8 kinds of codes from author 13
       10 kinds of codes from author 21
       8 kinds of codes from author 27
       10 kinds of codes from author 30

TEST3  11 kinds of codes from author 1      75.94%
       10 kinds of codes from author 4
       11 kinds of codes from author 14
       11 kinds of codes from author 19
       7 kinds of codes from author 25

TEST4  11 kinds of codes from author 2      80.43%
       7 kinds of codes from author 6
       10 kinds of codes from author 11
       8 kinds of codes from author 16
       6 kinds of codes from author 23
       8 kinds of codes from author 29

TEST5  6 kinds of codes from author 5       82.97%
       11 kinds of codes from author 8
       8 kinds of codes from author 12
       10 kinds of codes from author 18
       7 kinds of codes from author 24
       8 kinds of codes from author 28

Based on the statistical results in the table above, it is apparent that after classification by the BP neural network, most of the input codes are correctly assigned to the expected authors. Although some results of the experiment are not yet ideal, the feature extraction and the BP neural network model are feasible in general.

5 Conclusion

With the vigorous development of software, detecting software homology is becoming critical in software engineering. On the basis of extensive research on existing software homology detection technology, we proposed a homology detection method based on programming styles. Drawing on daily programming training and books about programming style, we extracted 15 typical programming features which are suitable for most programming languages. Then we introduced a BP neural network which uses the extracted features and has a strong adaptive learning ability to classify codes by author. Experiments demonstrate that the method is generally effective in homology detection. It can serve as a reference for malware forensics and for resolving copyright disputes.

Acknowledgment: This work is supported by the National Natural Science Foundation of China (Project No. 61401038, No. 61302087, No. U1536119).

References
[1] Niklaus Wirth. A Brief History of Software Engineering [J]. IEEE Journals & Magazines, 2008,
30(3):32-39
[2] D. H. Qiu, H. Li, and J. L. Sun. Measuring Software Similarity based on Structure and Property of
Class Diagram[C]//2013 Sixth International Conference on Advanced Computational
Intelligence. 2013:75-80
[3] Mohamed El Bachir Menai, Nailah Salah Al-Hassoun. Similarity Detection in Java Programming
Assignments[C]// The 5th International Conference on Computer Science & Education.
2010:356-361
[4] Suying Yang, Xin Wang, Cheng Shao, Peng Zhang. Recognition on Source Codes Similarity
with Weighted Attributes Eigenvector[C]// International Conference on Intelligent Control and
Information Processing. 2010:539-543
[5] Noh Seo-Young, Gadia S K. An XML plagiarism detection model for procedural programming
languages[C]//Proceedings of the 2nd International Conference on Computer Science and its
Applications. 2004:320-326
[6] Liuliu Huang, Shumin Shi, Heyan Huang. A New Method for Code Similarity Detection. Progress
in Informatics and Computing. 2010:1015-1018
[7] Jingling Zhao, Kunfeng Xia, Yilun Fu, Baojiang Cui. An AST-Based Code Plagiarism Detection
Algorithm. [C]//2015 10th International Conference on Broadband and Wireless Computing,
Communication and Applications. 2015:178-182
[8] Gang Chen, Yuqing Zhang, Xin Wang. Analysis on Identification Technologies of Program Code
Similarity[C]// 2011 International Conference of Information Technology, Computer Engineering
and Management Sciences. 2011:188-191
[9] Ahmed Abbasi and Hsinchun Chen. Writeprints: A Stylometric Approach to Identity-Level Identification and Similarity Detection in Cyberspace. ACM Trans. Inf. Syst. 26, 2, Article 7 (March 2008), 29 pages. DOI = 10.1145/1344411.1344413 http://doi.acm.org/10.1145/1344411.1344413

[10] An experiment in authorship attribution. In Proceedings of the 6th International Conference on Statistical Analysis of Textual Data.
[11] CHASKI, C. E. 2001. Empirical evaluation of language-based author identification techniques.
Forensic Linguist. 8, 1, 1–65
[12] Weipeng WANG. An inventory optimization model based on BP neural network. [C]// 2011 IEEE
2nd International Conference on Software Engineering and Service Science. 2011:415-418
Ming LIU*, Hao-yuan DU, Yue-jin ZHAO, Li-quan DONG, Mei HUI
Image Small Target Detection based on Deep
Learning with SNR Controlled Sample Generation
Abstract: A small target detection method based on deep learning is proposed. First, random background parts are sampled from cloud-sky images. Then, randomly generated target spots are added to the backgrounds with a controlled signal-to-background-noise ratio (SNR) to generate target samples. Training and testing results show that the performance of deep nets is superior to traditional small target detection techniques, and that the selection of the sampling SNR has an important effect on the training performance of the nets. SNR = 1 is a good selection for deep net training, not only for small target detection but also for other applications.

Keywords: small target detection; Neural Network; Deep learning; SNR control

1 Introduction

Image small target detection plays a crucial role in infrared warning and tracking systems, which require that small targets against backgrounds such as cloud-sky or sea-sky be detected effectively. There have been many approaches to this problem.
Some papers provided methods by filtering or morphology. A high-pass template
filter was designed for real-time small target detection by Peng and Zhou [1]. In order
to detect an infrared small target, Yang et al. presented an adaptive Butterworth high-
pass filter (BHPF) [2]. Wang et al. provided a real-time small target detection method
based on the cubic facet model [3]. Hilliard put forward a low-pass IIR filter to predict
clutter [4]. Xiangzhi Bai, et al. used top-hat transformation on small target detection
[5]. Some other papers proposed classifier-based methods. Wang et al. proposed a detection method based on a high-pass filter and a least squares support vector machine [6]. Jiajia Zhao et al. designed a detection approach based on sparse representation.
In this paper, we propose an end-to-end deep learning solution for small target detection, which can be regarded as a classifier-based method. Experimental results show that our proposed method is robust and insensitive to background and target changes.
Nowadays, deep architectures with convolution and pooling are found to be highly
effective and commonly used in computer vision and object recognition [7-19]. The most

*Corresponding author: Ming LIU, Beijing Key Lab. for Precision Optoelectronic Measurement Instrument
and Technology, Beijing Institute of Technology, Beijing, China, E-mail: bit411liu@bit.edu.cn
Hao-yuan DU, Yue-jin ZHAO, Li-quan DONG, Mei HUI, Beijing Key Lab. for Precision Optoelectronic
Measurement Instrument and Technology, Beijing Institute of Technology, Beijing, China

impressive result was achieved in the 2015 ImageNet contest, whose training set contained 1.2 million images with 1000 different object classes. On the test set of 150,000 images, the deep Convolutional Neural Network (CNN) approach described in [18] achieved a top-5 error rate of 3.57%, considerably lower than the human recognition error rate of about 5%. Furthermore, CNNs have achieved superior classification accuracy on different tasks, such as handwritten digit and Latin and Chinese character recognition [8,9], traffic sign recognition [10], face detection and recognition [11], and radar target recognition [19]. All these facts make us believe that deep neural nets could potentially be used in many other applications, including image small target detection.
In recent years, there have been many papers on using deep neural nets for target detection [20-22]. However, their goal is to locate and recognize large objects in images. The size of such objects is usually tens or hundreds of pixels, full of complex image detail, which makes them easy for humans to locate and recognize. A small target, in contrast, is only a few pixels in size; it can only be located, and its class can barely be identified, which is the focus of this paper. As far as we know, this is the first time that deep neural nets have been used in this field.

2 Deep Nets Architectures and configuration

We designed the nets with an input dimension of 21×21 pixels. The small-input nets are used as a moving filter window to detect small targets at all positions of an image. During training, the only preprocessing we do is subtracting the mean value, computed on the training set, from each pixel. The image is passed through a stack of layers, each fully connected with the following layer. The first layers have 128 channels each, and the last performs 2-way classification. The final layer is the soft-max transform layer. All hidden layers are equipped with the rectification non-linearity [23]. Table 1 lists the models we trained in this paper.

Table 1. Deep architectures for small target detection

Model Name                         A          B           C             D             E

Input layer                        -          441×128+    441×128+      441×128+      441×128+
                                              RELU        RELU          RELU          RELU
Middle layers                      -          -           1×{128×128    2×{128×128    3×{128×128
                                                          +RELU}        +RELU}        +RELU}
Output layer                       441×2+     128×2+      128×2+        128×2+        128×2+
                                   Softmax    Softmax     Softmax       Softmax       Softmax
Number of full connection layers   1          2           3             4             5
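The original nets were trained in MATLAB with the MatConvNet toolbox (see Section 4); purely as an illustrative re-expression of Table 1, the type-D architecture could be written in PyTorch roughly as follows. The variable name is an assumption.

import torch.nn as nn

# Type-D net from Table 1: a 441-dimensional input (one 21x21 patch),
# an input layer 441->128 with ReLU, two middle 128->128 layers with ReLU,
# and a 128->2 output layer (softmax applied by the cross-entropy loss):
# four fully connected layers in total.
model_d = nn.Sequential(
    nn.Flatten(),
    nn.Linear(441, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 2),
)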

3 Sample Set Generation

For any neural network to be set up, an important job is to find or establish a dataset with enough samples for training and validation. Unlike general daily-life object image datasets, which are publicly accessible, the dataset for small target detection has to be generated by ourselves.
First, we randomly downloaded some cloud-sky images from the Internet and transformed them to gray-level images as a source for background generation. Figure 1a shows some of these background images. Second, background image patches of 21×21 pixels are randomly cropped from the background image source. Next, a simple program was written to add a target spot at the center of half of the background patches, producing target images; the other half of the background patches are kept unchanged as no-target sample images. The grey-level distribution of the target spot is generated by Eq. 1 to Eq. 4. Figure 1b shows some generated image samples.

wx = rand() × 2 + 1.                                                                    (1)

wy = rand() × 2 + 1.                                                                    (2)

α = rand() × π.                                                                         (3)

s(x, y) = exp{ −[ ((cos(α)·x − sin(α)·y) / wx)² + ((cos(α)·y + sin(α)·x) / wy)² ] }.    (4)

where wx and wy are the dimensions of the Gaussian distribution in two perpendicular directions, α is a random angle between 0 and π that decides the direction of the distribution, rand() generates a uniformly distributed random value between 0 and 1, and s(x, y) is the grey-level function, with (x, y) denoting the pixel position in a sample patch whose center coordinate is (0, 0).
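A minimal NumPy sketch of this spot generator, written directly from Eq. 1 to Eq. 4, might look as follows; the function name and default patch size are assumptions for illustration.

import numpy as np

def gaussian_spot(patch_size=21):
    # Random elliptical Gaussian spot s(x, y); the patch center is (0, 0).
    wx = np.random.rand() * 2 + 1                       # Eq. 1
    wy = np.random.rand() * 2 + 1                       # Eq. 2
    alpha = np.random.rand() * np.pi                    # Eq. 3
    half = patch_size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    u = (np.cos(alpha) * x - np.sin(alpha) * y) / wx
    v = (np.cos(alpha) * y + np.sin(alpha) * x) / wy
    return np.exp(-(u ** 2 + v ** 2))                   # Eq. 4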

Figure 1. (a) Some cloud-sky images for background generation and (b) some generated samples (random SNR between 0 and 1).

In order to train the nets effectively, the target spot intensity should be selected wisely. We used two basic strategies in establishing the sample sets: a random intensity strategy and a constant-SNR intensity strategy. There are some considerations to be made. First, unlike in common object recognition, the sampling regions of target and no-target samples overlap with each other under a low or unstable SNR, resulting in over-fitting during training (Figure 2c). Conversely, if the SNR is very high, there can be a large gap between the two sampling regions, and the samples may even be separable linearly (Figure 2a); the nonlinear classification ability of neural nets would then be wasted. Figure 2 illustrates the effect of SNR in a low-dimensional setting.

Figure 2. Samples Classification under different SNR: triangles and quadrates represent image
samples with and without target in sample space respectively. The quadrate target samples are
obtained by adding target vector to the corresponding triangular no target sample. The upper vector
represents target vector of different amplitude for each situation. A longer target vector makes a
higher SNR.

Second, in order to obtain better training performance with a limited number of samples, the training sample points should be as close to the class boundaries as possible while maintaining an ideal boundary surface without mixing with other classes. A boundary-approaching effect, illustrated by Figure 3, can be obtained by holding the SNR at a specific constant in a training set.

Figure 3. Boundary approaching effect: triangles and quadrates represent negative and positive
samples respectively. Samples near boundary while maintaining good surface make training easier

All in all, we chose a strategy with fixed SNR to generate target samples and compared the net training results with those from the random-SNR sample generation strategy. First, a random background patch is normalized to have unit variance as in Eq. 5. Then the target spot and the background patch are added together according to the SNR to make a target sample image, as in Eq. 6.

Bn(x, y) = ( B(x, y) − mean(B(x, y)) ) / sqrt( Σ(x', y') ( B(x', y') − mean(B(x, y)) )² )    (5)

P(x, y) = SNR × s(x, y) + Bn(x, y).                                                          (6)

where Bn(x, y) is the normalized background image and mean(B(x, y)) is the mean value of the background patch. P(x, y) is the generated target sample image. By controlling the SNR, we generated 140,000 sample images each time for one round of training and validation. Half of the samples are positive targets and the other half are negative backgrounds with no targets.
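Under the reconstructed Eq. 5 and Eq. 6, the sample composition step can be sketched as follows; the function name, the default SNR value and the sum-of-squares normalizer are assumptions made for illustration.

import numpy as np

def make_target_sample(background_patch, spot, snr=1.0):
    # Normalize the background patch (Eq. 5) and add the target spot
    # scaled by the chosen SNR (Eq. 6) to obtain a target sample P(x, y).
    b = background_patch.astype(float)
    b = b - b.mean()
    bn = b / np.sqrt((b ** 2).sum())
    return snr * spot + bn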

4 Experiments

We trained deep nets of different layer widths and depths on training sets of different SNRs. We then evaluated the trained nets on a large sky-cloud image (sized 1024×768) which does not belong to the background source set. 100 small spot targets were generated by Eq. 1 to Eq. 4 and randomly added to the sky-cloud background image to generate an image for the final performance test. The intensity of the spots is controlled in order to get an evenly distributed local SNR ranging from 0 to 1. The performance of the nets is compared by Eq. 7 and Eq. 8.
Small Area Signal-to-Noise ratio gain:

SSNR Gain = ( S / Cs )out / ( S / Cs )in . (7)

Large Area Signal-to-Noise ratio gain:

LSNR gain = ( S / Cl )out / ( S / Cl )in . (8)

where S is the signal amplitude, and Cs and Cl are the standard deviations within a local small area (21×21 pixels) and a large area (201×201 pixels) respectively. SSNR reflects the ability of signal enhancement, while LSNR reflects the ability of background noise suppression; for both indices, a larger value implies better performance. Unlike other papers, we do not use the background suppression factor (BSF) index, due to its dependency on the signal amplification level.
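A small sketch of how these two gains could be evaluated for one target position is given below; taking S as the pixel amplitude at the target location, as well as the function and variable names, are assumptions for illustration.

import numpy as np

def snr_gains(img_in, img_out, target_xy, small=21, large=201):
    # SSNR gain (Eq. 7) and LSNR gain (Eq. 8) for one target location,
    # using the local standard deviation as Cs (small window) and Cl (large window).
    def local_snr(img, half):
        r, c = target_xy
        window = img[max(0, r - half):r + half + 1, max(0, c - half):c + half + 1]
        return img[r, c] / window.std()
    ssnr_gain = local_snr(img_out, small // 2) / local_snr(img_in, small // 2)
    lsnr_gain = local_snr(img_out, large // 2) / local_snr(img_in, large // 2)
    return ssnr_gain, lsnr_gain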
All training and tests were conducted on the MATLAB platform with the MatConvNet [24] toolbox, using a laptop with an Intel Core i7-2630QM CPU and an Nvidia GTX650m graphics card.

We trained all the nets in Table 1 on training sets of different SNRs, then tested them and compared their performance with the traditional methods of Max-mean and Max-median filtering. Table 2 lists all test results.

Table 2. Performance of different methods and architectures

Models and Method Train SNR Mean SSNR Gain Mean LSNR Gain

A 1 2.0781 14.3569

Random 2.0781 14.3569

B 1 3.0178 18.4606

Random 2.5196 16.3126

C 16 3.9680 21.5455

8 4.0517 21.6317

4 4.1437 21.8808

2 4.2621 22.2605

1 4.3197 22.8432

0.5 3.9907 22.6232

0.25 3.6274 21.2948

0.125 2.9646 17.3168

Random 2.9282 16.1848

D 16 4.7604 24.5610

8 4.6380 24.1185

4 4.7964 24.2731

2 4.8474 24.4911

1 4.9587 25.0740

0.5 4.4921 25.5707

0.25 3.7134 22.0043

0.125 3.1878 16.5554

Random 3.9263 17.5989

Max-mean 1.8351 7.0192

Max-median 2.0031 8.5107



As shown in Table 2, the performance of the deep nets is significantly better than that of the traditional Max-mean and Max-median filtering methods. Deeper nets get better performance as long as the number of fully connected layers is less than or equal to 5; the training process did not converge for nets with more than 5 fully connected layers. The nets trained on samples with constant SNR (SNR ≈ 1) achieve the best performance, better than random sampling, except for the linear classification model (type A), whose performance does not change with SNR.
Moreover, we give our explanation of why SNR ≈ 1 is the best choice. Figure 4 shows a sampling space with only two dimensions; each axis represents the gray level of one pixel of a sample image which here has only two pixels. Each image, whether it contains a target or only background, can be expressed as a point in this figure. The random background images of equal intensity form a circle centered at the origin, whose radius equals the intensity of the background. If a target is added to these backgrounds, we get a new circle formed by the target images. As there is more than one type of target (there are three types of targets in Figure 4), we get a set of circles, each circle in the set representing one type of target at a given intensity. Figure 4 then shows the three conditions SNR>2, SNR=2 and SNR<2 in (a), (b) and (c) respectively.
As shown in Figure 4a, for totally random background noise and a sampling SNR>2, a large gap exists between the background and target samples; the samples could even be divided linearly, which is obviously inaccurate. As Figure 4b shows, sampling with SNR=2 gives a perfect separation between classes: many samples lie near the classification surface while there is no mixture between them, which meets the conditions of the boundary-approaching effect. In Figure 4c, with a sampling SNR<2, the samples of the two classes mix with each other and some background samples are misclassified as target samples. Usually there are far more background pixels than target pixels in an image, and this misclassification could lead to poor performance in testing. In a word, SNR=2 is the best choice for training in small target sample generation.
At first glance, this conclusion does not coincide with the experimental results. The key to understanding this apparent inconsistency is to notice the different definitions of SNR. The signal intensity used in the sample generation experiments is the peak value of the signal pixels, while in Figure 4 it is the root-mean-square value of the signal pixels. For a signal spot of Gaussian shape, the ratio between the peak value and the root-mean-square value is about 2, which causes the discrepancy.

Figure 4. Training result with different sampling root-mean-square SNR: (a) SNR>2; (b) SNR=2; (c) SNR<2

5 Conclusions

In our work, a new small target detection method based on deep learning is proposed. The performance of the deep nets is significantly better than that of traditional filtering methods. It is found that nets trained on samples of a specific constant SNR perform better than nets trained on samples of random SNR. A reasonable choice is SNR ≈ 1 (peak-signal-value SNR), which gives the best test performance among all the sampling SNRs tested. It is well known that adding noise to the input data of a neural network during training can lead to significant improvements in generalization performance [25]; our conclusion is that adding too much noise does more harm than good when training to detect a signal against a noisy background. We also provide a simple explanation of why a sampling SNR ≈ 1 (peak-signal-value SNR) gives the best performance in all tests, and SNR = 2 (root-mean-square SNR) might give even better performance, which needs further verification. This conclusion might also be useful for training nets for general object recognition and detection, which needs further research. The next step is to reconstruct the nets with convolutions to achieve a parallel, faster, even real-time small target detection algorithm.

Acknowledgement: This research was financially supported by the National Science Foundation of China (No. 61301190).

References
[1] Peng, J.-X., and Zhou, W.-L.: ‘Infrared background suppression for segmenting and detecting
small target’, Acta Electron. Sin., 1999, 27, (12), pp. 47–51
[2] Yang, L., Yang, J., and Yang, K.: ‘Adaptive detection for infrared small target under sea-sky
complex background’, Electron. Lett., 2004, 40, (17), pp. 1083–1085
[3] Wang, G.-D., Chen, Ch.-Y., and Shen, X.-B.: ‘Facet-based infrared small target detection
method’, Electron. Lett., 2005, 41, (22), pp. 1244–1246
[4] Hilliard, C.I.: ‘Selection of a clutter rejection algorithm for real-time target detection from an
airborne platform’, Proc. SPIE, 2000, 4048,pp. 74–84
[5] Bai X, Zhou F, Xue B. Infrared image enhancement through contrast enhancement by using
multiscale new top-hat transform. Infrared Physics & Technology, 2011, 54(54):61-69.
[6] Wang P, Tian J W, Gao C Q. Infrared small target detection using directional highpass filters
based on LS-SVM. Electronics Letters, 2009, 45(3):156-158.
[7] Krizhevsky A, Sutskever I, Hinton G E. “ImageNet Classification with Deep Convolutional Neural
Networks,” NIPS. 2012, 1(2): 4.
[8] Ciresan D C, Meier U, Gambardella L M, et al. “Deep, big, simple neural nets for handwritten
digit recognition,” Neural computation, 2010, 22(12): 3207-3220.
[9] Ciresan D C, Meier U, Schmidhuber J. “Transfer learning for Latin and Chinese characters with
deep neural networks,” Neural Networks (IJCNN), The 2012 International Joint Conference on.
IEEE, 2012: 1-6.

[10] Cireşan D, Meier U, Masci J, et al. “Multi-column deep neural network for traffic sign classi-
fication,” Neural Networks, 2012, 32: 333-338.
[11] Taigman Y, Yang M, Ranzato M, et al. “Deep-Face: Closing the Gap to Human-Level Performance
in Face Verification,” IEEE CVPR. 2014.
[12] Yann LeCun. “Learning invariant feature hierarchies,” Computer vision–ECCV 2012. Workshops
and demonstrations. Springer Berlin Heidelberg, 2012: 496-505.
[13] Jarrett K, Kavukcuoglu K, Ranzato M, et al. “What is the best multistage architecture for object
recognition?” Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009:
2146-2153.
[14] Kavukcuoglu K, Sermanet P, Boureau Y L, et al. “Learning Convolutional Feature Hierarchies for
Visual Recognition,” NIPS. 2010,1(2): 5.
[15] Coates A, Ng A Y, Lee H. “An analysis of single-layer networks in unsupervised feature learning,”
International Conference on Artificial Intelligence and Statistics. 2011: 215-223.
[16] Zeiler M D, Krishnan D, Taylor G W, et al. “Deconvolutional networks,” Computer Vision and
Pattern Recognition (CVPR), 2010 IEEE Conference on. IEEE, 2010: 2528-2535.
[17] Zeiler M D, Taylor G W, Fergus R. “Adaptive deconvolutional networks for mid and high level
feature learning,” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011:
2018-2025.
[18] He K, Zhang X, Ren S, et al. “Deep Residual Learning for Image Recognition”. Computer Science,
2015.
[19] Chen Sizhe, Wang Haipeng. "SAR Target Recognition Based on Deep Learning". https://www.researchgate.net/publication/281959075_SAR_target_recognition_based_on_deep_learning.
[20] Girshick R, Donahue J, Darrell T, et al. Region-Based Convolutional Networks for Accurate Object
Detection and Segmentation[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence,
2015, 38(1):1-1.
[21] Girshick R. Fast R-CNN. Computer Science, 2015.
[22] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region
Proposal Networks. Computer Science, 2015.
[23] Dahl G E, Sainath T N, Hinton G E. Improving deep neural networks for LVCSR using rectified
linear units and dropout, 2013:8609-8613.
[24] “MatConvNet - Convolutional Neural Networks for MATLAB”, A. Vedaldi and K. Lenc, Proc. of the
ACM Int. Conf. on Multimedia, 2015.
[25] Bishop C M. Training with noise is equivalent to Tikhonov regularization. Neural Computation,
1995, 7(1): 108-116.
Yi-xin ZHANG*, Wen-sheng SUN
Agricultural Product Price Forecast based on Short-
term Time Series Analysis Techniques
Abstract: Agricultural product price is closely related to everyone and has a great influence on national food security in China. The purpose of this paper is to discuss short-term time series analysis techniques and apply them to forecasting prices accurately from historical values. First, the paper discusses how to model the fluctuation of a short-term time series with the Auto Regressive Integrated Moving Average (ARIMA) and Vector Error Correction (VEC) models respectively. Then we use Eviews 8.0 to compare the performance of the two models in an experiment on monthly cantaloupe prices from January 2006 to April 2016. The forecast results reveal that the VEC model outperforms the ARIMA model, generating a lower mean absolute percent error (MAPE) and Theil inequality coefficient than ARIMA. Besides, there exists a significant long-run equilibrium relationship in which the cantaloupe price is positively correlated with the consumer price index (CPI), which provides an economic basis for making corresponding policies.

Keywords: agricultural product price; short-term time series; ARIMA; VEC; forecasting

1 Introduction

China is an agricultural country with a large population, which determines that the agricultural product market is an important part of the circulation system. Agricultural product price has long been a focus of government attention. It is not only indispensable for stabilizing the incomes of farmers, who account for 80% of the general population in China, but also related to the healthy development of China's agricultural industry and society [1]. In recent years, however, agricultural product prices have changed fast and frequently. The phenomenon of large price rises and falls has raised heated discussion among scholars. Frequent price variation has become the main obstacle to stabilizing farmers' incomes and increasing agricultural yield.
Therefore, it is important to use scientific data mining methods to discover the laws governing agricultural product prices and to forecast short-term price fluctuations

*Corresponding author: Yi-xin ZHANG, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China, E-mail: zyxzhangyixin@foxmail.com
Wen-sheng SUN, School of Information and Communication Engineering, Beijing University of Posts
and Telecommunications, Beijing, China

accurately. Selecting an appropriate prediction method determines the accuracy of the results, as the market price is influenced by a variety of factors among which complex relationships exist. In current research, time series analysis techniques such as the Auto Regressive Integrated Moving Average (ARIMA) have been widely used. Box-Jenkins' ARIMA model is attractive because it can capture complex time series patterns, including stationary, non-stationary and seasonal sequences. The ARIMA model has a wide range of applications: for example, Vinay B Gavirangaswamy, Gagan Gupta, Ajay Gupta and Rajeev Agrawal analyzed traffic volume prediction on different types of roads [2], Nancy Tran and Daniel A. Reed forecasted for adaptive I/O prefetching in file systems [3], and A. E. Milionis and T. D. Davies examined its application to the monthly activity of temperature inversions [4]. Thus it can be expected that capturing agricultural product price within the ARIMA framework is feasible. Whereas ARIMA disregards economic factors, the VEC model is based on the Johansen co-integration test and the Granger causality test, and it reveals how price interacts with the economic factors that drive its fluctuation. The VEC model therefore provides a better picture of the dynamic response of price than ARIMA does.
The rest of this paper is organized as follows. Section 2 describes the ARIMA framework, and Section 3 points out the shortcomings of ARIMA and describes the VEC framework, which adds an error correction term. Section 4 then evaluates the two models on monthly cantaloupe prices from Beijing Xin Fadi Wholesale Market and compares the two forecast results. Finally, the conclusion is drawn in Section 5.

2 ARIMA Model

To forecast agricultural product price, the ARIMA model is an attractive choice. ARIMA assumes that the current behavior of a series can be described by putting its past behavior and external random noise together. ARIMA consists of an autoregressive part (AR), a moving average part (MA) and a combination of both (ARMA). A short-term time series Xt drawn from a stationary sequence can be mathematically characterized by an ARMA (p, q) model:

Xt = Ø1·Xt−1 + ... + Øp·Xt−p + εt + θ1·εt−1 + ... + θq·εt−q    (1)

where εt represents the noise term, assumed to be Gaussian distributed with mean zero and constant variance σ2. The AR part is a linear combination of the past p observations Xt−1, ..., Xt−p, weighted by the p coefficients Ø1, ..., Øp. The MA part is a linear combination of the past q noise terms εt−1, ..., εt−q, weighted by the q coefficients θ1, ..., θq, plus the current noise.

   



Below is a discussion of the major steps of ARIMA model construction; a minimal sketch of the workflow follows the list.
–– Stationary test: Using the ARMA framework requires a stationary series representing a process in statistical equilibrium, otherwise the result may be a spurious regression. Both the autocorrelation function (ACF) and the Augmented Dickey-Fuller (ADF) test can recognize a non-stationary series. The ARMA model cannot be applied directly to a non-stationary series, but the series can be transformed into a stationary one by d-order differencing, which is denoted ARIMA (p, d, q).
–– Model recognition: Selecting an appropriate model pattern is crucial. The autocorrelation function (ACF) and the partial autocorrelation function (PACF) can expose dependencies in the data. Table 1 shows the properties of the ACF and PACF used for fitting an ARIMA pattern.
–– Estimating coefficients: Common techniques are used to find the values of the coefficients, including maximum likelihood estimation and least squares, which minimize the sum of forecast errors.
–– Residual diagnostics: The estimated equation needs to be tested for consistency with the theoretical assumptions. Residual tests make sure that the residuals support the validity of the estimation; we use the normality test, the LM test and the heteroskedasticity test to diagnose the residuals.
–– Forecasting and analysis: The ARIMA model we have built is used to forecast future values based on past and current values.
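The paper carries out these steps in Eviews 8.0 (Section 4); purely as an illustration, the same workflow could be sketched in Python with statsmodels as follows. The file name, the column choice and the Ljung-Box residual check are assumptions, and the order (5, 1, 6) anticipates the model the paper eventually selects.

import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA

# lnprice_sa: the log, seasonally adjusted monthly price series
lnprice_sa = pd.read_csv("cantaloupe_price_sa.csv", index_col=0).iloc[:, 0]

# Stationary test: ADF p-values for the level and the first difference
print(adfuller(lnprice_sa)[1], adfuller(lnprice_sa.diff().dropna())[1])

# Estimate an ARIMA(p, d, q) model and inspect the information criteria
fit = ARIMA(lnprice_sa, order=(5, 1, 6)).fit()
print(fit.aic, fit.bic)

# Residual diagnostics (Ljung-Box) and a 12-month forecast
print(fit.test_serial_correlation(method="ljungbox"))
forecast = fit.forecast(steps=12)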

3 VEC Model: Advantages over ARIMA

Estimating non-stationary series with the ARIMA model raises a difficult economic issue: transforming the series to a stationary one by differencing can distort the data. Besides, some economic factors that strongly influence agricultural product price are not considered in the ARIMA modeling process. Thus we suggest another technique, the Vector Error Correction (VEC) model, which has advantages over ARIMA in the analysis of agricultural product price.

Table 1. Properties of ACF and PACF for Model Recognition

        AR (p)                               MA (q)                               ARMA (p, q)

ACF     Infinite tail-off; declines as       Declines slowly to lag q, and        Declines as exponential or cosine
        exponential or cosine waves to zero  cuts off to zero afterward           waves to zero after lag q
PACF    Declines slowly to lag p, and        Infinite tail-off; declines as       Declines as exponential or cosine
        cuts off to zero afterward           exponential or cosine waves to zero  waves to zero after lag p

The VEC model consists of a Vector Auto-Regression (VAR) and an error correction term.
The VAR model is an unstructured vector equation model used to forecast short-term time series and to analyze the dynamic relationships in the variable system. It has been proved that the VAR model can be built with both stationary and non-stationary data in levels as long as the major variables turn out to be endogenous. We express the VAR (p) model as (2), which can be written compactly as (3).

Yt = B1·Yt−1 + B2·Yt−2 + ... + Bp·Yt−p + H·Xt + εt    (2)

Yt = Σi Bi·Yt−i + H·Xt + εt,  i = 1, ..., p    (3)

where Yt is the m-dimensional endogenous variable vector weighted by the m×m dimension coefficient matrices Bi, Xt is the n-dimensional exogenous variable vector weighted by the m×n dimension matrix H, and εt is the m-dimensional noise term vector, assumed to be Gaussian distributed with mean zero and constant variance σ2.
The error correction term is the residual series based on the co-integration equation, which is what primarily distinguishes the VEC model from the VAR model. This term makes up for the long-run information lost when the data are differenced to become stationary. The implication is that, when the variables are co-integrated, the VEC model triggers forces that push the variables back towards equilibrium via the error correction term. Equation (3) can be expanded as (4) for VEC models.

ΔYt = Q·CointEqt−1 + Σi Γi·ΔYt−i + H·Xt + εt,  i = 1, ..., p−1    (4)

CointEqt−1 = P·Yt−1    (5)

where Δ is the difference operator, Q is the m×1 dimension coefficient matrix, P is the 1×m dimension coefficient matrix, Γi are the coefficient matrices of the lagged differences, and CointEq is the error correction term generated by the long-run equilibrium relationship(s) between the endogenous variables, shown as (5).
Below is a discussion of the major steps of VEC model construction.
–– Identifying the optimal lag structure: The optimal lag structure is identified to determine the order p. The fundamental method is to minimize the error measured by criteria such as the Akaike information criterion (AIC), the final prediction error (FPE) and the Schwarz criterion (SC). These criteria are defined as follows, where Σ(j) is the residual covariance matrix, k is the number of endogenous variables, j is the lag order of the process, and T is the number of observations.

AIC(j) = ln|Σ(j)| + 2·k²·j / T    (6)

FPE(j) = |Σ(j)| · [ (T + k·j + 1) / (T − k·j − 1) ]^k    (7)

SC(j) = ln|Σ(j)| + k²·j·ln(T) / T    (8)


–– Johansen co-integration test: To check whether a long-run equilibrium relationship exists among non-stationary variables, the Johansen co-integration test estimates it by the method of least squares. Note that the test can be applied only when the variables become stationary at the same difference order.
–– Granger causality test: The Granger causality test can be applied to check whether causality exists between two variables, regardless of whether the variables are stationary. A variable is treated as endogenous when it passes the Granger causality test; otherwise it is treated as an exogenous one. A minimal sketch of this VEC workflow is given below.
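The paper implements these steps in Eviews 8.0; the following Python sketch with statsmodels is only an illustrative analogue. The file name, the column selection and the mapping of the 13-period lag onto k_ar_diff are assumptions.

import pandas as pd
from statsmodels.tsa.vector_ar.vecm import VECM, select_order, select_coint_rank

# Endogenous variables: seasonally adjusted log price and log CPI
data = pd.read_csv("cantaloupe_cpi.csv", index_col=0)[["lnprice_sa", "lncpi"]]

# Lag structure selection (AIC, FPE, SC, HQIC) and Johansen rank test
lag_order = select_order(data, maxlags=13, deterministic="ci")
rank = select_coint_rank(data, det_order=0, k_ar_diff=13, method="trace")

# Fit the VEC model and forecast
vecm = VECM(data, k_ar_diff=13, coint_rank=rank.rank, deterministic="ci").fit()
print(vecm.alpha, vecm.beta)      # adjustment and co-integration coefficients
forecast = vecm.predict(steps=12)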

4 Empirical Study on Two Models

This section carries out an empirical analysis of the short-term price data with the professional analysis software Eviews 8.0. The experiment starts by establishing the ARIMA model to examine the characteristics of the monthly price. Then we analyze the price data with the VEC model, testing whether a long-run equilibrium relationship exists between price and economic factors and what form it takes. Finally we compare the forecast results with several statistical measures to verify the arguments made above.

Figure 1. The raw data of monthly cantaloupe price (Yuan/kg).



Figure 2. The data of lnprice_sa (Yuan/kg).

4.1 Data Preprocessing

The data used in this paper are monthly cantaloupe prices sourced from Beijing Xin Fadi Wholesale Market, covering the period from January 2006 to April 2016 as shown in Figure 1.
It is pretty clear that the sequence has a seasonal pattern which cannot be neglected for prediction. The sequence is therefore decomposed into a non-seasonal component and a seasonal term, since factors other than seasonality are to be emphasized. We then take the natural logarithm of the non-seasonal data as the model input in order to remove the effect of heteroskedasticity. The input data, denoted lnprice_sa, is shown in Figure 2.
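The paper performs this preprocessing in Eviews; an equivalent sketch in Python might look as follows, where the file name and the multiplicative decomposition are assumptions about details the paper does not specify.

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

price = pd.read_csv("cantaloupe_price.csv", index_col=0, parse_dates=True).iloc[:, 0]

# Remove the seasonal component, then take the natural logarithm
decomp = seasonal_decompose(price, model="multiplicative", period=12)
lnprice_sa = np.log(price / decomp.seasonal)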

4.2 Construction of ARIMA Model

1) Stationary Test: A stationarity test of lnprice_sa must be made first when carrying out an econometric analysis of time series data. We use the ADF test to check stationarity, and the test result for lnprice_sa is reported in Table 2. The result shows that the lnprice_sa data is non-stationary at the 95% confidence level, while the price data is integrated of order 1, I(1), based on the fact that the probability of the hypothesis that the Δlnprice_sa data is non-stationary is zero.

2) Model Recognition: The basis for identifying a fitting model is to recognize the pattern of the ACF and PACF. According to the properties of the ACF and PACF shown in Figure 3, we expect that ARIMA (6, 1, 6) might be appropriate for Δlnprice_sa.

Figure 3. ACF and PACF of lnprice_sa.

Table 2. Stationary Test

Test equation (I, T)    ADF t-statistic    Prob.     Stationary or not

lnprice_sa (I, 0)       -2.205282          0.2056    NO

lnprice_sa (0, 0)       -7.447444          0.0000    YES

3) Estimating coefficients: It must be emphasized that the ARIMA model is an empirical model built from the data. Therefore, both the ACF and PACF plots should be observed, and more than one model may be appropriate for the data. By repeatedly trying and adjusting coefficients, it turns out that ARIMA (5, 1, 6) performs best based on the principle of minimizing AIC and SC, as shown in Figure 4.

4) Residual Diagnostics: The residuals are expected to be white noise if the fitted model is appropriate. Examining them with the LM test shown in Table 3, the result indicates that the residuals are indeed white noise at the 95% confidence level.

5) Forecasting: We now use ARIMA (5, 1, 6) to forecast the price data lnprice_sa. The forecast result lnprice_saf is shown in Figure 5.

Figure 4. Estimate equation for ARIMA.

Table 3. The Residual LM Test

F-statistic 0.247484 Prob. 0.8630

Obs * R-squared 0.817838 Prob. 0.8452

Figure 5. The forecast result lnprice_saf.

4.3 Construction of VEC Model

1) Stationary Test: The monthly consumer price index (CPI) and the consumer price index for food (CPI_F) are included in the model, since the agricultural product price is sensitive to CPI according to economic theory. The natural logarithm of each variable is used so as to match lnprice_sa over the same period. The model starts with the stationarity test of each series, shown in Table 4. The ADF test shows that these three variables are non-stationary but all integrated of order 1, meeting the condition that variables in the VEC model should become stationary at the same difference order.

Table 4. Stationary Test

Test equation (I, T) ADF t-statistic Prob. Stationary or not

lnprice_sa (I, 0) -2.205282 0.2056 NO


lncpi (I, 0) -2.524273 0.1125 NO

lncpi_food (I, 0) -1.978001 0.2961 NO

lnprice_sa (0, 0) -7.447444 0.0000 YES

lncpi (0, 0) -4.709097 0.0000 YES

lncpi_food (0, 0) -5.166544 0.0000 YES

Table 5. Lag Structure Test

Lag    LR            FPE          AIC            SC

1 NA 7.30e-11 -14.82671 -14.59923*


2 23.51129 6.84e-11 -14.89277 -14.43781

3 12.89613 7.10e-11 -14.85568 -14.17323

4 12.33618 7.40e-11 -14.81690 -13.90697

5 21.21628 6.96e-11 -14.88120 -13.74379

6 4.745649 7.85e-11 -14.76432 -13.39943

7 14.00762 7.94e-11 -14.75965 -13.16728

8 14.38080 7.95e-11 -14.76576 -12.94591

9 7.801338 8.64e-11 -14.69435 -12.64701

10 12.42146 8.81e-11 -14.68854 -12.41372

11 11.57200 9.05e-11 -14.67784 -12.17553

12 24.06114 7.74e-11 -14.85512 -12.12533

13 40.02544* 5.14e-11* -15.29014* -12.33287



2) Identifying the optimal lag structure: It can be assumed that the lag order p should be around 12 because of the monthly data frequency. The lag structure test, shown in Table 5, indicates that a 13-period lag fits the data best according to the LR, FPE, AIC and SC criteria.

3) Co-integration test: The Johansen co-integration method is used to test whether long-run combinations of these I(1) series exist. The result in Table 6 shows that the null hypothesis "At most 1 (CE)" can be accepted, indicating that a co-integrated relation exists and the error correction term can be estimated.

4) Granger Causality Test: The Granger causality test results are presented in Table 7. They show that the null hypothesis "lncpi does not Granger cause lnprice_sa" is rejected at the 90% confidence level, indicating that CPI exerts a significant influence on the cantaloupe price and enters the VEC model as an endogenous variable, while CPI_F is regarded as an exogenous variable, as CPI_F does not affect the price directly in our result.

5) Estimate Equation: After determining the pattern of the VEC model, a three-variable equation is estimated as follows. Equation (9) reveals that the price is 17.06% positively correlated with CPI, which echoes the economic conclusion that CPI can generally reflect the rise and fall of agricultural product prices.

Table 6. Johansen Co-Integration Test

Hypothesized No. of CE (s) Eigenvalue Trace Statistic 0.05 Critical Value Prob.

None* 0.249295 38.20421 29.79707 0.0043


At most 1 0.057524 7.522809 15.49471 0.5177

At most 2 0.011001 1.183621 3.841466 0.2766

Table 7. Granger Causality Test

Obs F-statistic Prob.

lncpi does not granger cause lnprice_sa 111 1.68770 0.0785

lnprice_sa does not granger cause lncpi 1026472 0.2507

lncpi_food does not granger cause lnprice_sa 107 1.21955 0.2813

lnprice_sa does not granger cause lncpi_food 2.57984 0.0049

lncpi_food does not granger cause lncpi 107 -1.49271 0.1384

lncpi does not granger cause lncpi_food 0.36097 0.9780



Figure 6. The forecast result lnprice_sa_vecf.

 (9)

6) Forecasting: We now use the VEC(13) model to forecast the price data lnprice_sa again. The forecast result lnprice_sa_vecf is shown in Figure 6.

4.4 Comparison

We compare the forecast performance of the two classical short-term time series techniques using the principle of minimizing statistical criteria including the Akaike information criterion (AIC), the Schwarz criterion (SC), the mean absolute percent error (MAPE) and the Theil inequality coefficient. As the results in Table 8 show, all criteria select the VEC model as the better one, owing to the fact that the VEC model includes an error correction term which represents the long-run equilibrium relationship between the cantaloupe price and CPI, and this term works to correct the short-term forecast error.

Table 8. Comparison of Two Models

AIC SC MAPE Theil

ARIMA -0.768456 -0.529766 8.511528% 0.052095

VEC -1.056954* -1.006402* 7.791485%* 0.048294*

5 Conclusions

To accurately forecast agricultural product prices, short-term time series analysis techniques are practicable. The ARIMA framework and the VEC framework have been designed for modeling the monthly behavior of cantaloupe prices. Experiments with the two approaches show that the VEC model outperforms the ARIMA model, as a strong equilibrium relationship exists when CPI is the dependent variable of the co-integration equation that corrects the monthly cantaloupe price forecast error.
What is more, the empirical research demonstrates that the cantaloupe price in Beijing Xin Fadi Wholesale Market is positively correlated with CPI, with a coefficient of 17.06% for a 1% rise in CPI, indicating that relevant policies should be appropriately and proactively made, based on the fluctuation of CPI, to ensure that the development of China's agricultural markets and industry remains in harmony.

Acknowledgment: We are thankful that this study is supported by the National Natural Science Foundation of China under Grant 61302080.

References
[1] Shiliang Du and Dexian Zhang, “The wheat prices prediction based on ARIMA model,” Journal of
Simulation, vol. 2, No. 4, August 2014.
[2] Vinay B Gavirangaswamy, Gagan Gupta, Ajay Gupta and Rajeev Agrawal, “Assessment of
ARIMA-based prediction techniques for road-traffic volume,” MEDES ‘13: Proceedings of the
Fifth International Conference on Management of Emergent Digital EcoSystems, pp. 246-251,
October 28-31, 2013.
[3] Nancy Tran and Daniel A. Reed, “ARIMA time series modeling and forecasting for adaptive I/O
prefetching,” ICS ‘01: Proceedings of the 15th international conference on Supercomputing, pp.
473-485, June, 2001.
[4] A. E. Milionis and T. D. Davies, "Box-Jenkins univariate modelling for climatological time series
analysis: an application to the monthly activity of temperature inversions,” International
Journal of Climatology, vol. 14, pp. 567-579, 1994.
[5] Huda M. A. El Hag and Sami M. Sharif, “An adjusted ARIMA model for internet traffic,” AFRICON
2007, pp. 1-6, 26-28 Sept. 2007.
[6] Yang Chang-zheng and Li Hui-min, “Three-industry-structure analysis based on the VAR and VEC
models: empirical study of economic data of Hangzhou from 1978 to 2008,” 2011 International
Conference on Management Science & Engineering (18th), pp. 689-695, 13-15 Sept. 2011.

[7] Jingiu Xu, "Research on the interactive transmission mechanism of house price in China---based
on a mixed VAR model,” Advanced Materials Research, vols. 108-111, pp. 513-518, May, 2010.
[8] Liu Ya-chen and Zhang Shuai, “Econometric analysis on the relationship between RMB
exchange rate and real estate price by VAR model,” 2nd International Conference on Science
and Social Research, pp. 428-430, July, 2013.
[9] Huang Ting and Liu Huangjin, “Dynamic relationship between real estate prices and inflation
rate,” 2013 IEEE International Conference on Granular Computing (GrC), pp. 153-156, 13-15 Dec.
2013.
[10] Dongmei Cai, "Analysis of economic influence factors of vegetable market price short-term
fluctuation based on VAR model,” Carpathian Journal of Food Science & Technology, vol. 7 issue
1, pp. 20-25, 2015.
[11] Changshou Luo, Qingfeng Wei, Liying Zhou, Junfeng Zhang and Sufen Sun, “Prediction of
vegetable price based on neural network and genetic algorithm," The 4th IFIP International
Conference on Computer and Computing Technologies in Agriculture and the 4th Symposium on
Development of Rural Information(CCTA 2010), pp. 672-681, Oct 2010.
[12] Phillip Fanchont and Jeanne Wendel, “Estimating VAR models under non-stationarity and
cointegration: alternative approaches for forecasting cattle prices,” Applied Economics, vol. 24
issue 2, pp. 207-217, Feb 1992.
[13] Yan Wang, Application of Time Series Analysis. China Renmin University Press Co. LTD. Beijing,
2010.
Xiao-lin ZHAO*, Jing-feng XUE, Qi ZHANG, Zi-yang WANG
The Correction of Software Network Model based on
Node Copy
Abstract: As the basic tool for studying software complexity with Complex Network theory, the software network model is of great importance. In particular, the class-collaborative network is superior in many respects and is therefore widely used by researchers. However, as a static network reflecting the interactions between classes, the class-collaborative network does not include enough information about dynamic binding, and thus misses some correlations between classes that are generated while the software is running. To correct the class-collaborative network so that it reflects the relations between classes generated by dynamic binding, we design a method called node copy. To demonstrate the correctness of our method, we use the node-copy method to generate software network models for four types of software with obviously different structures. For all of these network models, by comparing the differences in spreading range and spreading paths before and after applying the node-copy method, we show that our method can effectively correct the class-collaborative model and also reduces the complexity of the software network so as to ensure software security. Since our modified network model reflects the correlations between nodes (the structure of the software) accurately and completely, it can be used to study the propagation statistics of software and to evaluate the importance of each node to the whole network.

Keywords: Complex network; class-collaborative network; software security; node-copy; spreading range

1 Introduction

Nowadays, software systems become dramatically big in size and complicated in


structure. As the basic property of software and the main cause of software defects,
complexity greatly affects the quality and production efficiency of software products
[1]. Under this circumstance, much attention should be put into the recognition,
measure and control of software complexity. Traditional methods tend to focus on the
local parts of software, lacking consideration of the whole system [2], and thus cannot
be used to study the software complexity effectively. Complex Network theory, a new

*Corresponding author: Xiao-lin ZHAO, School of Software, Beijing Institute of Technology, Beijing,
China, E-mail: zhaoxl@bit.edu.cn
Jing-feng XUE, Qi ZHANG, Zi-yang WANG, School of Software, Beijing Institute of Technology,
Beijing, China
 The Correction of Software Network Model based on Node Copy   235

way to study complexity, abstracts a system into the network model, and uses the
theory of topology to analyze the complexity of system. As an effective tool proved by
researchers from multiple areas to study complexity [3-6], Complex Network theory
seems valuable to solve the serious problem the Software Engineering faces.
In the past decade, researchers have used Complex Network theory to study the causes of software complexity, to design complexity-measurement suites and to locate essential parts of software [7-10]. These achievements powerfully promote the development of research on software complexity. However, further research in this area is still needed, as current methods seriously lack accuracy.
To study the complexity of software with Complex Network theory, we must first generate the software network model. The procedure of creating the software network model is therefore the basic and critical link in software network analysis: we cannot accurately measure the complexity of software when the network model does not exactly reflect the structure of the software.
Basically, a correct software network model should contain all elements of a piece of software and cover all interactions among these elements. Missing any of these components means that the network model has defects in reflecting the structure of the software. To emphasize the effect of such defects on the study of software complexity, we use information spreading paths as an example.

2 The basic flaw of class-collaborative network

2.1 Classification of software network

2.1.1 Classified by the time of generation


As the manifestation of the structure of software, a software network can be generated in two ways: (1) from the code of the software; (2) by tracking and recording the trace of its operation.
If we use the code to generate the software network, the network belongs to the static network model, as the generation process does not require running the software. If we generate the network model in the second way, the network belongs to the dynamic network model, as its creation requires running the software.
The static network model has the following characteristics:
–– As the generation of the network model does not require running the software, we do not need to consider the adequacy of test cases at all.
–– The network model covers most of the execution paths if it reflects the structure of the software reasonably.
–– The generation and accuracy of the static network model depend on the code type. If we use source code to generate the network model, the structure of the software can easily be extracted and reflected in the network model. If the code is binary code, however, the generation of the network is difficult, let alone the accuracy of such a network in reflecting the structure of the software.
–– Some execution paths may not be caught by such a static network. Although the static network model tends to cover most of the execution paths, there are indeed some that cannot be acquired by such a network. For instance, execution paths generated by dynamic binding in Java or C++ are missing from such a network.

The dynamic network model has the following characteristics:
–– In contrast with the static network model, the generation and accuracy of the dynamic network depend completely on the adequacy of the test cases. To generate a dynamic network that reflects the structure of the software ideally, we should ensure that most execution paths are covered by our test cases.
–– The generation of an ideal network model may require a huge amount of work. To cover the execution paths as completely as possible, we need to run every test case and track and record the process of operation with specific tools. All this work may cost a lot of time.
–– The execution paths we lose in the static network model may be included in the dynamic network model. As the dynamic network model is the embodiment of the software's operation, it contains the execution paths generated by dynamic binding. However, whether all paths generated by dynamic binding can be obtained is closely related to the completeness of the test cases, as mentioned in the first point.

2.1.2 Classified by the level of abstraction

We can also use another standard to classify software networks. To generate the network model of a software system, two factors need to be considered.
Firstly, we should choose the level of abstraction. For example, since code written in Java can be organized at function granularity, class granularity or package granularity, we must specify which of these units the nodes in the network represent. As a result, the internal structure of the unit chosen to represent nodes will be ignored in the network.
Secondly, the meaning of the edges between the nodes generated in the first step needs to be considered. Clearly, edges should describe the existence of correlations between units; however, as the correlations between units may present different coupling strengths, we should decide whether and how to use the weight of edges to distinguish these coupling strengths. For instance, if a class extends another one, the coupling between these two classes is stronger than that between classes whose correlations only involve composition, dependency and so on.
However, as we have not yet decided how to use coupling strength in studying the software network, how to use the weight of edges is not discussed in this paper.

Nowadays, researchers mainly use the following software network models:

a) Function network: functions (methods) represent nodes in the network, and method call relations are extracted as links between nodes [11].
b) Class network: class modules in an object-oriented program represent nodes, and edges are added to the network when there exists coupling between classes. The most common way to generate a class network ignores the coupling strength between classes, that is, the weight of an edge is meaningless [12].
c) Package network: package modules represent nodes, and edges are added if there exist interchanges between packages.

It is easy to transform a network model from a function network into a class-collaborative network and a package network, as shown in Figure 1 [13].

Figure 1. Correlations of different types of software network.

2.2 Advantages and disadvantages of class-collaborative network

2.2.1 Advantages of class-collaborative network


Considering the two standards of classification for software network models, we summarize common software networks in Table 1.

Table 1. Classification of software networks

                       Function unit              Class unit                    Package unit
Static generation      Static-function network    Class-collaborative network   Static-package network
Dynamic generation     Dynamic-function network   Object-depending network      Dynamic-package network

Researchers tend to use the code of software to generate the static software network model, as the building process is easy and the static network has many advantages in comparison with the dynamic network model. Among the three common static network models, the class-collaborative network is the most popular.
On one hand, by using this model researchers can easily combine the theory of Complex Network with previous achievements. Since researchers have studied software deeply at the level of classes, many ways of software optimization have been proposed. For example, to improve the extensibility of programs by removing coupling between classes, many classical design patterns have been proposed; to measure whether a program has been properly designed, software metric suites, such as the MOOD suite and the C&K method, have been designed at the class level.
On the other hand, the other models have fatal defects. For example, a package module is too big if treated as a node, and the correlations between packages are too rough to reflect the software structure. If we choose functions as nodes, the network scale becomes unbelievably large as there exist too many nodes, and calculations on such a network are time-consuming.
For these reasons, we believe that the class-collaborative network is the ideal model for research on software networks. Reasonable as it seems, as a static network model the class-collaborative network inherits the weaknesses of static network models and is defective.

2.2.2 Disadvantages of class-collaborative network


To show the disadvantages of the class-collaborative network, we use the network corresponding to a Java program as an example.
In software written in Java, programming to interfaces is a respected way to decouple the correlations between elements and thus can be used to improve the maintainability and extensibility of the software. Almost every classical design pattern uses this technique. Obviously, programming to interfaces leads to dynamic binding when the system runs, in most cases.
To show that some execution paths corresponding to code that embodies programming to interfaces are missing from the class-collaborative network, we use a simple example.
As shown in Figure 2, the simple network on the right is the class-collaborative network model corresponding to the code on the left. From the code, we can easily see that classes C, D and E all implement interface B, and class A uses methods of an implementation of B; that is, C, D and E may transmit information to A. However, from the corresponding network, we cannot see that A is connected with C, D and E.
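To make this concrete, the following minimal Python sketch (our own illustration; the class names simply mirror Figure 2) shows why a static scan of A only reveals a reference to the interface B, while the concrete implementation that actually sends information to A is bound only at run time:

import random

class B:                          # the interface
    def serve(self):
        raise NotImplementedError

class C(B):                       # three interchangeable implementations of B
    def serve(self):
        return "data from C"

class D(B):
    def serve(self):
        return "data from D"

class E(B):
    def serve(self):
        return "data from E"

class A:
    def __init__(self, b: B):     # statically, A only references the interface B
        self.b = b
    def run(self):
        return self.b.serve()     # dynamic binding decides whether C, D or E affects A

a = A(random.choice([C(), D(), E()]))   # the edge C/D/E -> A appears only at run time
print(a.run())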
To further explain how missing information about dynamic binding leads to the loss of many execution paths in the class-collaborative network, we use code containing the Decorator Pattern as an example.

Figure 2. The class-collaborative network corresponding to programming to interfaces

In object-oriented programming, the Decorator Pattern is a design pattern that allows behavior to be added to an individual object, either statically or dynamically, without affecting the behavior of other objects from the same class. In Figure 3, the class 'ConcreteDecorator', as a subclass of 'Decorator', can wrap any class that implements the interface 'Component'. That is, 'ConcreteDecorator' may depend on any class as long as it is an implementation of the interface 'Component'. However, in the corresponding class-collaborative network of the Decorator Pattern, many of the edges just analyzed are not included. These missing edges include:
<ConcreteComponent, ConcreteDecorator1>,
<ConcreteComponent, ConcreteDecorator2>,
<ConcreteComponent, Decorator>,
<ConcreteDecorator1, ConcreteDecorator2>,
<ConcreteDecorator2, ConcreteDecorator1>,
<ConcreteDecorator1, ConcreteDecorator1>,
<ConcreteDecorator2, ConcreteDecorator2>.

These missing edges trigger a chain reaction, as all information spreading paths that include them are lost.

Figure 3. Decorator Pattern and its corresponding class-collaborative network

3 The correction of class-collaborative network

3.1 Necessity of modifying the class-collaborative network

Clearly, the missing edges and information spreading paths have a negative effect when the class-collaborative network is used to study the propagation characteristics of a real system. For instance, many properties of the software network concerning route traversal will be affected.
• Betweenness: Betweenness is a centrality measure of a vertex within a graph (there is also edge betweenness, which is not discussed here). It quantifies the number of times a node acts as a bridge along the shortest path between two other nodes.
• Spreading range: For each node, this property measures the scale of its influence area. The spreading range of a node contains all nodes that can be reached from it through a path.
• Eigenvector centrality: As a measure of the influence of a node in a network, it assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes. Google's PageRank is a variant of the eigenvector centrality measure.
• Cross-clique centrality: The cross-clique centrality of a single node in a complex graph determines the connectivity of that node to different cliques. A node with high cross-clique connectivity facilitates the propagation of information or disease in a graph.
According to the analysis above, before using the class-collaborative network for further research it is necessary to let the network model contain the edges generated by dynamic binding.
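Since spreading range is the property compared in chapter 4, a minimal sketch of how it can be computed from a directed edge set is given below; the helper is our own illustration, not the authors' tool, and the example edges are read off the original network of sample 1 (Figure 6, Table 5):

def spreading_range(edges, node):
    """Return every node reachable from `node` along directed edges,
    i.e. the spreading range of `node`."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
    reached, stack = set(), [node]
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, ()):
            if v not in reached:
                reached.add(v)
                stack.append(v)
    return reached

# Original class-collaborative network of sample 1 (Figure 6 / Table 5)
edges = {("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")}
print(spreading_range(edges, "B"))   # {'C', 'D', 'E'}, matching Table 5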

3.2 The core idea of modifying the class-collaborative network

Since the class-collaborative network has the defect of missing execution paths generated by dynamic binding, it is essential to add these missing paths to the current network model without adding wrong information spreading paths at the same time.
As mentioned in the last chapter, a dynamic network, such as the object-depending network, can cover execution paths generated by dynamic binding to some extent. Thus, some researchers use the object-depending network, or a combination of the object-depending network and the class-collaborative network, to obtain as many execution paths as possible. However, since the coverage of execution paths in a dynamic network relies heavily on what the executed test cases generate, these approaches can hardly obtain the execution paths (including those generated by dynamic binding) entirely.
In code that uses programming to interfaces, the relationships between classes mainly include inheritance (generalization) and dependency (combination, association). Only if these two types of relations exist simultaneously can polymorphism be present in such code.
In Figure 4, node A represents an interface or abstract class which depends on class D, node C represents a class that implements A, and B represents a class that depends on A. From this picture we can see that D can affect A directly, and can affect B and C by means of A. However, C cannot affect B, as there does not exist any edge starting from C in Figure 4. In fact, in an inheritance system, every node in an inheritance tree presents the same functionality to nodes outside the tree. That is, A and C present the same functionality to node B. Thus, C can affect B, and an edge should be added between C and B to show their relationship of dependency.
Using Figure 4 as an example, we see that the key to correcting the class-collaborative network is to modify the edges of the non-root nodes in inheritance trees. Only by modifying the topological structure of these nodes and letting them present the functionality of their parent nodes in the class-collaborative network can execution paths generated by dynamic binding be included in such a static network.

Figure 4. A simple network representing programming to interfaces

To achieve this purpose, we referred to the "node-copy model" used by researchers studying the growth and evolution rules of complex networks. In studying such rules for the Web, Kleinberg et al. believed that this evolution is implemented by the copying of subgraphs, and they supplied an algorithmic description of the growth of the Web model. During the growth of such a model, on one hand, nodes and edges are added randomly; on the other hand, some nodes copy the edges of other nodes [14].
For subclasses in software, the functionality inherited from the superclass can also be regarded as functionality copied from the superclass. Thus, we think that non-root nodes should partially copy the topological structure of their parent nodes when added to the class-collaborative network. That is, there should be edges between the non-root nodes in an inheritance tree and the nodes that depend on the superclasses of these non-root nodes.

3.3 The procedure of modifying the class-collaborative network

On the basis of the analysis above, we design a method called "node-copy" to correct the class-collaborative network.
We build two tables to store information about the relationships between classes in the software. The table "Dependency-edge" stores edges representing dependency (association, aggregation), and the table "Inheritance-edge" stores edges representing inheritance and generalization. All coupling information between classes can be obtained by using the tool "Dependency Finder" to scan the code of the software. In addition, the table "Nodes registered" records, for every node, whether the operation of adding edges has already been executed on it. The structures of these tables are shown in Tables 2, 3 and 4.

Table 2. Dependency-edge table

Nodes   Dependent nodes
X       Y (Y depends on X), Z, …
…       …

Table 3. Inheritance-edge table

Nodes   Parent nodes
X       W (X inherits or implements W), V, …
…       …

Table 4. Nodes registered table

Nodes   Whether the operation of adding edges has been executed on this node
X       false or true
…       …

After this preparation, the class-collaborative network consists only of isolated nodes and includes no edges. Correspondingly, all nodes in the "Nodes registered" table are tagged "false", which means that we have not yet added edges to any of these nodes.
For each node to which edges have not been added, we execute the adding-edges operation (this operation can lead to a recursion of the adding-edges operation on its parent nodes).
Firstly, we choose a node, say 'A', from the "Nodes registered" table that is tagged "false". Then we get all direct parent nodes of 'A' as a set by using the Inheritance-edge table. If some node in this set has not had edges added either, we execute the adding-edges operation on that node before adding edges to 'A' (this is the cause of the recursion).
When all parent nodes of 'A' are tagged "true" in the "Nodes registered" table (i.e. the adding-edges operation has been executed on all parent nodes), we get all nodes that depend on the parent nodes by using the "Dependency-edge" table and add these nodes to the row corresponding to 'A' in the "Dependency-edge" table (that is, we let these nodes depend on 'A').
After the operation above, we add edges for node 'A' in two steps:
–– For every parent node of 'A', we build an edge starting from that node and pointing to 'A'.
–– Using the "Dependency-edge" table, we get all nodes that depend on 'A' as a set. Then we add edges starting from 'A' and pointing to every node in this set.

After these operations, we change the tag of 'A' in the "Nodes registered" table to "true", which indicates that we have finished adding edges for node 'A'.
For every node that is tagged "false" in the "Nodes registered" table, we apply the operations above, until all nodes are tagged "true". A sketch of this procedure is given below.
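The following is a minimal Python sketch of the node-copy procedure, assuming the two coupling tables have already been extracted (for example from the Dependency Finder output); the dictionary layout and the recursive helper are our own illustration, not the authors' implementation:

from collections import defaultdict

def node_copy_edges(dependents, parents):
    """dependents[X]: nodes that depend on X ("Dependency-edge" table).
    parents[X]: direct superclasses / interfaces of X ("Inheritance-edge" table).
    Returns the corrected edge set; an edge (u, v) means u can affect v."""
    nodes = set(dependents) | set(parents)
    for group in list(dependents.values()) + list(parents.values()):
        nodes |= set(group)

    registered = {n: False for n in nodes}          # "Nodes registered" table
    deps = defaultdict(set, {k: set(v) for k, v in dependents.items()})
    edges = set()

    def add_edges(a):
        if registered[a]:
            return
        registered[a] = True
        for p in parents.get(a, ()):
            add_edges(p)                            # parents are handled first (recursion)
            deps[a] |= deps[p]                      # node copy: a inherits p's dependents
            edges.add((p, a))                       # edge from every parent to a
        for d in deps[a]:
            edges.add((a, d))                       # edge from a to every node depending on a

    for n in nodes:
        add_edges(n)
    return edges

With this sketch, the self-loop on D and the mutual edges between D and E discussed for sample 2 in chapter 4 emerge automatically from the copy step.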

4 Experiments

To show that the proposed "node-copy" method can effectively correct the class-collaborative network, we design a couple of software modules, each of which reflects programming to interfaces. For every module, we build its original class-collaborative network and its corrected class-collaborative network, the latter using our "node-copy" method. For every node, we calculate its spreading range in both network models. For each piece of software, by comparing the difference in the spreading range of each node between the corresponding network models, we can judge whether the modification of the class-collaborative network is valid.

4.1 The correctness of our modification on class collaborative network

As shown in Figure 5, there are four classes and one interface. The interaction of <class
A, interface B> and <interface B, class C> belongs to dependency. The interaction of
<class D, interface B> and <class E, class D> belongs to inheritance.

Figure 5. Class diagram of sample 1

From Figure 6, we can obtain the spreading range of each node, as shown in Table 5. If dynamic binding were considered, the spreading ranges of D and E should also include C, which is not the case in Table 5.

Figure 6. The original class collaborative network of sample 1



Table 5. The spreading range of each node in the original class-collaborative network

Node   Spreading range
A      B, C, D, E
B      C, D, E
C      null
D      E
E      null

By using the "node-copy" method described in the last chapter, the process of building the modified class-collaborative network of sample 1 is shown in Figure 8.

Figure 7. Class diagram of sample 2

The spreading range of each node in the correctional network is shown in Table 6.

Table 6. The spreading range of each node in the modified network

Node   Spreading range
A      B, C, D, E
B      C, D, E
C      null
D      C, E
E      C

From Table 6, we can see that after using the "node-copy" method to modify the class-collaborative network of sample 1, the spreading ranges of both D and E contain C. That is, the effect of dynamic binding is captured by the modified network. From this network we can also obtain the information spreading paths shown in Table 7. According to our analysis of sample 1, all paths in Table 7 are correct and cover the transmission routes between any two classes completely.

Table 7. The information spreading paths of sample 1 in the modified network

Information spreading paths
A→B→C, A→B→D, A→B→D→E, A→B→D→C, A→B→D→E→C,
B→C, B→D, B→D→E, B→D→C, B→D→E→C,
D→E, D→C, D→E→C, E→C

In sample 1, the relation between any two classes is simple and single. However, in most software systems, the coupling between some classes is complex.
In Figure 7, the interactions of <class A, interface B> and <interface B, class C> belong to dependency. The interactions of <class D, class E> and <class F, class D> belong to inheritance. In particular, the interaction of <interface B, class D> includes both dependency and inheritance.
The class-collaborative network corresponding to Figure 7 is shown in Figure 9. By using our "node-copy" method, the process of building the modified class-collaborative network is shown in Figure 10.
The modification of sample 2 is special in steps 3 to 5 compared with the modification of sample 1. In step 3, we add a self-loop for D, which is due to the composite correlation between B and D. Since D is a child node of B, we add an edge <B, D>. Moreover, D also depends on B, which means that we should add edges from all child nodes of B to D; thus, the self-loop is added for D. For the same reason, we add an edge <E, D> in step 4. And since D is the parent node of E, the edge <D, E> also exists; thus, D and E depend on each other. Adding edges for F in step 5 is similar to E.
Furthermore, by using our "node-copy" method, the corresponding network model of the Decorator Pattern shown in Figure 3 becomes the one shown in Figure 11.
From this network model, we can obtain the spreading range of each node, as shown in Table 8.
As described above, in the Decorator Pattern every object of "Decorator" can decorate any object that implements the interface "Component", so there may exist correlations of dependency between every child node of "Decorator" and every child node of "Component". Thus, the spreading range of every node in Table 8 is reasonable, and our modification of the software network corresponding to the Decorator Pattern is correct.

Table 8. The spreading range of each node in the modified network of the Decorator Pattern

Node                 Spreading range
ConcreteComponent    Decorator, ConcreteDecorator1, ConcreteDecorator2
Component            ConcreteComponent, Decorator, ConcreteDecorator1, ConcreteDecorator2
Decorator            Decorator, ConcreteDecorator1, ConcreteDecorator2
ConcreteDecorator1   Decorator, ConcreteDecorator1, ConcreteDecorator2
ConcreteDecorator2   Decorator, ConcreteDecorator1, ConcreteDecorator2

Figure 8. The process of generating the correctional network for sample 1 (steps 1-4)

Figure 9. The original class collaborative network of sample 2



Figure 10. The process of generating the correctional network for sample 2 (steps 1-5)



Figure 11. The modified class-collaborative network corresponding to Decorator Pattern

4.2 The significance of our modification on class collaborative network

The samples above show that our "node-copy" method can correctly modify the original class-collaborative network model by adding edges that represent execution paths generated by dynamic binding. To further emphasize the effect of our method, we use the software "log4j-1.2.8" as an example. "log4j" is a tool widely used to record logs in Java programming. The process of our experiment on this software is as follows.
1. The information about the structure of "log4j-1.2.8" is extracted using the tool "Dependency Finder".
2. We build the original class-collaborative network of "log4j-1.2.8" and its modified network, respectively.
3. We use "Gephi", a visualization tool specializing in social network analysis, to generate images corresponding to the network models in step 2.

The differences between the original network and the modified network are shown in Figure 12. From Figure 12, we can see that the modified network is similar to the original one in rough structure. However, the modified network is tighter than the original one, as more nodes are closely linked after adding the edges representing dynamic binding. The statistical information of both network models is shown in the figures and tables that follow.
From Table 9, we can see that after the modification of the "log4j" network model, more edges representing dynamic binding are added to the network, which leads to increases in the diameter, the density, the average out-degree and the average in-degree. In particular, the increase of the diameter is caused by the fact that, after edges are added, isolated nodes become connected, which expands the maximum connected distance of the network.

Figure 12. Differences of original network and modified network

Table 9. The main statistical data of the "log4j" network before and after modification

Statistical index               Original network   Modified network
Count of nodes                  228                228
Count of edges                  595                1287
Diameter                        6                  8
Density                         0.011              0.025
Average out-degree/in-degree    2.61               5.6
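For reference, the density and the average out-degree in Table 9 follow directly from the node and edge counts of a directed network; a small sketch (using the counts reported in Table 9) is:

def basic_stats(num_nodes, num_edges):
    """Density and average out-degree of a directed network."""
    density = num_edges / (num_nodes * (num_nodes - 1))
    avg_out_degree = num_edges / num_nodes
    return round(density, 3), round(avg_out_degree, 2)

print(basic_stats(228, 595))    # (0.011, 2.61)  original "log4j" network
print(basic_stats(228, 1287))   # (0.025, 5.64)  modified network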

The changes in the distributions of in-degree, out-degree and spreading range after the modification of the class-collaborative network corresponding to "log4j" are shown in Figures 13, 14 and 15, respectively.
From Figure 13, we can see the differences in in-degree after modification: 1) the in-degree of most nodes in the network remains the same; 2) the in-degree of 40 nodes in the network increases by 1 or 2 after modification; 3) the in-degree of a few nodes changes dramatically, and there exists a node whose in-degree increases by 35.
The analysis of the differences in out-degree is similar.
From Figure 15, we can see that although the spreading range of half of the nodes remains the same, the distribution of the change in spreading range for the remaining nodes is close to a normal distribution. In this figure, the spreading range of up to 40 nodes increases by 117, and the spreading range of the most heavily affected node increases by 177. Thus, our modification of the class-collaborative network of "log4j" dramatically changes its network topology and has a significant influence on the study of its propagation characteristics. By using our node-copy method, the modified network model of "log4j" includes the correlations obtained by both static code analysis and dynamic binding, and can be used to evaluate the global influence of each node effectively.
In this chapter, we have used four samples to show that our node-copy method can effectively modify the class-collaborative network and let it contain the information spreading paths generated by dynamic binding. To ensure that our method is correct as well as meaningful, the samples we use contain both simple and complicated correlations and are of different scales. The changed spreading range of each node sufficiently proves the correctness of our method.

Figure 13. The distribution of difference on in-degree

Figure 14. The distribution of difference on out-degree



Figure 15. The distribution of difference on spreading range

5 Conclusions

As the basic feature of software, complexity has seriously restricted the control, measurement and maintenance of software, which leads to serious vulnerabilities and thus needs our attention. By using Complex Network theory, researchers have successfully obtained some critical information about software complexity.
As the first step in using Complex Network theory, building the network model corresponding to the software is of great importance. However, the current types of network model commonly present obvious disadvantages, as stated in chapter 2, which can seriously affect the analysis of the global importance of each node. For instance, although the class-collaborative network has several advantages, it misses the correlations between nodes generated by dynamic binding.
Thus, we design a method called node-copy to modify the current class-collaborative network. By adding edges to the network according to a few simple rules, our modified network model does contain the correlations between nodes generated by dynamic binding, as proved with the four samples in chapter 4. Since the modified network model reflects the correlations between nodes (the structure of the software) accurately and completely, it can be used to study the propagation characteristics of software and to evaluate the importance of each node to the whole network.

Acknowledgment: This work was supported by the National Key Research and Development Project (Grant No. 2016YFB0800700).

References
[1] Okutan, Ahmet, and O. Taner Yıldız. “Software defect prediction using Bayesian
networks.” Empirical Software Engineering 19.1 (2014): 154-181.
[2] M. Yutao, et al. “A complexity metrics set for large-scale object-oriented software systems.” The
Sixth IEEE International Conference on Computer and Information Technology (CIT’06). IEEE,
2006.
[3] Gurtner, Gérald, et al. “Multi-scale analysis of the European airspace using network community
detection.” PloS one 9.5 (2014): e94414.
[4] Birnbaum, Robert, et al. “The Innovator’s Dilemma: When New Technologies Cause Great Firms
to Fail.” (2005): 80-84.
[5] Ciafrè, Silvia Anna, and Silvia Galardi. “microRNAs and RNA-binding proteins: a complex
network of interactions and reciprocal regulations in cancer.” RNA biology 10.6 (2013):
934-942.
[6] Sporns, Olaf. “The human connectome: a complex network.” Annals of the New York Academy
of Sciences 1224.1 (2011): 109-125.
[7] Cai, Kai-Yuan, and Bei-Bei Yin. “Software execution processes as an evolving complex
network.” Information Sciences 179.12 (2009): 1903-1928.
[8] Gao, Song, and Chunping Li. “Complex network model for software system and complexity
measurement.” Computer Science and Information Engineering, 2009 WRI World Congress on.
Vol. 7. IEEE, 2009.
[9] Zhang, Xizhe, et al. “Analysis on key nodes behavior for complex software
network.” International Conference on Information Computing and Applications. Springer
Berlin Heidelberg, 2012.
[10] Csardi, Gabor, and Tamas Nepusz. “The igraph software package for complex network
research.” InterJournal, Complex Systems 1695.5 (2006): 1-9.
[11] Ma, James, Daniel Zeng, and Huimin Zhao. “Modeling the growth of complex software function
dependency networks.” Information Systems Frontiers 14.2 (2012): 301-315.
[12] Gao, Song, and Chunping Li. “Complex network model for software system and complexity
measurement.” Computer Science and Information Engineering, 2009 WRI World Congress on.
Vol. 7. IEEE, 2009.
[13] Pan, Weifeng, et al. “Multi-granularity evolution analysis of software using complex network
theory.” Journal of Systems Science and Complexity 24.6 (2011): 1068-1082.
[14] Kleinberg, Jon M., et al. “The web as a graph: measurements, models, and
methods.” International Computing and Combinatorics Conference. Springer Berlin Heidelberg,
1999.
Wei-ping WANG*, Fang LIU
A Linguistic Multi-criteria Decision Making Method
based on the Attribute Discrimination Ability
Abstract: Attribute weights play the most important role in linguistic multi-criteria decision making problems. Based on the linguistic evaluation information of every attribute of every scheme, the ability of each attribute to distinguish and evaluate schemes can be inferred. This paper proposes the concept of attribute discrimination ability, which can be used to determine the weights of attributes. A decision making method based on attribute discrimination ability is also given; it is feasible and reduces the dependence on the accuracy of the score function.

Keywords: linguistic information; multi-criteria decision-making; attribute discrimination ability

1 Introduction

It has been pointed out [1] that, as a generalization of fuzzy sets [2], the notion of vague sets proposed by Gau and Buehrer [3] is the same as that of intuitionistic fuzzy sets presented by Atanassov [4]. The notions of membership function and non-membership function can reflect the level of people's understanding comprehensively in three aspects: support degree, negative degree and uncertainty degree.
Vague sets and intuitionistic fuzzy sets have been applied to simulate human decision-making processes and other activities requiring human expertise and knowledge [5-8], which are inevitably imprecise or not totally reliable.
In linguistic multi-criteria decision making problems, because of the complexity and fuzziness of objective things, it is difficult to define the attribute weights used in the evaluation of the scheme set; that is, the weights of the attributes are unknown. However, according to the linguistic attribute values given by the evaluators, we can, to a certain extent, infer their understanding of the different attributes and thus determine the weights of the attributes.
For linguistic multi-criteria decision making problems in which the attribute weights are completely unknown, this article puts forward an approach to measuring attribute discernibility and proposes a decision making method based on attribute discrimination ability.

*Corresponding author: Wei-ping WANG, Dept. of Public Administration, University of International Relations, Beijing, China, E-mail: weipw@sina.com
Fang LIU, School of Mechanical and Electrical Engineering, Beijing Information Science & Technology University, Beijing, China

2 Problem description

In the linguistic multi-criteria decision making problem considered here, the attribute weights are completely unknown.
Suppose that A = {A_1, A_2, …, A_m} is a decision scheme set, C = {C_1, C_2, …, C_n} is an attribute set, and R = {r_ij} is a linguistic assessment matrix, where r_ij is the linguistic attribute value of the scheme A_i on attribute C_j.
In this paper, the eleven-level linguistic index is used, namely "absolutely good (AG), very good (VG), good (G), fairly good (FG), slightly good (SG), median (M), slightly poor (SP), fairly poor (FP), poor (P), very poor (VP), absolutely poor (AP)".

3 Analysis of the influence of linguistic attribute values in the scheme optimization

Using linguistic attribute values to determine attribute weights should consider the
following two aspects.

3.1 The number of linguistic evaluation indexes used

Generally, there are two main reasons why different schemes may have the same value on a certain attribute. One possibility is that these schemes are indeed roughly the same on this attribute. Another possibility is the evaluator's limited understanding of the attribute: he or she cannot compare these schemes according to this attribute alone. In either case, the final decision maker should give a smaller weight to this attribute. For example, if all the schemes are the same on a certain attribute, the attribute plays no role in scheme comparison and should not be considered, i.e. it gets zero weight.
Conversely, when schemes are evaluated according to an attribute, the more indicators are used, the stronger the ability of the attribute to distinguish schemes. For example, using the five linguistic indicators "fairly good (FG), good (G), median (M), poor (P) and very poor (VP)" reflects the differences of the schemes on the attribute much more exactly than using only the two indicators "median (M)" and "good (G)". In particular, if all the schemes' values on an attribute are different, then all the schemes can be distinguished by this attribute. The final decision maker should pay more attention to this attribute and give it a greater weight.

3.2 The biggest difference in linguistic attribute values

Even if the number of indicators used in assessing two attributes is the same, the discrimination abilities of these two attributes are not necessarily the same. For example, suppose attribute C1 is evaluated with the two indicators "good" and "fairly good", while attribute C2 is evaluated with the two indicators "absolutely good" and "poor". Obviously, the gap between "absolutely good" and "poor" is greater than that between "good" and "fairly good". Therefore, the difference of the schemes on attribute C2 can be considered greater than that on attribute C1, and it is reasonable to assign a larger weight to attribute C2.
To sum up, the number of evaluation indicators used to evaluate an attribute can reflect, but does not determine, the attribute discrimination ability. The real determinant is the maximum difference among the assessment values of the attribute.

4 Attribute Discrimination Ability

In the following discussion, the definition of attribute discrimination ability is given first, and based on it an approach to linguistic multi-criteria decision making is proposed.

Definition 1 Let X be a space of points (objects), with a generic element of X denoted by x. A vague set A in X is characterized by a truth-membership function t_A and a false-membership function f_A. t_A(x) is a lower bound on the grade of membership of x derived from the evidence for x, and f_A(x) is a lower bound on the negation of x derived from the evidence against x. Both t_A(x) and f_A(x) associate a real number in the interval [0, 1] with each point in X, where t_A(x) + f_A(x) ≤ 1. That is,

t_A : X \to [0,1], \quad f_A : X \to [0,1].

This approach bounds the grade of membership of x in the vague set A to a subinterval [t_A(x), 1 − f_A(x)] of [0, 1]. The uncertainty degree (hesitation degree) of x with respect to A can be evaluated by the uncertainty function π_A(x):

\pi_A(x) = 1 - t_A(x) - f_A(x),

where π_A(x) ∈ [0, 1].

Definition 2 For a numerical vague value x = [t, 1 − f], S = t − f is called the risk-neutral score function.

Definition 3 The vague values in Table 1 correspond to the eleven-level linguistic index, and the corresponding score function values are also given.

Table 1. Vague values and score function values corresponding to the eleven-level linguistic indexes

Grade   Vague value    Score function value
AG      [1, 1]         1
VG      [0.9, 0.95]    0.85
G       [0.8, 0.9]     0.7
FG      [0.7, 0.85]    0.55
SG      [0.55, 0.7]    0.25
M       [0.4, 0.6]     0
SP      [0.4, 0.55]    -0.05
FP      [0.3, 0.45]    -0.25
P       [0.2, 0.3]     -0.5
VP      [0.1, 0.15]    -0.75
AP      [0, 0]         -1

Definition 4 In linguistic multi-criteria decision making problems with completely unknown attribute weights, the discrimination ability of attribute C_j is defined as

D_j = \max_{1 \le i \le m} S(v_{ij}) - \min_{1 \le i \le m} S(v_{ij}),   (1)

where v_ij is the vague value converted from the linguistic attribute value of scheme A_i on attribute C_j, and S is the score function selected by the final decision maker.

5 Decision making method based on attribute discrimination ability

STEP 1 Convert the linguistic values r_ij of each scheme to vague values v_ij. Select an appropriate score function S and calculate the score S(v_ij) of each attribute value. In this paper, we choose the risk-neutral score function S = t − f.

STEP 2 Calculate the discrimination ability D_j of each attribute using formula (1), and use it to assign the weight ω_j of each attribute, where

\omega_j = D_j \Big/ \sum_{k=1}^{n} D_k, \quad j = 1, 2, \ldots, n.   (2)

STEP 3 Calculate the weighted score function value of each scheme, where

S(A_i) = \sum_{j=1}^{n} \omega_j S(v_{ij}), \quad i = 1, 2, \ldots, m.   (3)

STEP 4 Compare the schemes and select the optimal one according to S(A_i).
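A compact sketch of Steps 1-4 is given below, assuming the eleven-level scale and the risk-neutral score function of Table 1; the function and variable names are ours and are used for illustration only:

# Score function values S = t - f of the eleven-level indexes (Table 1)
SCORE = {"AG": 1.0, "VG": 0.85, "G": 0.7, "FG": 0.55, "SG": 0.25, "M": 0.0,
         "SP": -0.05, "FP": -0.25, "P": -0.5, "VP": -0.75, "AP": -1.0}

def rank_schemes(assessment):
    """assessment: m rows (schemes), each a list of n linguistic labels."""
    scores = [[SCORE[label] for label in row] for row in assessment]        # STEP 1
    discrimination = [max(col) - min(col) for col in zip(*scores)]          # formula (1)
    total = sum(discrimination)
    weights = [d / total for d in discrimination]                           # formula (2)
    final = [sum(w * s for w, s in zip(weights, row)) for row in scores]    # formula (3)
    order = sorted(range(len(final)), key=lambda i: final[i], reverse=True) # STEP 4
    return weights, final, order

Applied to the assessment in Table 2 below, this reproduces the weights of Table 3 and the final values of Step 3 (for instance S(A1) ≈ 0.29), and ranks A3 first.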



6 Case Analysis

Example: Evaluate the knowledge management performance of four enterprises in a special economic zone. There are 15 main evaluation indices, shown below:
C1 : Customer profitability;
C2 : Customer satisfaction rate;
C3 : The proportion of large customers;
C4 : Sales per customer;
C5 : Proportion of repeat orders and loyal customers;
C6 : Internal structure investment;
C7 : Investment in information technology;
C8 : The proportion of staff support;
C9 : Turnover rate of staff;
C10 : Support staff qualifications and the proportion of Lodge;
C11 : Knowledge employees’ seniority;
C12 : The education level of staff;
C13 : Proportion of knowledge workers;
C14 : Knowledge worker per capita profit;
C15 : Knowledge worker qualifications.
The assessment of the four enterprises with the eleven linguistic indicators is shown in Table 2.

Table 2. Linguistic assessment form

Enterprise   c1   c2   c3   c4   c5   c6   c7   c8
A1           FG   FG   M    M    M    G    G    VG
A2           G    M    FP   M    G    VG   G    FG
A3           FG   G    VG   VG   FG   FG   FG   G
A4           VG   G    G    M    G    M    G    G

Enterprise   c9   c10  c11  c12  c13  c14  c15
A1           FG   M    M    FG   FG   M    SP
A2           SP   M    M    SP   SP   M    M
A3           FG   M    SP   G    FG   FG   FG
A4           M    G    M    FG   M    G    FG

Step 1 Convert the linguistic attribute values to vague values according to Table 1. Assuming that the decision maker chooses the risk-neutral score function S = t − f, calculate each score function value, as shown in Table 3.

Table 3. The scores of the linguistic attribute indicators and the attribute discrimination abilities

Enterprise   c1      c2      c3      c4      c5      c6      c7      c8
A1           0.55    0.55    0       0       0       0.7     0.7     0.85
A2           0.7     0       -0.25   0       0.7     0.85    0.7     0.55
A3           0.55    0.7     0.85    0.85    0.55    0.55    0.55    0.7
A4           0.85    0.7     0.7     0       0.7     0       0.7     0.7
Dj           0.3     0.7     0.85    0.85    0.7     0.85    0.15    0.3
ωj           0.034   0.080   0.098   0.098   0.080   0.098   0.017   0.034

Enterprise   c9      c10     c11     c12     c13     c14     c15
A1           0.55    0       0       0.55    0.55    0       -0.05
A2           -0.05   0       0       -0.05   -0.05   0       0
A3           0.55    0       -0.05   0.7     0.55    0.55    0.55
A4           0       0.7     0       0.55    0       0.7     0.55
Dj           0.6     0.7     0.05    0.75    0.6     0.7     0.6
ωj           0.069   0.080   0.006   0.086   0.069   0.080   0.069

Step 2 According to formulas (1) and (2), calculate the attribute discrimination abilities D_j, from which the attribute weights ω_j shown in Table 3 are obtained:
ω = (0.034, 0.080, 0.098, 0.098, 0.080, 0.098, 0.017, 0.034, 0.069, 0.080, 0.006, 0.086, 0.069, 0.080, 0.069).

Step 3 According to formula (3), the final evaluation values S(A_i) of the schemes are calculated using the attribute weights and scores obtained above:
S(A1) = 0.292, S(A2) = 0.183, S(A3) = 0.590, S(A4) = 0.442.

Step 4 Obviously, S(A3) > S(A4) > S(A1) > S(A2), that is, A3 ≻ A4 ≻ A1 ≻ A2. So the knowledge management performance of enterprise A3 is the best.

7 Conclusion

In multi-criteria decision making problems, the determination of attribute weights is of central importance. This paper proposed a way to measure the ability of an attribute to distinguish the schemes according to the linguistic evaluation information given by the evaluators, and a method that assigns larger weights to attributes with stronger discrimination abilities. This method is feasible and effectively reduces the dependence on the score function in the decision making process.

Acknowledgment: This paper is supported by the special funds of the basic scientific
research funds of University of International Relations.

References
[1] H. Bustince, P. Burillo, “Vague sets are intuitionistic fuzzy sets”, Fuzzy Sets and Systems, 79(3),
1996, pp.403-405. doi:10.1016/0165-0114(95)00154-9
[2] Zadeh L A., “Fuzzy sets”, Information and Control, vol.8, 1965, pp. 339-353.
[3] Gau WL, Buehrer D J.”Vague sets”, IEEE transactions on systems, man, and cybernetics, vol. 23,
1993, pp.610 - 614.  
[4] Atanassov K T, “Intuitionistic fuzzy sets”, Fuzzy Sets and Systems, vol. 20, 1986, pp.87-96.
doi:10.1016/S0165-0114(86)80034-3
[5] K. Atanassov, G. Pasi, R.R. Yager, “Intuitionistic fuzzy interpretations of multi-person multi-
criteria decision making”, in: Proceedings of 2002 first international IEEE symposium Intelligent
Systems, Sept. 2002, pp.115-119. doi: 10.1109/IS.2002.1044238
[6] E. Szmidt, J. Kacprzyk, “Using intuitionistic fuzzy sets in group decision making”, Control and
Cybernetics, vol. 31, 2002, pp.1037–1053.
[7] Weiping Wang, Qi-zong Wu, Xiao-dong Hu, “New Score Functions Based on Risk Preference
in Vague Set Theory”, Proceedings of 2008 Chinese Control and Decision conference,
(2008CCDC), IEEE Press, Jul. 2008, pp. 2106-2111. doi: 10.1109/CCDC.2008.4597696
[8] Weiping Wang, “Linguistic Information Multi-criteria Decision Making Approach Based on Score
Functions”, Mathematics in Practice and Theory, vol. 43, Jul., 2013, pp. 98-103.
Part II: Computer Science and Information
Technology II
Gui-fu YANG, Xiao-yu XU, Wei-shuo LIU*, Cheng-lin PU, Lu YAO,
Jing-bo ZHANG, Zhen-bang LIU

The Reverse Position Fingerprint Recognition Algorithm


Abstract: In order to meet the special demand in the field of forensic science for indoor positioning without prior authorization of the target device, a new algorithm, the reverse position fingerprint recognition algorithm, is proposed in this paper. It converts the positioning mode from active positioning to passive positioning, so that when it is used for indoor positioning we can position the target quickly and accurately without authorization. The validity of the algorithm is verified by measurement experiments in a real environment. Moreover, both the positioning time and the measurement accuracy conform to the needs of forensic science.

Keywords: indoor positioning; reverse position fingerprint recognition algorithm

1 Introduction

Indoor positioning technology, as a new technology, is attracting more and more attention because of the increasing demand in the medical industry, disaster relief and other fields. There are a variety of ways to perform distance measurement, such as GPS [1], ultrasonic waves [2], infrared RSSI signal strength [3] and Wi-Fi wireless networks [4].
With the development of positioning technology, various positioning systems have emerged; among the more famous are A-GPS, the WaveLAN Wi-Fi positioning system [5] and the Active Bats positioning system [6]. These systems mainly use the triangulation algorithm [7] or the location fingerprint recognition algorithm [8]. However, both algorithms require the authorization of the target in advance in order to perform localization.
To meet the demand for indoor positioning with Wi-Fi wireless networks in the field of forensic science, the reverse position fingerprint recognition algorithm is proposed in this paper. On the basis of the fingerprint recognition algorithm, the new algorithm converts the positioning mode from active positioning to passive positioning.

*Corresponding author: Wei-shuo LIU, School of Computer Science and Information Technology, Northeast Normal University, Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun, Jilin, China, E-mail: liuws214@nenu.edu.cn
Gui-fu YANG, Cheng-lin PU, Jing-bo ZHANG, School of Computer Science and Information Technology, Northeast Normal University, Key Laboratory of Intelligent Information Processing of Jilin Universities, Changchun, Jilin, China
Xiao-yu XU, Material Evidence Identification Center of Jilin Provincial Public Security Department,
Jilin Provincial Public Security Bureau, Changchun, Jilin, China
Lu YAO, Changchun Railway Vehicles Co.,Ltd., CRRC, Changchun, Jilin, China
Zhen-bang LIU, Changchun Institute of Applied Chemistry Chinese Academy of Sciences Northeast
Chinese Academy of Sciences, Changchun, Jilin, China

2 Materials and Methods

2.1 The Relationship of RSSI and Distance

During the transmission of a wireless signal, the received signal power P_R and the transmitted power P_T satisfy the following formula:

P_R = P_T / r^{n}.   (1)

Here r is the distance between the transmitting end and the receiving end, and n is the propagation factor. When dBm is used as the unit of RSSI, this formula can be converted into the following form:

P_R(\mathrm{dBm}) = A - 10\, n \lg r,   (2)

where A is the received signal value at a distance of 1 m. It can be seen that, for a constant propagation factor n, the RSSI in dBm decreases linearly with the logarithm of the receiving distance. When the distance is small, the attenuation of the signal is more obvious; when the distance exceeds a certain value, the attenuation of the RSSI becomes slower and slower. There is thus a definite relationship between the RSSI and the signal transmission distance, and this relationship can be used for indoor and outdoor ranging and positioning in real environments [9].
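As a small illustration of formula (2), the distance can be estimated from a measured RSSI by inverting the log-distance model; the values of A and n below are placeholders that would have to be calibrated for the actual environment:

def distance_from_rssi(rssi_dbm, A=-40.0, n=3.0):
    """Invert formula (2): RSSI(dBm) = A - 10*n*lg(r), returning r in metres."""
    return 10 ** ((A - rssi_dbm) / (10 * n))

print(distance_from_rssi(-70.0))   # with A = -40 dBm and n = 3: about 10 m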

2.2 The reverse position fingerprint recognition algorithm

The reverse position fingerprint recognition algorithm is divided into two stages: off-line calibration and positioning.
1) Off-line calibration
The main work of this stage is to collect, for the target area, all the information that the positioning stage needs, and to use the collected information to build a database that maps this information to physical coordinates. The off-line calibration procedure consists of the following five steps.
a) Divide the target area (the wireless network coverage) in the form of a grid, whose nodes serve as the reference points of the database. All nodes of the grid are target positions at which the relevant information must be collected, so the size of the grid has a great impact on the database. If the grid is too fine, it not only increases the workload of data acquisition, because the amount of data to be measured multiplies, but also enlarges the database considerably without an obvious gain in accuracy. On the other hand, if the grid is too coarse, measurement accuracy is lost [12].
b) Select at least three sets of data acquisition equipment as observers and fix them at arbitrary positions in the target area. The observers (usually laptops) need to set their wireless network cards to promiscuous mode.

c) Place a mobile device in the area as a test target. When it is connected to the Wi-Fi network covering the area, all the observers obtain and record its RSSI values, then save these data (in array form) and the corresponding position information in the database.
d) Move the test target within the area, placing it at all node positions of the grid in order, so that the observers obtain the RSSI values corresponding to all nodes of the area. A location fingerprint database is formed, with one array for each reference point. The correspondence between location information and fingerprint information is shown in Table 1.

Table 1. The correspondence between location information and fingerprint information

Location information   Fingerprint information
(x1, y1)               (RSSI1, RSSI2, RSSI3, …, RSSIn)
(x2, y2)               (RSSI1, RSSI2, RSSI3, …, RSSIn)
(x3, y3)               (RSSI1, RSSI2, RSSI3, …, RSSIn)
……                     ……
(xn, yn)               (RSSI1, RSSI2, RSSI3, …, RSSIn)

Here (x1, y1) are the horizontal and vertical coordinates of a reference point, and the array (RSSI1, RSSI2, RSSI3, …, RSSIn) contains the signal strength values obtained by the observers.
e) According to the damping characteristics of the signal, optimize the database using the linear interpolation method [10], as sketched in the code example below. Firstly, determine the interval of the linear interpolation. Then, calculate the coefficient α using formula (3) or (4):

\alpha = (x - x_0) / (x_1 - x_0),   (3)
\alpha = (y - y_0) / (y_1 - y_0).   (4)

Finally, calculate the RSSI at this point using formula (5):

RSSI = RSSI_0 + \alpha (RSSI_1 - RSSI_0).   (5)

Using the linear interpolation method to optimize the database greatly increases the number of reference points. The interpolated information is basically consistent with the actual information, so it does not degrade positioning accuracy [11].
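A minimal sketch of this interpolation step, assuming the fingerprints of two calibrated grid points p0 and p1 are already in the database (the helper name is ours):

def interpolate_fingerprint(p0, p1, fp0, fp1, p):
    """Estimate the fingerprint at a point p lying between p0 and p1 (formulas (3)-(5)).
    fp0 and fp1 are the RSSI arrays recorded by the observers at p0 and p1."""
    if p1[0] != p0[0]:
        alpha = (p[0] - p0[0]) / (p1[0] - p0[0])      # formula (3), interpolate along x
    else:
        alpha = (p[1] - p0[1]) / (p1[1] - p0[1])      # formula (4), interpolate along y
    return [r0 + alpha * (r1 - r0) for r0, r1 in zip(fp0, fp1)]   # formula (5)

# e.g. halfway between two grid nodes:
print(interpolate_fingerprint((0, 0), (2, 0), [-50, -62, -71], [-54, -60, -75], (1, 0)))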

2) Positioning
The main work of this stage is to match the relevant information acquired during the actual positioning process with the information in the database, and thereby achieve the positioning of the target device.
Because of the multipath characteristics of the Wi-Fi channel, when the target device is located at different positions in the area, the characteristic parameters of the information acquired by the observers are distinct. If the set of characteristic parameters is exactly the same as, or very similar to, a parameter set in the database, the position of the target device can essentially be confirmed.
The Euclidean distance between two sets of characteristic parameters reflects their degree of similarity, when they are regarded as vectors in the same multidimensional space. The similarity coefficient can be calculated using the following formula:

r = \sqrt{ \sum_{i=1}^{n} ( RSSI_i' - RSSI_i )^2 },   (6)

where r is the similarity coefficient, n is the number of observers, RSSI_i is the characteristic parameter stored in the database and RSSI_i' is the characteristic parameter actually acquired. If r is close or equal to 0, the similarity of the two sets is high; otherwise it is low.
In an indoor space, because the area is limited, the characteristic parameter values of the reference nodes are close to each other, even for two points that are far apart. To avoid this situation, the algorithm is optimized and improved in the actual positioning process by calculating multiple point deviations.
Each reference point has a number of adjacent reference points around it. When the similarity coefficient between the collected information and a reference point's information is calculated, the deviation between the collected information and each adjacent point's stored information can also be calculated:

d = \sqrt{ \sum_{i=1}^{n} ( RSSI_i' - RSSI_i )^2 },   (7)

where d is the deviation value, n is the number of observers, RSSI_i is the characteristic parameter of the adjacent point in the database and RSSI_i' is the characteristic parameter actually acquired. The smaller the value of d is, the closer the corresponding points are.
After calculating the multiple point deviations, the calculation of the similarity coefficient can be optimized as

R = r + \sum_{i=1}^{n} d_i,   (8)

where R is the optimized similarity coefficient and n here is the number of neighboring nodes for which the deviation value is calculated. The smaller R is, the higher the similarity of the two sets.

Using this method, calculate and compare R between the collected data and each entry in the database. The position of the target is then given by the reference point whose parameter set has the highest similarity (smallest R).
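A possible Python rendering of this matching stage is sketched below; the database layout, the neighbour lists and the RSSI values are hypothetical, and the sketch simply evaluates formulas (6)-(8) for every reference point and returns the one with the smallest R.

```python
# A minimal sketch of the positioning stage, assuming each database entry stores
# the reference-point coordinates, its RSSI vector, and the keys of its adjacent
# reference points; all numbers are hypothetical.

def squared_distance(a, b):
    """Sum of squared RSSI differences, as in formulas (6) and (7)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def locate(measured, database, neighbours):
    best_point, best_R = None, float("inf")
    for point, rssi in database.items():
        r = squared_distance(rssi, measured)                      # formula (6)
        deviations = [squared_distance(database[nb], measured)    # formula (7)
                      for nb in neighbours.get(point, [])]
        R = r + sum(deviations)                                   # formula (8)
        if R < best_R:
            best_point, best_R = point, R
    return best_point, best_R

database = {
    (0.0, 0.0): [-40.0, -55.0, -62.0],
    (1.0, 0.0): [-42.0, -53.0, -61.0],
    (2.0, 0.0): [-44.0, -51.0, -60.0],
}
neighbours = {
    (0.0, 0.0): [(1.0, 0.0)],
    (1.0, 0.0): [(0.0, 0.0), (2.0, 0.0)],
    (2.0, 0.0): [(1.0, 0.0)],
}
print(locate([-40.2, -54.8, -61.9], database, neighbours))        # picks (0.0, 0.0) here
```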

3 Experiment and result

3.1 Experiment

In order to verify the validity of the algorithm in Wi-Fi indoor positioning, the following experiment was designed.
An indoor hall with an area of about 250 m² was chosen as the location area to be measured, and 3 observers were fixed in it. The area was divided by a grid with a ratio of 15×10. Then, 37 nodes were chosen as reference points to collect the relevant information, which was saved to the database. After the establishment and optimization of the database, a mobile device was placed in the area and located using the observers. The result was compared with the actual position coordinates and the error distance was calculated.

3.2 Analysis results

The error distance of all positions was recorded. From the record, pseudo color maps of the error distance were drawn as follows: Figure 1 shows the error of the direct test results, and Figure 2 shows the error of the linear interpolation.

Figure 1. Pseudo color map of direct test result



Figure 2. Pseudo color map of linear interpolation

From the analysis of the data obtained from Figure 3, the error distance is mostly concentrated in 0~0.03 m, and the average error distance is 0.0363 m.

Figure 3. Pseudo color map of error distance distribution

In this experiment the environment is relatively ideal, because there is no influence caused by personnel flow and other factors, so the positioning accuracy may be lower in a real environment. Nevertheless, the results are sufficient to establish the feasibility of the algorithm.

4 Conclusion

The analysis of the basic theory and the actual experiment results verify that the reverse position fingerprint identification algorithm can be applied to indoor Wi-Fi positioning. The algorithm meets the special demands of the field of forensic science: it requires very little time for positioning while achieving high positioning accuracy.

Acknowledgment: This paper is sponsored by the Science and Technology Department of Jilin Province under projects No. 20140204031SF and No. 20130206097SF. We sincerely thank the Jilin Provincial Department of Science and Technology for the support that allowed our research projects to be implemented. In addition, we are very grateful to the Public Security Bureau of Jilin Province; the staff of the criminal investigation department gave us great help, so that our work could be carried out smoothly.

References
[1] Azuma R. Tracking Requirements for Augmented Reality [J]. Communication of the ACM, 1993,
36(7):50-51.
[2] Girod L, Estrin D. Robust Range Estimation Using Acoustic and Multimodal Sensing [C]// Proc. of
the IEEE/ RSJ Int’l Conf.on Intelligent Robots and Systems. Maui:IEEE Robotics and Automation
Society, 2001,3:1312-1320.
[3] Bulusu N, Heidemann J, Estrin D. GPS-less Low Cost Out-door Localization For Very Small
Devices [J]. IEEE Personal Communications Magazine, 2000,7(5):28-34.
[4] Girod L, Bychovskiy V, Elson J, Estrin D. Locating Tiny Sensors in Time and Space: A Case Study [C]// Werner B, ed. Proc. of the 2002 IEEE Int'l Conf. on Computer Design: VLSI in Computers and Processors. Freiburg: IEEE Computer Society, 2002. 214-219.
[5] Bahl P, Padmanabhan VN. User Location and Tracking in an In-Building Radio Network,
Microsoft Research Technical Report:MSR-TR-99-12, February 1999.
[6] Harter A, Hopper A. A Distributed Location System for the Active Office, IEEE, Network, January
1994.
[7] Jian Zhu,Hai Zhao,Jiuqiang Xu,Jing Wang.Error analysis of Triangle Localization Algorithm [N].
Journal of Northeastern University,2009,30(5):648-651.
[8] Hao Li.The fingerprint positioning technology[J].Shanxi Electronic Technology,2007,5:84-87.
[9] Zhen Fang,Zhan Zhao,Peng Guo,Yuguo Zhang.Analysis of Distance Measurement Based on
RSSI[N].Chinese Journal of Sensors And Actuators,2007,20(11):26-30.
[10] Li B., Wang Y., Lee H. K., et al. Method for Yielding a Database of Location Fingerprints in
WLAN[C], IEEE Proceedings, Oct. 2005, Volume 152, Issue 5, 7 Page(s):580-586.
[11] Yinglong Wang,Lianhai Wang.Principle and Implementation of Sniffer and Antisniffer [J].
Application Research Of Computers,2011,12:60-63.
Xiao-Lin XU
Global Mean Square Exponential Stability
of Memristor-Based Stochastic Neural Networks
with Time-Varying Delays
Abstract: In this paper, we study the global mean square exponential stability of memristor-based stochastic neural networks with time-varying delays by means of a Lyapunov function and the Itô formula. Meanwhile, one of the central ideas of this paper is that the theory of differential equations with discontinuous right-hand sides is applied. The proposed exponential stability criteria extend and improve some existing works. A numerical example is given to verify the results.

Keywords: memristor-based neural networks; stochastic systems; mean square exponential stability; nonsmooth analysis

1 Introduction

The memristor is a circuit element which was proposed by Chua in 1971 [1] and was realized as a prototype by the Hewlett-Packard laboratory [2, 3]. With the development and application of memristors, this new element has gradually become a hot topic and has been studied and analyzed from the perspectives of physics, materials science, and computational intelligence.
From a theoretical point of view, the memristor is a nonlinear resistor which can manage and store a great quantity of information. Because of its excellent memory properties, it is valuable to gain insight into state-dependent memristive neural networks obtained by replacing classical resistors with memristors. Over the years, many novel works on memristive neural networks with or without delays have been reported [4-12]. The analysis and design of memristive neural networks is still quite incipient due to the complex nonlinear behavior of such systems, such as coexisting solutions, jumps, and transient chaos. However, existing nonlinear analysis methods for memristive neural networks are incapable of dealing effectively with many problems caused by the state-dependent discontinuity.
Note further that for engineering systems, stochastic effects are everywhere. In fact, the various types of environmental noise in the implementation of artificial neural networks cannot be estimated; that is, random factors are unavoidable. Therefore, it is meaningful to consider stochastic perturbations of neural network models and

*Corresponding author: Xiao-Lin XU, College of Computer Science and Technology Hubei Normal
University Huangshi 435002, China, E-mail: 489428692@qq.com

investigate the dynamic behaviors of stochastic neural networks. In recent years, some results on stochastic neural networks have been reported, such as mean square exponential stability, pth moment exponential stochastic synchronization, synchronization and anti-synchronization, and associative memory. However, the study of memristive stochastic neural networks has largely been ignored.
Motivated by the above discussions, our main aim is to explore the global mean square exponential stability of memristor-based stochastic neural networks with time-varying delays. Some very succinct algebraic criteria for global mean square exponential stability are obtained by means of a Lyapunov function and the Itô formula. Meanwhile, by borrowing from the theory of differential equations with discontinuous right-hand sides, the obtained stability criteria extend and improve some existing works.
The remainder of this paper consists of the following sections. Section 2 describes
some preliminaries. The main results are represented in Section 3. Finally, conclusion
is stated in Section 4.

2 Model description and preliminaries

2.1 Model Description

Consider the memristor-based stochastic neural networks with time-varying delays described by the following equations:

dx_i(t) = [−x_i(t) + Σ_{j=1}^{n} a_{ij}(x_j(t)) g_j(x_j(t)) + Σ_{j=1}^{n} b_{ij}(x_j(t)) f_j(x_j(t − τ_{ij}(t))) + I_i] dt
          + σ_i(x(t), x(t − τ(t)), I(t)) dB(t),   t > 0,
x(s) = φ(s),   s ∈ [−τ, 0],     (1)

for i = 1, 2, …, n, where n is the number of neurons in the network, x_i(t) denotes the state variable of the i-th neuron, I_i is the external input, g_i(·) and f_i(·) express the activation functions at times t and t − τ_{ij}(t), respectively, and the time delay satisfies 0 ≤ τ_{ij}(t) ≤ τ. B(t) represents a one-dimensional Brownian motion defined on a complete probability space (Ω, Χ, Ρ) with a natural filtration {Χ_t}_{t≥0} (Χ_t = σ{B(s) : 0 ≤ s ≤ t}), σ_i(x(t), x(t − τ(t)), I) : R^n × R^n × R^n → R is the diffusion coefficient, and a_{ij}(x_j(t)), b_{ij}(x_j(t)) are connection memristive weights, which are defined as
aij , x j (t ) > T j , bij , x j (t ) > T j ,
aij ( x j (t )) =   bij ( x j (t )) =   (2)
aij , x j (t ) < T j , bij , x j (t ) < T j ,

   
a_{ij}(±T_j) = â_{ij} or ǎ_{ij}, b_{ij}(±T_j) = b̂_{ij} or b̌_{ij}, where the switching jumps T_j > 0, and the weights â_{ij}, ǎ_{ij}, b̂_{ij} and b̌_{ij} are all constants.
In this paper, denote a^m_{ij} = max{|â_{ij}|, |ǎ_{ij}|}, b^m_{ij} = max{|b̂_{ij}|, |b̌_{ij}|}, A^m = (a^m_{ij})_{n×n}, B^m = (b^m_{ij})_{n×n}.

Remark 1. In view of the fact that a_{ij}(x_j(t)) and b_{ij}(x_j(t)) are discontinuous in (1), we cannot employ the classical definition of solutions for differential equations here. To overcome this difficulty, a solution concept for differential equations with discontinuous right-hand sides is introduced.
Let us recall the concept of the Filippov solution.


Definition 1. For the system dx/dt = g(x), x ∈ R^n, with a discontinuous right-hand side, a set-valued map is defined as

ψ(x) = ∩_{δ>0} ∩_{μ(N)=0} co[g(B(x, δ) \ N)],

where co[E] is the closure of the convex hull of the set E, B(x, δ) = {y : ||y − x|| ≤ δ}, and μ(N) is the Lebesgue measure of the set N. x(t), t ∈ [0, T], is called a solution in the Filippov sense of the Cauchy problem for the system with initial condition x(0) = x_0 if it is an absolutely continuous function and satisfies the differential inclusion

dx/dt ∈ ψ(x)   for a.e. t ∈ [0, T].
dt
For (1), define the set-valued maps as follows:

K(a_{ij}(x_j(t))) = { â_{ij},  x_j(t) > T_j;  co{â_{ij}, ǎ_{ij}},  x_j(t) = T_j;  ǎ_{ij},  x_j(t) < T_j },
K(b_{ij}(x_j(t))) = { b̂_{ij},  x_j(t) > T_j;  co{b̂_{ij}, b̌_{ij}},  x_j(t) = T_j;  b̌_{ij},  x_j(t) < T_j },

for i, j = 1, 2, …, n.
From the theories of differential inclusions and set-valued maps, (1) can be represented in the following form:

dx_i(t) ∈ [−x_i(t) + Σ_{j=1}^{n} K(a_{ij}(x_j(t))) g_j(x_j(t)) + Σ_{j=1}^{n} K(b_{ij}(x_j(t))) f_j(x_j(t − τ_{ij}(t))) + I_i] dt
          + σ_i(x(t), x(t − τ(t)), I(t)) dB(t),   t > 0,
x(s) = φ(s),   s ∈ [−τ, 0],     (3)

for i = 1, 2, …, n, or equivalently, there exist η^a_{ij}(x_j(t)) ∈ K(a_{ij}(x_j(t))), η^b_{ij}(x_j(t)) ∈ K(b_{ij}(x_j(t))) such that

dx_i(t) = [−x_i(t) + Σ_{j=1}^{n} η^a_{ij}(x_j(t)) g_j(x_j(t)) + Σ_{j=1}^{n} η^b_{ij}(x_j(t)) f_j(x_j(t − τ_{ij}(t))) + I_i] dt
          + σ_i(x(t), x(t − τ(t)), I(t)) dB(t),   t > 0,
x(s) = φ(s),   s ∈ [−τ, 0].     (4)

Throughout this paper, we assume:

(A1) The activation functions g_i, f_i are Lipschitz continuous; that is, for any u, v ∈ R, there exist positive constants G_i, F_i such that

|g_i(u) − g_i(v)| ≤ G_i |u − v|,   |f_i(u) − f_i(v)| ≤ F_i |u − v|,     (5)

for i = 1, 2, …, n.

(A2) There exist constants λ_i > 0 such that

σ_i²(x(t), y(t), I) ≤ λ_i (x_i²(t) + y_i²(t)),     (6)

for i = 1, 2, …, n.
For each input vector I ∈ R^n, x* ∈ R^n is an equilibrium of (1) if

0 ∈ −x* + A g(x*) + B f(x*) + I.

That is, when the system state of (1) is at any equilibrium vector u_0 ∈ R^n, it holds that σ(u_0, u_0, I) = 0.
In this paper, the p-norm of a vector v = (v_1, v_2, …, v_n)^T is represented as ||v||_p = (Σ_{i=1}^{n} |v_i|^p)^{1/p}. C^r([t_0, +∞), R^n) denotes the space of r-order continuously differentiable functions from [t_0, +∞) into R^n. λ_max = max{λ_i, i = 1, 2, …, n}.

2.2 Properties

The definition of exponential stability for (1) is given.



Definition 2. Let x* ∈ R^n be an equilibrium of (1). The equilibrium point x* is said to be globally mean square exponentially stable if there exist two positive scalars κ and p such that

E ||x(t, t_0, x_0) − x*||² ≤ p ||φ(s) − x*||² e^{−κt},   ∀t ≥ 0,

and κ is called the convergence rate.
The system (1) is said to be globally mean square exponentially stable if its equilibrium is globally mean square exponentially stable.

3 Main Results

Now, a basic lemma is given.


Lemma 1 [10]. If g_j(±T_j) = 0, f_j(±T_j) = 0 (j = 1, 2, …, n), then for any u_j, v_j ∈ R,

|K(a_{ij}(u_j)) g_j(u_j) − K(a_{ij}(v_j)) g_j(v_j)| ≤ a^m_{ij} G_j |u_j − v_j|,
|K(b_{ij}(u_j)) f_j(u_j) − K(b_{ij}(v_j)) f_j(v_j)| ≤ b^m_{ij} F_j |u_j − v_j|,

for i, j = 1, 2, …, n; or, in other words, for any η^a_{ij}(u_j) ∈ K(a_{ij}(u_j)), η^a_{ij}(v_j) ∈ K(a_{ij}(v_j)), η^b_{ij}(u_j) ∈ K(b_{ij}(u_j)), η^b_{ij}(v_j) ∈ K(b_{ij}(v_j)), we have

|η^a_{ij}(u_j) g_j(u_j) − η^a_{ij}(v_j) g_j(v_j)| ≤ a^m_{ij} G_j |u_j − v_j|,
|η^b_{ij}(u_j) f_j(u_j) − η^b_{ij}(v_j) f_j(v_j)| ≤ b^m_{ij} F_j |u_j − v_j|,

for i, j = 1, 2, …, n.
Lemma 2. Assume that g_j(±T_j) = 0, f_j(±T_j) = 0 (j = 1, 2, …, n); then (1) has a unique equilibrium point x* if there exist constants θ_i > 0 (i = 1, 2, …, n) such that

max_{1≤i≤n} { (1/θ_i) Σ_{j=1}^{n} θ_j (a^m_{ji} G_i + b^m_{ji} F_i) } < 1.     (7)
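As a quick numerical illustration of how condition (7) can be checked, the sketch below computes the quantity on its left-hand side for hypothetical weight bounds, Lipschitz constants and θ_i values; none of these numbers come from the paper.

```python
# A minimal sketch that evaluates condition (7) of Lemma 2; all matrices and
# constants below are hypothetical illustration values.
import numpy as np

def gamma(A_m, B_m, G, F, theta):
    """gamma = max_i (1/theta_i) * sum_j theta_j * (a^m_ji * G_i + b^m_ji * F_i)."""
    n = len(theta)
    vals = [(1.0 / theta[i]) * sum(theta[j] * (A_m[j, i] * G[i] + B_m[j, i] * F[i])
                                   for j in range(n))
            for i in range(n)]
    return max(vals)

# Hypothetical bounds a^m_ij = max(|a_hat_ij|, |a_check_ij|), etc.
A_m = np.array([[0.2, 0.1], [0.1, 0.2]])
B_m = np.array([[0.1, 0.2], [0.2, 0.1]])
G = np.array([0.5, 0.5])        # Lipschitz constants of g_j
F = np.array([0.5, 0.5])        # Lipschitz constants of f_j
theta = np.array([1.0, 1.0])

g = gamma(A_m, B_m, G, F, theta)
print(g, "condition (7) holds" if g < 1 else "condition (7) fails")
```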
Proof. Define a mapping H(u) = (H_1(u), H_2(u), …, H_n(u))′, where

H_i(u) = θ_i Σ_{j=1}^{n} K(a_{ij}(u_j/θ_j)) g_j(u_j/θ_j) + θ_i Σ_{j=1}^{n} K(b_{ij}(u_j/θ_j)) f_j(u_j/θ_j) + θ_i I_i,

for i = 1, 2, …, n, and u = (u_1, u_2, …, u_n)′. Applying Lemma 1, for any two vectors u = (u_1, u_2, …, u_n)′ ∈ R^n and v = (v_1, v_2, …, v_n)′ ∈ R^n, we have

|H_i(u) − H_i(v)|
= |θ_i (Σ_{j=1}^{n} K(a_{ij}(u_j/θ_j)) g_j(u_j/θ_j) − Σ_{j=1}^{n} K(a_{ij}(v_j/θ_j)) g_j(v_j/θ_j))
   + θ_i (Σ_{j=1}^{n} K(b_{ij}(u_j/θ_j)) f_j(u_j/θ_j) − Σ_{j=1}^{n} K(b_{ij}(v_j/θ_j)) f_j(v_j/θ_j))|
≤ θ_i Σ_{j=1}^{n} (1/θ_j) (a^m_{ij} G_j + b^m_{ij} F_j) |u_j − v_j|,     (8)

for i = 1, 2, …, n. Therefore, it follows from (8) that

Σ_{i=1}^{n} |H_i(u) − H_i(v)| ≤ Σ_{i=1}^{n} θ_i Σ_{j=1}^{n} (1/θ_j) (a^m_{ij} G_j + b^m_{ij} F_j) |u_j − v_j|
  = Σ_{i=1}^{n} (1/θ_i) Σ_{j=1}^{n} θ_j (a^m_{ji} G_i + b^m_{ji} F_i) |u_i − v_i|
  ≤ γ Σ_{i=1}^{n} |u_i − v_i|,     (9)

where

γ = max_{1≤i≤n} { (1/θ_i) Σ_{j=1}^{n} θ_j (a^m_{ji} G_i + b^m_{ji} F_i) } < 1.
From (9), we can see that ||H(u) − H(v)|| ≤ γ ||u − v||; hence the mapping H : R^n → R^n is a contraction mapping on R^n, and there is a unique fixed point u* ∈ H(u*), i.e.,

u*_i ∈ H_i(u*) = θ_i Σ_{j=1}^{n} K(a_{ij}(u*_j/θ_j)) g_j(u*_j/θ_j) + θ_i Σ_{j=1}^{n} K(b_{ij}(u*_j/θ_j)) f_j(u*_j/θ_j) + θ_i I_i,

for i = 1, 2, …, n. Let x*_i = u*_i/θ_i (i = 1, 2, …, n); then

0 ∈ −x*_i + Σ_{j=1}^{n} K(a_{ij}(x*_j)) g_j(x*_j) + Σ_{j=1}^{n} K(b_{ij}(x*_j)) f_j(x*_j) + I_i,

for i = 1, 2, …, n, which implies that (1) has a unique equilibrium point x*. This completes the proof.


Without loss of generality, we shift the equilibrium point of (1) to the origin by
the translation z=i (t ) xi (t ) − xi∗ ( i = 1, 2, , n ) in (1). Then for i = 1, 2, , n ,
 n n

dzi (t ) ∈ [− zi (t ) + ∑ Aij ( z j (t )) + ∑ Bij ( z j (t − τ ij (t )))]dt


 =j 1 =j 1 (10)
 +σ i ( z (t ), z (t − τ (t )))dB(t ),

s ) φi ( s ) − xi∗ , s ∈ [−τ , 0],
 zi (=


where
z=
i (t ) xi (t ) − xi∗ ,
=Aij ( z j (t )) K (aij ( x j (t ))) g j ( x j (t )) − K (aij ( x∗j )) g j ( x∗j )
,
Bij ( z j (t )) K (bij ( x j (t ))) f j ( x j (t − τ ij (t ))) − K (bij ( x∗j )) f j ( x∗j )
=
,
σ i ( z (t ), z (t − τ ij (t=
))) σ i ( x(t ), x(t − τ ij (t )), I ) − σ i ( x∗ , x∗ , I )
,

or there exist η ( z j (t )) ∈ Aij ( z j (t )) , ηijb ( z j (t )) ∈ Bij ( z j (t − τ ij (t )))


a
ij such
that

dz_i(t) = [−z_i(t) + Σ_{j=1}^{n} η^a_{ij}(z_j(t)) + Σ_{j=1}^{n} η^b_{ij}(z_j(t − τ_{ij}(t)))] dt + σ_i(z(t), z(t − τ(t))) dB(t),
z_i(s) = φ_i(s) − x*_i,   s ∈ [−τ, 0].     (11)

From (A2), we have

|σ_i(z(t), z(t − τ(t)))|² ≤ λ_i (|z_i(t)|² + |z_i(t − τ(t))|²),     (12)

for i = 1, 2, …, n.
On the other hand, by Lemma 1, for any η^a_{ij}(z_j(t)) ∈ A_{ij}(z_j(t)), η^b_{ij}(z_j(t)) ∈ B_{ij}(z_j(t − τ_{ij}(t))), it follows that

|η^a_{ij}(z_j(t))| ≤ a^m_{ij} G_j |z_j(t)|,   |η^b_{ij}(z_j(t − τ_{ij}(t)))| ≤ b^m_{ij} F_j |z_j(t − τ_{ij}(t))|,     (13)

for i = 1, 2, …, n.
Next, we give the main theorem.

Theorem 1. Let g_j(±T_j) = 0, f_j(±T_j) = 0 (j = 1, 2, …, n). Then (10) is globally mean square exponentially stable if there exist a positive constant κ > 0 and a positive definite diagonal matrix P = diag(p_1, p_2, …, p_n) such that

[ (κ + λ_max − 2)P + 2G_j P A^m     F_j P B^m
  F_j P (B^m)^T                     λ_max P   ]  < 0.     (14)
Proof. Consider the Lyapunov function

V(t, z) = e^{κt} Σ_{i=1}^{n} p_i z_i².

Next we will prove that D⁺EV(t, z(t)) < 0.
By the Itô formula, we have

dV(t, z(t)) = ℒ₁V(t, z(t)) dt + ℒ₂V(t, z(t)) dB(t),     (15)

where

ℒ₁V(t, z(t)) = e^{κt} Σ_{i=1}^{n} { κ p_i z_i²(t) + 2 p_i z_i(t) [−z_i(t) + Σ_{j=1}^{n} η^a_{ij}(z_j(t)) + Σ_{j=1}^{n} η^b_{ij}(z_j(t − τ_{ij}(t)))] + p_i σ_i²(z(t), z(t − τ(t))) },
ℒ₂V(t, z(t)) = 2 e^{κt} Σ_{i=1}^{n} p_i z_i(t) σ_i(z(t), z(t − τ(t))).

Integrating both sides of equality (15) from t to t + h with h > 0, it yields

[EV(t + h, z(t + h)) − EV(t, z(t))] / h ≤ (1/h) ∫_t^{t+h} E ℒ₁V(s, z(s)) ds.

Letting h → 0⁺, then D⁺EV(t, z(t)) ≤ E ℒ₁V(t, z(t)), so

ℒ₁V(t, z(t)) ≤ e^{κt} Σ_{i=1}^{n} { (κ p_i − 2 p_i) z_i²(t) + 2 p_i |z_i(t)| Σ_{j=1}^{n} a^m_{ij} G_j |z_j(t)|
      + 2 p_i |z_i(t)| Σ_{j=1}^{n} b^m_{ij} F_j |z_j(t − τ_{ij}(t))| + p_i λ_i (z_i²(t) + z_i²(t − τ_{ij}(t))) }
  ≤ e^{κt} Σ_{i=1}^{n} { (κ p_i − 2 p_i + p_i λ_i + Σ_{j=1}^{n} (p_i a^m_{ij} G_j + p_j a^m_{ji} G_i)) z_i²(t)
      + 2 p_i |z_i(t)| Σ_{j=1}^{n} b^m_{ij} F_j |z_j(t − τ_{ij}(t))| + p_i λ_i z_i²(t − τ_{ij}(t)) }
  ≤ e^{κt} ( |z(t)|^T, |z(t − τ_{ij}(t))|^T ) [ (κ + λ_max − 2)P + 2G_j P A^m   F_j P B^m ;  F_j P (B^m)^T   λ_max P ] ( |z(t)| ; |z(t − τ_{ij}(t))| )
  < 0.     (16)

By (14), it yields ℒ₁V(t, z(t)) < 0. Hence we get that D⁺EV(t, z(t)) ≤ E ℒ₁V(t, z(t)) < 0, which implies EV(t, z(t)) ≤ V(t_0, z_0), i.e.,

E ||x(t) − x*||² ≤ p max_{−τ≤θ≤0} ||φ(θ) − x*||² e^{−κt},   ∀t ≥ 0,

where p = (max_{1≤i≤n} {p_i})².

Therefore, we can conclude that x(t) exponentially converges in mean square to x* with convergence rate κ.
If (1) has no delays, then we can change it to the following form:

dx_i(t) = [−x_i(t) + Σ_{j=1}^{n} a_{ij}(x_j(t)) g_j(x_j(t)) + I_i] dt + σ_i(x(t), I(t)) dB(t),   t > 0,
x(t_0) = x_0.     (17)

Corollary 1. Let g_j(±T_j) = 0 (j = 1, 2, …, n). Then (17) is globally mean square exponentially stable if there exist positive constants κ > 0 and p_i > 0 such that, for i = 1, 2, …, n,

(κ + λ_i − 2) p_i + Σ_{j=1}^{n} (p_i a^m_{ij} G_j + p_j a^m_{ji} G_i) < 0.     (18)

Proof. The proof of Corollary 1 is similar to Theorem 1, so we omit it here.


In the following, we discuss an illustrative example.
Example 1. Consider the following model:

dx_1(t) = [−x_1(t) + a_{11}(x_1(t)) g(x_1(t)) + a_{12}(x_2(t)) g(x_2(t)) + b_{11}(x_1(t)) f(x_1(t − τ(t))) + b_{12}(x_2(t)) f(x_2(t − τ(t))) + I_1] dt + σ_1(x(t), x(t − τ(t)), I) dB(t),
dx_2(t) = [−x_2(t) + a_{21}(x_1(t)) g(x_1(t)) + a_{22}(x_2(t)) g(x_2(t)) + b_{21}(x_1(t)) f(x_1(t − τ(t))) + b_{22}(x_2(t)) f(x_2(t − τ(t))) + I_2] dt + σ_2(x(t), x(t − τ(t)), I) dB(t),     (19)

where x = (x_1, x_2)^T, g(x) = tanh(x − 1), f(x) = tanh(x − 1), and

a_{11}(x_1) = { −1,  x_1 < 1;  0.5,  x_1 > 1 },    a_{12}(x_2) = { −0.5,  x_2 < 1;  0.1,  x_2 > 1 },
a_{21}(x_1) = { −1.5,  x_1 < 1;  1,  x_1 > 1 },    a_{22}(x_2) = { 1,  x_2 < 1;  0.5,  x_2 > 1 },
b_{11}(x_1) = { 10,  x_1 < 1;  −1,  x_1 > 1 },     b_{12}(x_2) = { −1,  x_2 < 1;  −10,  x_2 > 1 },
b_{21}(x_1) = { −1,  x_1 < 1;  −10,  x_1 > 1 },    b_{22}(x_2) = { 10,  x_2 < 1;  1,  x_2 > 1 }.
Clearly, conditions (7) and (14) hold, so from Theorem 1, system (19) is globally mean square exponentially stable.
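For readers who want to experiment numerically, the following Python sketch runs a crude Euler-Maruyama simulation of a two-neuron memristive stochastic delayed network in the spirit of Example 1; the weight matrices reflect one possible reading of (19), and the delay, noise intensity, step size and initial history are assumptions made purely for illustration.

```python
# A rough Euler-Maruyama simulation sketch of a two-neuron memristive stochastic
# delayed network; weight values are an assumed reading of Example 1, and the
# delay, noise intensity, step size and constant history are illustration choices.
import numpy as np

rng = np.random.default_rng(0)
h, tau, T_end = 0.01, 0.5, 20.0              # step size, delay, horizon (assumed)
steps, lag = int(T_end / h), int(tau / h)

def g(x):
    return np.tanh(x - 1.0)                   # activation g(x) = tanh(x - 1)
f = g                                         # f(x) = tanh(x - 1) as well

A_low  = np.array([[-1.0, -0.5], [-1.5,  1.0]])    # a_ij used when x_j < 1 (assumed)
A_high = np.array([[ 0.5,  0.1], [ 1.0,  0.5]])    # a_ij used when x_j > 1 (assumed)
B_low  = np.array([[10.0, -1.0], [-1.0, 10.0]])    # b_ij used when x_j < 1 (assumed)
B_high = np.array([[-1.0, -10.0], [-10.0, 1.0]])   # b_ij used when x_j > 1 (assumed)

def a(i, j, xj):
    return A_low[i, j] if xj < 1.0 else A_high[i, j]
def b(i, j, xj):
    return B_low[i, j] if xj < 1.0 else B_high[i, j]

I = np.zeros(2)
x = np.full((steps + 1, 2), 0.2)              # constant history phi(s) = 0.2 (assumed)
for k in range(lag, steps):
    xd = x[k - lag]                            # delayed state x(t - tau)
    drift = np.array([
        -x[k, i]
        + sum(a(i, j, x[k, j]) * g(x[k, j]) for j in range(2))
        + sum(b(i, j, x[k, j]) * f(xd[j]) for j in range(2))
        + I[i]
        for i in range(2)
    ])
    noise = 0.1 * (x[k] + xd)                  # diffusion obeying (A2) with small lambda_i (assumed)
    x[k + 1] = x[k] + h * drift + noise * np.sqrt(h) * rng.standard_normal(2)

print("final state:", x[-1])
```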

4 Concluding Remarks

In this paper, the global mean square exponential stability for memristor-based
stochastic neural networks has been addressed. By means of differential inclusion and
Lyapunov function, some sufficient conditions on global mean square exponential
stability are derived. In the future, it is very interesting to study the multistability for
memristor-based stochastic neural networks.

Acknowledgment: The work is supported by the Research Project of Hubei Provincial Department of Education of China under Grant T201430.

References
[1] L. O. Chua, “Memristor-the missing circuit element,” IEEE Transactions on Circuit Theory, vol.
18, no. 5, pp. 507-519, 1971.
[2] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, “The missing memristor found,”
Nature, vol. 453, pp. 80-83, 2008.
[3] J. M. Tour and T. He, “The fourth element,” Nature, vol. 453, pp. 42-43, 2008.
[4] A. L. Wu and Z. G. Zeng, “Dynamic behaviors of memristor-based recurrent neural networks with time-varying delays,” Neural Networks, vol. 36, no. 8, pp. 1-10, 2012.
[5] S. P. Wen and Z. G. Zeng, “Dynamics analysis of a class of memristor-based recurrent networks
with time-varying delays in the presence of strong external stimuli,” Neural Processing Letters,
vol. 35, no. 1, pp. 47-59, 2011.
[6] G. Bao and Z. G. Zeng, “Multistability of periodic delayed recurrent neural network with
memristors,” Neural Computing and Applications, vol. 23, no. 7, pp. 1963-1967, 2013.
[7] R.Rakkiyappan, G.Velmurugan, and J. D. Cao, “Finite-time stability analysis of fractional-order
complex-valued memristor-based neural networks with time delays,” Nonlinear Dynamics, vol.
78, no. 4, pp. 2823-2836, 2014.
[8] X. Wang, C. D. Li, T. W. Huang, and S. K. Duan, “Global exponential stability of a class of
memristive neural networks with time-varying delays.” Neural Computing and Applications, vol.
24, no. 7, pp. 1707-1715, 2014.
[9] Z. Y. Guo, J. Wang, and Z. Yan, “Global exponential synchronization of two memristor-based
recurrent neural networks with time delays via static or dynamic coupling,” IEEE Transactions
on Systems, Man and Cybernetics: Systems, vol. 45, no. 2, pp. 235-249, 2015.
[10] A. L. Wu, S. P. Wen, and Z. G. Zeng, “Synchronization control of a class of memristor-based
recurrent neural networks,” Information Sciences, vol. 183, no. 1, pp. 106-116, 2012.
[11] A. L. Wu, Z. G. Zeng, X. S. Zhu, and J. E. Zhang, “Exponential synchronization of memristor-
based recurrent neural networks with time delays,” Neurocomputing, vol. 74, no. 17, pp.
3043-3050, 2011.
[12] H. Q. Wu, R. X. Li, R. Yao, and X. W. Zhang, “Weak, modified and function projective synchro-
nization of chaotic memristive neural networks with time delays,” Neurocomputing, vol. 149,
pp. 667-676, 2015.
Zhen-hong XIE
Research on Classified Check of Metadata of Digital
Image based on Fuzzy String Matching
Abstract: In order to facilitate data sharing and updates, metadata must be checked. The metadata of digital images is divided into three kinds according to its attribute values: numerical values, character values and empty data. A fuzzy string matching algorithm is used to check each type of data. This method can determine whether a value is right, wrong or empty, identify the type of error, and show error messages together with the reasons for the error. Correct items are not processed further, which improves the efficiency of the check. The method has a certain reference value for practical applications.

Keywords: metadata; fuzzy matching; string; check

1 Introduction

In order to collect basic geographic information data and construct a database, metadata must undergo a comprehensive examination, including metadata inspection. If there is a digital imaging standard, metadata can be checked by template matching based on that standard [1]. As there are no strict criteria, the acquisition of digital image metadata generates many problems. Because there is no standard specification, the author has proposed this classification method and checked metadata using fuzzy string matching. In this paper, the checked metadata consist of 106 DOM-AP metadata records and 68 DOM-SPOT and DOM-TM metadata records from the National Geomatics Center.

2 The status of metadata check of digital image

Many mistakes arise when the metadata of digital images are acquired. First, the image itself can cause metadata mistakes; secondly, there may be man-made errors, such as input errors and differing format conventions in the absence of a strict standard. For these reasons the metadata have to be checked. Taking into account the requirements of building a database, we must form a uniform data standard, making data management and sharing easy, and data entry must comply with this standard [2].
The different needs of different units and the rapid development of data-processing tools have given rise to differences in meta-data content and processing approaches. Existing inspection methods for digital image meta-data are relatively

*Corresponding author: Zhen-hong XIE, School of Surveying and Prospecting Engineering, Jilin
Jianzhu University, Changchun, China, E-mail: zhenhongl@163.com

simple and manual. There has been relatively little research on metadata checking at home and abroad; the reason is simple: meta-data is itself a text file. But as the amount of data increases, the efficiency of manual inspection obviously cannot meet the requirements of large data volumes, and manual inspection is very error-prone, so automatic inspection of meta-data is necessary.

3 The contents checked of metadata of digital image

According to several major metadata standard systems and the existing data requirements, and covering only the inspections needed for the experimental data presented here, I give reference contents to be checked, intended to achieve rapid tests as well as meta-data sharing and retrieval. According to the available data, the author summarizes them as follows [3]:
1. Identification information: describing the data identification information and
providing the name of the image, data sources, data number, ground resolution,
equivalent scale denominator, sampling intervals, etc.
2. Data quality information: a description of data quality and providing information of
the image quality evaluation.
3. Spatial data descriptions: a description of the data content that is the core of image
metadata.
4. Spatial reference information: description of the relevant coordinate system.
5. Metadata Reference Information: the reference information of the preparation of
meta-data required.
6. Limited information: the use of restriction / access restrictions.
7. Release Information: online information, contacts, order instructions.

The first four items are mandatory, and their attribute values must be filled in. The user can select or add further items and allow defaults for some of them.

4 Digital Image Metadata Classification

4.1 Numerical Value Type

1. The date and time format. For example: 200010, 1999, 1991-1999, 199807, 10. Here, “year” belongs to the numerical model because it does not affect the check. Special symbols such as “-” and “/” and spaces can be ignored [4].
2. A fixed value. For example, the ellipsoid semi-major axis (6378140.0000) and the ellipsoid flattening (1/298.257) are floating-point values. The scale denominator (50000) can be regarded as integer data.

3. Non-standard written forms. Writing or omitting the unit after the value does not affect the check. For example, the value may be followed by year, m, mm, pixels, or degrees.
4. Fixed length. Zip code: 100044 (6 digits), telephone number: 01087654321 (11 digits), longitude and latitude (DDDMMSS), each of which has a fixed length.
5. Range of values. For example, the 6-degree zone number in our country is in the range of 13-23.

4.2 Character Type

1. A fixed value. For example, the subordinate unit of the product name: National Mapping Bureau, product units, etc.
2. Fixed length. For example, the map sheet number (9 digits) and the name of the adjoining image sheet (11 characters).
3. Optional values. Such as the edge conditions (received, missed, free), integrity (complete, incomplete), security level (secret, top secret, internal), elevation system name (normal height, geodetic height), image color (color, black and white, monochrome), and edge quality assessment (excellent, good, qualified, unqualified).

4.3 Empty and Non-empty Type

1. Necessary items (their attribute values are in general not empty): the ellipsoid semi-major axis, ellipsoid flattening, map sheet number, spatial data description information (image height, image width, ground resolution, longitude range of the map profile corner, latitude range of the map profile corner), aerial photography dates, data quality information (data accuracy, integrity), data reception, and identification information (product name, main data source).
2. Items necessary under certain conditions (units may choose to complete the attributes based on their different requirements). In this paper, the items that must be filled under certain conditions according to the given data are: reference information, scale denominator, height datum, geodetic datum, central meridian, zone numbers, coordinate units, projection method, spectral resolution, band numbers, elevation system name, image color, grid spacing, data acquisition methods and instrumentation.
3. Optional items (unit name, access methods, production, publishing, etc.). These may or may not be chosen [5].

5 Digital image metadata inspection methods

5.1 Non-empty Attribute Value Inspection Method

Some attribute values of a project may be empty, which makes them meaningless. If a necessary item is empty, it is judged as wrong.

5.2 Frequency Method

This refers to how frequently a value appears in a file. Most field values appear only once, and just a few values are repeated. In this work, the duplicated values are inspected (the subordinate unit and the publishing unit are both the National Mapping Bureau); the others appear only once [6].

5.3 Fixed-length Method

The length of the value is fixed. If the length does not match the specified length, it is an error. For example, the map sheet number (9 digits), longitude and latitude (DDDMMSS).

5.4 Optional Attribute Value Method

This refers to an attribute whose permissible values are listed. It is wrong if the checked value is not among the listed values. For example, integrity (complete, incomplete), elevation system name (normal height, geodetic height), and edge quality assessment (excellent, good, medium, qualified, unqualified).

5.5 Property Value Range Method

The property value lies in a certain range. It is wrong if the value is not in that range. For example, the map sheet longitude and latitude range (1120000-1121500, 0375000-0380000) and the Gauss-Kruger zone number (6-degree zones, 13-23).

5.6 Accuracy Test Method

If the precision of a value is required to be two decimal places, while the actual precision is one decimal place or an integer value, the value is considered inaccurate. Precision requirements differ according to use.
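The inspection methods above can be combined into a small rule-driven checker. The Python sketch below illustrates the non-empty, fixed-length, optional-value, range and precision checks; the field names, rules and sample values are hypothetical and do not reproduce the actual metadata schema.

```python
# A minimal sketch of the classified checks described above; field names, rules
# and sample values are hypothetical illustrations, not the actual metadata schema.

def check_field(name, value, rule):
    if value is None or value == "":
        return (name, "error: required item is empty") if rule.get("required") else (name, "ok (empty optional item)")
    if "length" in rule and len(str(value)) != rule["length"]:
        return (name, "error: wrong length")
    if "options" in rule and value not in rule["options"]:
        return (name, "error: value not in the optional list")
    if "range" in rule and not (rule["range"][0] <= float(value) <= rule["range"][1]):
        return (name, "error: value out of range")
    if "decimals" in rule:
        digits = str(value).split(".")[1] if "." in str(value) else ""
        if len(digits) < rule["decimals"]:
            return (name, "error: precision is not enough")
    return (name, "ok")

rules = {
    "map_sheet_number":     {"required": True, "length": 9},
    "edge_quality":         {"required": True, "options": ["excellent", "good", "qualified", "unqualified"]},
    "zone_number":          {"required": True, "range": (13, 23)},
    "ellipsoid_flattening": {"required": True, "decimals": 3},
}
record = {"map_sheet_number": "J50E00102", "edge_quality": "good",
          "zone_number": "20", "ellipsoid_flattening": "0.0033528"}

for field, rule in rules.items():
    print(check_field(field, record.get(field), rule))
```

In practice only the error lines would be reported, in keeping with the principle that correct information is not processed further.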

6 Fuzzy string matching algorithm described

If the database contains the pattern string, the matching result must contain the pattern string; that is, fuzzy matching includes the exact-match result. When the first part of a target string matches the pattern string under the similarity constraint, the entire target string is returned as a result. Similarity values lie in [0, 1]; the closer the similarity value is to one, the more similar the matching results are. When the value is one, fuzzy matching becomes exact matching. In the algorithm, we convert similarity into a departure. When the departure exceeds the preset value during the matching process, the match is given up and a new search starts [7].
Suppose the dictionary entries have been stored in a large string array Record[K][s], where K is the number of entries and s is the maximum string length of each entry. Result[][s] saves the match results.
Input: entry array Record[][s], pattern string T, similarity sim.
Handling: fuzzy matching is performed over Record[K][s], obtaining all entries that match the pattern string with similarity sim.
Output: Result[][s]. When the set is empty, there is no match.
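One possible Python rendering of this fuzzy matching procedure is sketched below: the similarity sim is converted into an allowed edit-distance budget (the "departure"), and a candidate entry is abandoned as soon as the budget is exceeded. The entry list, pattern string and similarity value are hypothetical.

```python
# A minimal fuzzy string matching sketch in the spirit of the algorithm above:
# similarity is converted into an allowed "departure" (edit-distance budget), and
# a match is abandoned once the budget is exceeded. All sample data is hypothetical.

def edit_distance(a, b, limit):
    """Levenshtein distance between a and b, returning limit+1 once it exceeds limit."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        if min(cur) > limit:             # give up: departure already beyond the preset value
            return limit + 1
        prev = cur
    return prev[-1]

def fuzzy_match(record, pattern, sim):
    results = []
    for entry in record:
        if pattern in entry:             # exact containment always counts as a match
            results.append(entry)
            continue
        limit = int(round((1.0 - sim) * max(len(entry), len(pattern))))
        if edit_distance(entry, pattern, limit) <= limit:
            results.append(entry)
    return results

record = ["1980 Xi'an coordinate system", "1954 Beijing coordinate system", "WGS-84"]
print(fuzzy_match(record, "1980 Xian coordinate system", sim=0.9))
```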

7 Check and results of Check

As the examination of empty and non-empty types exists within both the numeric and character types, empty and non-empty checks are also carried out while checking the numerical and character types.

7.1 Numeric attribute value check

1. Time and date format check. For example, in the examination, the year is written with four digits and the month with two digits. Sub-string matching can be used, and an exact match is not required.
2. Fixed numeric value check. A floating-point value is examined for its accuracy to see whether it satisfies the accuracy requirement; an integer value is checked to see whether it is an integer. If it is not an integer, the program returns an error; if it is an integer, the program then checks whether its value is correct. The main string and the substring must be equal; an error is returned as soon as one character fails to match. For example, the central meridian check: 111.
3. Check of data written in a non-standard form. The value should be followed by the corresponding unit, but a missing unit does not affect the correctness of the value.
4. Range check of numerical values. Check whether the value lies in the specified range.
7.2 Character attribute value check

1. Fixed value check. For example, the geodetic datum (1980 Xi'an coordinate system) may be given simply as 1980; it is naturally associated with the Xi'an coordinate system and judged correct.

2. Optional value check. For example, it is wrong if the edge quality evaluation is not among the listed values.

8 Conclusions

In this paper, digital image metadata is classified and checked in the absence of a standard. The classified check method comprises a non-empty attribute value inspection method, a frequency method, a fixed-length method, an optional attribute value method, a property value range method and an accuracy test method, and uses fuzzy string matching to improve the accuracy of the inspection. Compared with the template matching check method (in which metadata is compared with a standard template and the outcome can only be "right", "error" or "empty"), this method can complete inspections not only for "right", "error" and "empty" but also for the type of error, such as "precision is not enough". This method only prompts an error message and the causes of the errors; the correct information is not processed, so the efficiency of the examination is greatly improved.

References
[1] Z.H.Xie,X.Liu,X.Q.Yu,and T.Zhang, “Method for metadata check of digital orthophoto maps,”J.in
Geospatial Information, vol.3,pp.47-49,Jun 2008.
[2] G.X.He,Z.Y.He,and X.Yu “Research on Spatial Data Quality Check System,”J. in Geospatial
Information, vol 2, pp.20, April 2004.
[3] S.D.Wang, J.H.Ren,“A Study of service system based on Metadata for web publicationof city
remote sensing image,”J.in Journal of Handan Vocational and Technical College,vol.17,pp.46-
47,March 2004.
[4] GeographicInformation-Metadata,S.GB/T19333.15-200X/ISO 19115:2003,pp.32.
[5] D.F.Gao,The “Standard research of Geographic Information Metadata,”C.in Beijing Science
Press, 1999, pp.29-35.
[6] Y.W.Zeng, “Research on Spatial Data Quality Control and Evaluation Technique System,” PhD thesis, Wuhan University, 2004, pp.84.
[7] J.C.Pan,Y.H.Sun,and Y.M.Xu, “The Implementation of an Easy Fuzzy Matching Algorithm,”J.in
Signal Processing and Pattern Recognition,vol. 3,pp.131-132,2006.
Shao-nan DUAN, Yan-jie NIU*, Yao SHI, Xiao-dong MU
Quantitative Analysis of C2 Organization
Collaborative Performance based on System
Dynamics
Abstract: To measure the collaborative performance between C2 organizations, a system dynamics method was used to build a regulation and control index model of a C2 organization in this article. Secondly, a computational method for C2 organization collaborative performance was proposed by introducing collaborative efficacy parameters. In addition, the collaborative performance under different collaborative influence parameters and different collaborative net structures was analyzed through simulation.

Keywords: C2 Organization; Collaborative Performance; System Dynamics; Quantitative Analysis.

A C2 (Command and Control) organization is a military command and control organization for completing mission operations. Effectively measuring the collaboration efficacy of C2 organizations is a significant and core problem. In recent years, the military forces of many countries have focused intensively on this problem and have regarded C2 organization collaboration as all C2 organizations cooperating to complete missions and tasks. The collaboration among combat platforms and soldiers in battle can be abstracted as C2 organization collaboration.
The performance of C2 organization collaboration should be judged with quantitative indices. In order to assess total C2 organization performance, most studies [1, 2] have so far used an evaluation index system to obtain the system performance by analysis or testing, for example, the C2 organization performance index system put forward in "The Agility Advantage" by David Alberts [3]. Collaborative quantification has been conducted in some studies [4-7], which have shown that entropy may be used to analyze the effectiveness measurement of cooperative engagement in network centric operations.
At present, there are no effective quantitative analysis methods for C2 organization collaboration, so it is necessary to solve the problem of quantitatively analyzing the performance of C2 organizations. Quantitatively analyzing the influence of each major factor on effectiveness may help researchers specifically optimize C2 organization collaborative performance in design and organization.

*Corresponding author: Yan-jie NIU, PLA University of Science and Technology Nanjing, 210007, China,
E-mail: niuyanjie@126.com
Shao-nan DUAN, Yao SHI, Xiao-dong MU, PLA University of Science and Technology, Nanjing,
210007, China

Some methods from the literature, including multi-individual cooperative control [5] and NCW collaboration and system dynamics [6, 7], are available. Thus, in this article, the collaborative ability of different C2 organization net structures is quantitatively calculated using a C2 organization net collaboration performance index, which can intuitively show the comparison of regulatory capacity and collaborative performance among different net structures.

1 C2 organization collaboration

Cooperative engagement is an inexorable trend in the development of operations, and collaboration means gathering two or more individuals located in different areas to work together. In other words, the operational headquarters builds an operational "alliance" consisting of combat units under the rules of certain structures, according to the requirements of the battlefield and the mission, to complete mission operations. This alliance of combat units and operational headquarters is the C2 organization, and it is able to perceive, decide and act.
The response time differs among C2 organizations for different operation missions. The condition for building a C2 organization is that every C2 organization can get into collaboration within the response time. For an operation mission, if the response time is confirmed, the collaboration of the whole C2 organization can be calculated according to parameters such as the network structure. On the other hand, the range of the effective parameters can be confirmed by calculating the range of the response time of each C2 organization and the requirements placed on the network structure for reaching collaboration.
In terms of time, collaboration means that the responses of all C2 organizations to a certain mission tend to be uniform, so the closer the reaction processing times of all C2 organizations completing the operation mission are, the better the collaborative performance of the C2 organization net is. According to real mission requirements, a C2 organization net can be regarded as collaborative if the reaction time difference of every node is no larger than the threshold of reaction processing time set for that kind of C2 organization net:

∀i, j:  max |T_i − T_j| ≤ M     (1)

T_i denotes the time for the i-th C2 organization to complete the mission; M denotes the threshold, whose value is set according to the urgency of the mission.
Therefore, by calculating the collaborative performance of a known C2 organization net, the merits of this organization's collaborative performance can be figured out, and whether this C2 organization net has reached collaboration can be judged. At the same time, analyzing the factors that affect and decide C2 organization collaborative performance, and improving the design, can achieve the goal of enhancing the collaborative performance of this C2 organization.

2 Collaborative performance quantitative model

The collaboration defined in this paper implies that the C2 nodes in the network react uniformly to the same event, or that the response time difference is within the allowable range. If the response time difference between a node and a given reference node exceeds the threshold, the node will not be considered cooperative. The threshold is set according to the specific task requirements: the higher the cooperation performance required of the C2 organization network, the smaller the threshold is set.
A regulation and control index is defined to represent the collaborative performance of the C2 organization network. This index is an evaluation index of a single C2 organization node, which shows the regulation and control capability of a single C2 organization in the network, i.e. its ability to stay consistent with the other nodes in the network.
In order to compare the response speed of nodes to affairs, it is stipulated that all nodes have the same time starting point. When an affair occurs, all nodes begin to process the affair. Set the affair period equal to 2π. Set a reference node to divide the period into several sections. θ_i(T) is the angle which node i passes in the T-th section; the initial angle is 0 and the angle at the end of the affair is 2π.
β_i(T) denotes the regulation and control index of node i in the T-th section. According to the system dynamics model, the regulation and control index model of a C2 organization is:

β_i(T) = 2πω_i(T−1) + σ Σ_{j=1}^{N} A_{ij} sin(θ_j(T−1) − θ_i(T−1))     (2)

ω_i(T) -- the frequency of node i in section T;
ω_i(0) -- the inherent frequency of node i, which is decided by the character of each node in the network;
σ = k/N -- the coupling coefficient of the nodes, showing the degree to which nodes cooperate and mutually affect one another;
k -- the number of connecting lines; there is a connecting line whenever two nodes connect with each other;
N -- the number of nodes in the network;
A_{ij} -- the adjacency matrix of the network;
θ_i(T) -- the angle that node i passes in section T.

The two-node network is the simplest structure; we now use two nodes A and B to explain the regulation and control index model. Its coupling coefficient equals 1/2, and its coupling matrix is A_{ij} = [0 1; 1 0]. Set node A as the reference node and divide the whole affair period into 4 sections. In section (0, π/2), the original regulation and control index of node A is

β_{A(1)} = 2πω_{A(0)}     (3)
The original regulation and control index of node B is

β_{B(1)} = 2πω_{B(0)}     (4)
The time at which node A reaches π/2 is

t_{A(1)} = θ_{A(1)} / (2πω_{A(1)})     (5)

The angle of node B is then

θ_{B(1)} = 2πω_{B(1)} t_{A(1)}     (6)

According to formula (1), the regulation and control indices of node A and node B can be known, and we can also know, when node A arrives at 2π, the time node A spent and the angle of node B.
With node A set as the reference node, node A and node B have a time synchronization. In one case, node A completes an affair period before node B. In this situation, node A has already finished a period while node B has not; node B is in the n-th section, and over the remaining angle node B still deals with the affair using the regulation and control index of that section. Since node A was chosen as the reference node, the time difference is:

Δt = (2π − θ_{B(1)} − ⋯ − θ_{B(n)}) / β_{B(n)}     (7)

The other case is that node B finishes a period while the reference node A has not yet finished dealing with the affair. If node B is in the n-th section when it completely finishes dealing with the affair, the time of node B is:

t_B = t_{B(1)} + t_{B(2)} + ⋯ + t_{B(n−1)} + (2π − θ_{B(1)} − θ_{B(2)} − ⋯ − θ_{B(n−1)}) / β_{B(n)}     (8)

The time difference is:

Δt = Σ_{T=1}^{n} t_{A(T)} − t_B     (9)

For the node time differences, the threshold of the affair difference can be set according to the C2 organization mission requirement. If the time difference of a node is larger than the threshold, the node should be considered unable to be collaborative.
For a multi-node C2 organization network, t̄ is defined as the average time of all the nodes dealing with the affair in the net; it is also an index showing the collaborative performance. By comparing the average times among different C2 organization nets, we can find out which is faster in dealing with affairs and thereby judge the C2 organization networks. Δt̄ is defined as the average value of the time differences between every node and the reference node.
In order to make the calculation easier, we use one node's regulation and control index to approximate the index of the whole section. Theoretically, the smaller the sections and the larger the number of sections, the closer the effect of the network on the node is captured, and the closer the theoretical results will be to the actual values.
In the simple network of two nodes, this article chooses one node as the reference node. In a C2 organization network with multiple nodes, this article chooses as the reference node one node from among the nodes handling the same event. After working out all the time differences between the nodes and the reference node, we can calculate the average time difference of all the nodes. On the basis of the comparison between the time difference and the threshold, we can judge whether a node reaches collaboration: if the time difference is smaller than the threshold, we consider the node collaborative; otherwise it is not. In practice, measures are taken to make those nodes which have not yet reached collaboration collaborate with the others, or such nodes are deleted.
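The node-by-node judgment just described can be written down compactly. The Python sketch below takes the completion times of the nodes (the numbers are placeholders, not the paper's results), computes the time difference of every node with respect to the reference node, and reports the average difference and which nodes are considered collaborative under a given threshold.

```python
# A minimal sketch of the threshold-based collaboration judgment; the completion
# times and the threshold below are placeholder values, not the paper's results.

def judge_collaboration(times, ref_index=0, threshold=5.0):
    ref = times[ref_index]
    deltas = [abs(t - ref) for i, t in enumerate(times) if i != ref_index]
    collaborative = [i + 1 for i, t in enumerate(times)
                     if abs(t - ref) <= threshold]       # node numbers start at 1
    return sum(deltas) / len(times), collaborative        # average difference over all nodes

times = [30.0, 24.0, 30.5, 31.0, 30.2, 31.5, 32.0]        # hypothetical completion times
avg_dt, ok_nodes = judge_collaboration(times)
print("average time difference:", round(avg_dt, 3))       # node 2 exceeds the threshold here
print("collaborative nodes:", ok_nodes)
```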
In modern war, every weapon platform and unit is a C2 organization node in the battlefield; they are connected with each other by many kinds of data links and carry out information interaction and transmission. They keep the same judgment of and response to affairs in the battlefield; those affairs could be instructions received jointly by the nodes, or the real-time battlefield situation received by the nodes. For all the nodes, the best situation is that the times at which they dispose of the affair are the same, and this is collaboration. A key factor of network collaborative performance is the topology of the net, that is, the connection pattern of all the nodes. How this pattern affects the total C2 organization collaborative performance is the main subject of this article.

3 Instance analysis

Because of the differences between weapon platforms and operating units in the real battlefield, and the different battlefield environments in which the nodes are located, the characters of the nodes differ, including their inherent frequencies. To simulate the stochastic character of the nodes in the battlefield, the inherent frequency of every node here is selected randomly.

3.1 Tree Network C2 Organization Collaborative Performance

The tree net is the most common of the many net topologies, especially in command and control networks. This kind of net usually possesses one center node, and the net structure is layered; the center node is set for controlling and adjusting. The topology of the C2 organization tree net is shown in Figure 1, and its adjacency matrix is shown in Figure 2.
There are seven nodes in this network, the coupling coefficient is 6/7, and the inherent frequencies of all the nodes are shown in Table 1.
Set node 1 as the reference node to calculate the collaborative performance of this net. The inherent frequency of node 1 is 1/100, with the unit of time being seconds, which means that node 1 spends 100 s to pass through an affair period. In the first π/2 section divided before, the regulation and control index of node 1 is β_{1(1)} = (1/100)·2π.

Figure 1. Tree network

Figure 2. Tree network adjacency matrix

Table 1. Inherent frequency of all the nodes

node:                1      2      3      4      5      6      7
inherent frequency:  1/100  1/95   1/98   1/96   1/104  1/94   1/100

Considering the scale of the C2 organization network, the threshold of this organization's collaborative performance is set equal to 5. In reality, the valuation of the threshold requires consideration of the network scale, the mission requirement, and the reaction time required of all the nodes. By calculation, we can work out the regulation and control index of every node in every section, and the angle that every node passes in the time taken by the reference node 1. The time differences are then calculated; the results are shown in Table 2.

Table 2. Calculation result of tree network C2 organization collaborative performance

node   2πω_i      t         Δt
1      2π/100     37.0048   --
2      2π/98      26.5125   10.4923
3      2π/95      37.2260   0.2180
4      2π/96      37.9515   0.9471
5      2π/94      37.4834   0.4786
6      2π/104     38.9848   1.9800
7      2π/100     39.2738   2.2690

The time of node 1 is:

t_A = t_{A(1)} + t_{A(2)} + t_{A(3)} + t_{A(4)} = 37.0048

The average time of the tree network disposing of the affair is

t̄ = (t_1 + t_2 + t_3 + t_4 + t_5 + t_6 + t_7) / 7 = 36.3481

The time differences between the times of the other 6 nodes disposing of the same affair and the time of the reference node are 10.4923, 0.2180, 0.9471, 0.4786, 1.9800 and 2.2690. The average time difference is:

Δt̄ = (Δt_2 + Δt_3 + Δt_4 + Δt_5 + Δt_6 + Δt_7) / 7 = 2.8010

We can see that the time difference of node 2 is larger than the threshold, so node 2 is not able to be collaborative in this C2 organization net. With these time differences we can figure out an average time difference, which intuitively shows the collaborative performance of this C2 organization network: the lower the average time difference, the better the collaboration among the nodes in the C2 organization net, which means that uniformity in disposing of the same affair is kept better.

3.2 Reticular Network C2 Organization Collaborative Performance

Different C2 organization network collaborative performance is mainly affected by the way the nodes are connected, in other words, the topology. To analyze how different topologies affect C2 organization collaborative performance, we calculated the collaborative performance of another net topology. The C2 organization reticular structure is shown in Figure 3, and its adjacency matrix is shown in Figure 4.

Figure 3. Reticular network

0 1 1 1 0 0 1
1 0 0 1 1 0 0 

1 0 0 0 0 1 1
 
A = 1 1 0 0 1 0 0
0 1 0 1 0 1 0
 
0 0 1 0 1 0 1
1 0 1 0 0 1 0 

Figure 4. Reticular network adjacency matrix

There are seven nodes in the net, and the inherent frequencies of all the nodes are the same as those of the tree net, as shown in Table 1. As the number of connections between the nodes increases, the coupling coefficient of the network becomes larger; the coupling coefficient of the network in Figure 3 is 11/7.
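As a rough cross-check of this kind of calculation, the Python sketch below integrates a Kuramoto-type phase model with the adjacency matrix of Figure 4 and the inherent frequencies of Table 1, and records when each node completes one 2π period. The continuous-time reading of the section-by-section procedure, the time step and the coupling form are simplifying assumptions, so the numbers it produces are only indicative.

```python
# A rough, simplified numerical sketch: node phases advance at rate 2*pi*omega_i plus a
# Kuramoto-type coupling term in the spirit of (2), and the moment each node completes
# one affair period (2*pi) is recorded. This is an illustrative simplification, not the
# paper's exact section-by-section procedure.
import numpy as np

A = np.array([[0, 1, 1, 1, 0, 0, 1],
              [1, 0, 0, 1, 1, 0, 0],
              [1, 0, 0, 0, 0, 1, 1],
              [1, 1, 0, 0, 1, 0, 0],
              [0, 1, 0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1, 0, 1],
              [1, 0, 1, 0, 0, 1, 0]], dtype=float)        # reticular network of Figure 4
omega = 1.0 / np.array([100, 95, 98, 96, 104, 94, 100])   # inherent frequencies from Table 1
sigma = A.sum() / 2 / len(omega)                           # coupling coefficient k / N = 11/7

dt, t = 0.01, 0.0
theta = np.zeros(7)
finish = np.full(7, np.nan)                                # time each node reaches 2*pi
while np.isnan(finish).any():
    coupling = sigma * (A * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
    theta = theta + dt * (2 * np.pi * omega + coupling)
    t += dt
    finish = np.where(np.isnan(finish) & (theta >= 2 * np.pi), t, finish)

delta_t = np.abs(finish - finish[0])                       # differences w.r.t. reference node 1
print("completion times:", np.round(finish, 2))
print("average time difference:", round(delta_t[1:].mean(), 3))
print("collaborative nodes (threshold 5):", list(np.where(delta_t <= 5.0)[0] + 1))
```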
Set node 1 as the reference node again to calculate the collaborative performance of this net. The inherent frequency of node 1 is 1/100, with the unit of time being seconds, which means that node 1 spends 100 s to pass through a period. The calculation results are shown in Table 3.

After calculating, we can work out the time of node 1, t_A = 30.6083. The average time of this reticular net disposing of the affair is t̄ = 30.3353. The time differences between the times of the other 6 nodes disposing of the same affair and the time of the reference node are 0.2607, 0.1571, 0.1099, 0.7654, 0.6458 and 0.0279. The average time difference is Δt̄ = 0.3278.
After verification we find that the time differences of the nodes disposing of the same affair in the reticular net are all lower than the set threshold 5, which means that, under the mission requirement, all the nodes in the net are able to become collaborative. Moreover, the time each node takes to dispose of the affair is very close to the others, which shows that the nodes keep good uniformity in reacting to the affair; in other words, the C2 organization net exhibits good collaboration.
Table 3. Calculation result of reticular network C2 organization collaborative performance

node   2πω_i      t         Δt
1      2π/100     30.6083   --
2      2π/98      30.3476   0.2607
3      2π/95      30.4512   0.1571
4      2π/96      30.4984   0.1099
5      2π/94      29.8429   0.7654
6      2π/104     29.9625   0.6458
7      2π/100     30.6362   0.0279

We can find the regulation and control index of every node in every section, and the time difference between every node and node 1. Using these time differences we can work out an average time difference, which intuitively shows the collaborative performance of the C2 organization net: the lower the average time difference, the better the C2 organization collaborative performance, and the better the uniformity in disposing of the same affair is kept. After calculating, we find that the average time difference of the net is 0.3278, and it can be intuitively seen that in this reticular net the average time difference is lower than that of the tree net. This shows that the collaborative performance of the reticular net is better than that of the tree net.
In other words, knowing that the average time difference of this organization net is 0.3278, from the standpoint of time we can intuitively see that this reticular net has a lower time difference compared with the tree net. So we can come to the conclusion that, keeping other conditions constant, the collaborative performance of the reticular net is better than that of the tree net.

4 Conclusion

For different operation missions, the reaction time of each C2 organization node is different. The condition for a C2 organization to enter collaboration is that every node can react within the time required by the operation mission; when all the nodes can be collaborative, all of them can complete the mission on time. To satisfy the collaboration requirement, the reaction time of each node and the topology of the network must be determined. This article has analyzed, through quantitative analysis, the effect of network topology on collaborative performance. The results show that the reticular topology is better than the tree topology. This can help further analysis of how the reaction times of the nodes, the way nodes collaborate with each other, and other factors affect collaborative performance.

Acknowledgment: This work was supported by the National Natural Science Foundation of China (61174198, 61273210) and the National Defense Pre-research Foundation of China (9140A15070414JB25224).

References
[1] XIU Bao-xin, MU Liang, ZHANG Wei-ming, LIU Zhong. The Model and Method of C2 Organization
Structure Adaptive Optimization [J]. Military Operations Research and Systems Engineering.
2012, 26(1): 35-41.
[2] ZHANG Jie-yong, YAO Pei-yang. Model and solving method for collocating problem of decision-
makers in C2 organization [J]. System Engineering and Electronics,2012,34(4): 52-57.
[3] David S. Alberts. The Agility Advantage[M]. Beijing: Weapons Industry Press, 2012.
[4] MIN Hai-Bo, LIU Yuan, WANG Shi-Cheng, SUN Fu-Chun. An Overview on Coordination Control
Problem of Multi-agent System [J]. ACTA AUTOMATICA SINICA, 2012, 38(10):1557-1570.
[5] XU Yang, LI Xiang, CHANG Hong, WANG Yue-Xing. Effects of Complex Network
Characters on the Coordination Control of Large-Scale Multi-Agent System [J]. Journal of
Software,2012,23(11):2971-2986.

[6] ZHAO Liang, LUO Xue-shan. Research on Collaboration and Its Quantifiable Model in the
Network Centric Warfare [J]. Information Command Control System & Simulation Technology,
2005,27(6):35-39.
[7] Alexander Kalloniatis. On the Boyd-Kuramoto Model-Emergence in a Mathematical Model for
Adversary C2 Systems [C], 17th ICCRTS.2012.
Kuai-kuai ZHOU*, Zheng CHEN
Comprehensive Evaluation and Countermeasures
of Rural Information Service System Construction
in Hengyang
Abstract: Rapid economic growth has had an important influence on rural construction in Hengyang. Optimizing the rural information service system has therefore become an important issue in the construction of new rural areas in Hengyang. The rural information service system not only serves rural construction; it is also important for promoting the smooth development of rural infrastructure in Hengyang and for further narrowing the gap between urban and rural areas. This paper analyzes the problems involved in the construction and comprehensive evaluation of the rural information service system in Hengyang, and puts forward effective measures to improve its construction in Hengyang city.

Keywords: rural information service system; comprehensive evaluation; countermeasures

Information has been given new meaning by the development of communication and computing technology. At present, our country is gradually changing from an agricultural society into an industrial society; information has become an important resource of modern society, and this gives the construction of a rural information service system great significance.

1 The Status Quo of the Rural Information Service System in Hengyang City

Hengyang is a central city of Hunan province and an important agricultural city in south-central China. The construction of the rural information service system in Hengyang city has already reached a certain scale.

*Corresponding author: Kuai-kuai ZHOU, College of computer and information science, Central South
University of Forestry and Technology, Changsha, Hunan, China, E-mail: 252619855@qq.com
Zheng CHEN, College of computer and information science, Hunan Institute of Technology, Hengy-
ang, Hunan, China

1.1 Diversification of Rural Information Release Form

In the current rural information service system, establishing an information dissemination system is important for standardizing information release. Information must be released in accordance with regulations, so that information release can develop in a more standardized direction [1]. In the process of building its rural information system, Hengyang city, following China's agricultural information dissemination system, gradually formed an information release pattern based on a website, a television station, a newspaper, a publication and a school. The forms of rural information dissemination in Hengyang have gradually diversified: beyond the traditional channels such as publications, the Internet, together with radio, television, newspapers and other related media, is now used more frequently to promote the construction of the rural information service system in Hengyang.

1.2 Formation of a Relatively Stable Information Acquisition System

After years of construction, the Hengyang City Department of Agriculture has created information collection channels in farming, agriculture, agricultural machinery, animal husbandry, science and the agricultural products market, and corresponding information collection points have been established in these fields. At the same time, the website of the Hengyang agricultural department publishes a network e-newsletter, agricultural product supply and demand information, and related analysis and forecasts. The published news and e-newsletters are based on the local, current situation. Through such an information acquisition system, people can filter information when acquiring agricultural information. Furthermore, as the information processing capability of the country and of Hunan Province keeps improving, the Hengyang agricultural department produces monthly and quarterly analyses of the rural economic situation and also holds corresponding seminars to forecast and analyze the agricultural development and economic situation of the city and the province. The integrated information sector also predicts and analyzes hot issues.

2 Constraints on the Construction of the Rural Information Service System

2.1 Rural Government Information Public Management and Service are not Sufficient

At present, the government gives strong priority to the development of cities and towns, while the related industries in rural areas are created by the townships and the farmers themselves. From the perspective of the rural information service system, the government has the advantage in policy and in the exchange of market information and technology information. The supply of rural public goods needs government help to work well. From this we can see that the government plays an important role in the development of the rural economy. Although the information network of China's Ministry of Agriculture already has relatively rich content, it started late and is still at an early stage, so it remains backward. In the activities of government departments, the construction of the rural information service system has not been given much attention.

2.2 Public Financial Resources Support is not Sufficient

The construction of agricultural information facilities in Hunan province has reached a certain scale and has a certain foundation. However, the shortage of financial resources is a prominent problem. According to incomplete statistics, spending on agricultural information network construction in Hengyang cannot meet the current needs of rural economic development for agricultural information, and the information infrastructure at the county, township and village levels is relatively weak [2]. At the same time, comprehensive analysis shows that although agricultural resource information is abundant, agricultural websites with a large volume of network traffic are few, accounting for less than 2% of all agricultural websites, and they are concentrated at the city and township levels of the information service platform. The market risk of agricultural products is mainly caused by incomplete and asymmetric information, and these causes are closely connected with the support available from financial resources.

2.3 The Agricultural Management System is not Conducive to the Integration of Information Resources

At present, in the course of China's economic construction, problems in the agricultural management system have become an important factor restricting the coordinated development of the economy and society. Agricultural management finds it difficult to adapt to reform. In Hunan Province, some institutions in the agricultural management system have been merged and their functions brought into agreement with those of the Ministry of Agriculture, but some of the original institutions are still retained. After this adjustment, the agricultural management system is still not conducive to the integration of information resources.

3 Comprehensive Evaluation of the Rural Information Service System

3.1 Principles of the Comprehensive Evaluation of the Rural Information Service System

In the construction of rural services, comprehensive evaluation should follow certain principles. First, the principles of integrity and system: a service system must contain many kinds of components and subsystems, and to ensure the normal operation of the information system, the data and the subsystems should be integrated organically, so a comprehensive evaluation of the rural information service system should be based on the system as a whole. Second, the principle of staged evaluation: the construction of rural information services is a process of gradual development, so the evaluation should pay attention to the current stage of development, and different criteria should be used at different stages. Third, subjective evaluation must be combined with objective evaluation: the construction of the rural information service system is essentially a human-machine-environment problem, so the evaluation should not only carry out scientific measurement but also take full account of the components of the information system. Finally, economic benefit and social benefit should be unified: unifying economic and social benefit in the construction of the rural information service system avoids one-sidedness. Using comprehensive evaluation in the construction of the rural information service system not only reduces the construction cost and shortens the construction period, but also allows the evaluation results of a region to be used to measure its level of information development.

3.2 Purposes of the Comprehensive Evaluation of the Rural Information Service System

To carry out a comprehensive evaluation of the rural information service system, the purposes of the evaluation should be clear. The first is to evaluate the benefit. The initial goal of building the rural information service is to effectively promote the comprehensive development of the rural economy and to push agriculture towards efficient agriculture; an information system can provide a great deal of useful information to develop, promote and accelerate agriculture. The benefit of the rural information service involves two aspects, social benefit and economic benefit, and the information system must unify these two. The second is to evaluate the construction of the information infrastructure [3]. Information infrastructure construction includes the design of the infrastructure, the development of information resources, the application of information technology and the various types of equipment and devices. The information infrastructure is a prerequisite for the development and utilization of information resources and an important guarantee of the information service, so it should be evaluated in the comprehensive evaluation; to a certain extent, it reflects the service capability of the rural information service system. The third is to evaluate the information talent team. The information talent team develops and utilizes information resources and advanced technology in the construction of a rural information system; such a team is the most basic requirement for successful construction of the rural information service system, and a team with high quality and a reasonable structure is an important prerequisite for building it.

4 Measures to Speed up the Construction of the Rural Information Service System

4.1 Government Promotion, Social Participation

Given the uneven development of urban and rural areas, the government plays an important role in narrowing the gap between the two. The overall level of informatization in our country is not very high, especially in remote areas, which requires the participation of the government. In view of this situation, government departments should formulate corresponding plans, increase investment, create a strong social environment and mobilize all social forces to participate actively, in order to narrow the gap between urban and rural areas [4]. In this regard, Hengyang city can act in several respects: improve the rural information network infrastructure, effectively serve the basic needs of farmers, aggregate agriculture-related information, and build and perfect rural integrated information services. In the continuous development of the countryside, attention should be paid to comprehensive and coordinated rural development.

4.2 Pay Attention to the Construction of the Information Network Infrastructure

One of the important prerequisites of building a good information service system is the construction of the basic network, which is a precondition for the construction of a rural information system. Even though most rural areas have achieved good results in the relevant projects, the penetration of computers is still very low, so a computer network should be constructed. This includes the construction of a public information service platform, the training of information talents, and the establishment of a hierarchical information service organization and its personnel. In the course of constructing the rural information service system, the information network infrastructure must be given due importance.

4.3 Training Information Service Talents

At present, the level of knowledge and culture in rural areas is still low. In the course of rural development, the amount of information available is limited and the application of information is scarce, which shows a lack of the necessary information personnel. However, the construction of the rural information service system needs such talents, and the quality requirements for them are gradually increasing. To construct the information service, good training is required; training the corresponding talents can speed up the pace of construction.

4.4 Promote the Development of Rural E-Commerce and Consulting Services

The construction of the rural information service system should serve the development of new industries in rural areas and provide advantages to emerging industries in rural construction. The construction of the rural information service system needs corresponding electronic commerce and consulting services. Facing such development, we should further promote the development of e-commerce and consulting services in rural areas [5] and thereby promote the development of the rural economy.

FUND

* This work was supported by the Hengyang Social Science Fund general project "Hengyang rural information service platform construction research" (2014C005), the Hengyang Social Science Fund general project "Hengyang rural information service system construction of comprehensive evaluation and countermeasure research" (2014C015), and the Hunan Institute of Technology science and technology research project "Hengyang rural information service platform construction mode research and design implementation" (HY14008).

References
[1] Zou Jihong, Yang Hongjun, Yan Ying, et al. Four levels of network linkage to explore rural
information service model [J]. agricultural network information, 2010,13 (03): 45

[2] Wu Longting, Ron, Lin Yuan, et al. China’s agricultural information and rural information service
system construction process [J]. China information industry, 2014,8 (15): 13
[3] Chen Bierui. Journal of municipal construction and measures of rural comprehensive
information service system for [J]. Xi’an University of Posts and telecommunications, 2014,10
(03): 461
[4] Miao Runlian, Liu Juan, Peng Gang July, Beijing. The practice and Countermeasures of rural
information service [J]. Guangdong agricultural science, 2010,11 (07): 78
[5] Huai Guo Zheng, Su Fen sun, Cui Ping Tan. To strengthen the rural information resources development and promote rural informatization process [J]. Agricultural Library Information Journal, 2013, 9(05): 89.
Wen-qing HUANG, Qiang WU*
Image Retrieval Algorithm based on Convolutional
Neural Network
Abstract: A convolutional neural network (CNN) is a kind of artificial neural network.
It is widely used in image recognition and other fields because of its advantages in
weight sharing. An image retrieval algorithm based on CNN is proposed in this paper.
Firstly, the image is input into a pre-trained CNN, and the features of an image are
extracted and normalized. Then the target image is matched with the feature base
image, and the result of matching is sorted by means of cosine similarity method.
Finally, the top 36 images are displayed according to the result of sorting. Experimental
results show that our approach can improve the accuracy of image retrieval compared
with the traditional image retrieval algorithm.

Keywords: image retrieval; Convolutional Neural Network; feature extraction; Cosine Similarity

1 Introduction

In the past ten years, content-based image retrieval (CBIR) has been discussed by many scholars and many image retrieval systems [1-2] have been established. But most of them cannot meet users' demands completely. The main reason is that there exists a gap between low-level features and high-level semantics [3]: the underlying visual characteristics cannot fully reflect and match the user's query intention.
In recent years, deep neural network [4-5] technology has been used in the field
of image recognition. Excellent results have been achieved with the development of
this method. As a kind of deep neural network, convolutional neural network has the
advantage of greater ease of training than other types of depth networks. In the course
of its development, Zhu et al. [6] used LBP and Gabor algorithm to extract features,
and achieved good results in the field of facial expression recognition. In 1998,
Yann LeCun et al. [7] proposed a classical convolutional neural network algorithm,
which was used to recognize handwritten letters, and it has proved a very successful
algorithm. Afterwards, Alex Krizhevsky et al. [8] won the title in LSVRC by improving
the convolutional neural network algorithm. Meanwhile, Girshick et al. [9] proposed
the Region-based Convolutional Network method and used the trained CNN model
to extract characteristics and adopted SVM [10] to train. It became the standard of

*Corresponding author: Qiang WU, School of Information Science and Technology, Zhejiang Sci-Tech
University, Hangzhou, China, E-mail: 1161941191@qq.com
Wen-qing HUANG, School of Information Science and Technology, Zhejiang Sci-Tech University,
Hangzhou, China

algorithms in the target detection field, and recently the Fast Region-based Convolutional Network method was proposed. Many young scholars have joined the study of this field in the last two years. For instance, Ruoyu Liu et al. [11] attempted to develop a new image indexing framework on the basis of CNN features. Instead of projecting each CNN vector from the original feature space into a binary space, they adapted a BoW [12] model and an inverted table to deal with high-dimensional global features. Kevin Lin et al. [13] analyzed deep learning of binary hash codes for fast image retrieval. Their idea was that when data labels are available, binary codes can be learned by employing a hidden layer to represent the latent concepts that dominate the class labels.

Traditional content-based image retrieval systems usually extract color, texture and scale-invariant feature transform [14-16] descriptors to retrieve picture content. In this paper, we analyze the classical convolutional neural network and, on the basis of previous studies, propose a new and efficient pre-trained neural network model to extract image features; the experimental results are compared with three traditional algorithms. For ranking, a cosine similarity measurement method is combined to further improve the retrieval performance of the system.

2 The classical CNN model

The convolutional neural network was inspired by the biological receptive field mechanism [17]; it is a feedforward neural network [18].
In 1998, Yann LeCun et al. [7] proposed the convolutional neural network model LeNet-5, as shown in Figure 1.

Figure 1. LeNet-5 model of CNN

There are 8 layers in total in LeNet-5: 1 input layer; 3 convolution layers C1, C3 and C5; 2 sub-sampling layers S2 and S4; 1 fully connected layer; and 1 output layer.
a) The input layer is used to enter the picture; the picture size is uniformly 32×32.

b) The convolution layers perform feature mapping of the input data. In LeNet-5, a connection table, shown in Figure 2, is designed to describe the dependence relationship among the feature maps of different layers.
c) As the data size of the feature maps output by a convolution layer is relatively large, which is bad for follow-up analysis and processing, the dimension is reduced in the sub-sampling layers.
d) There are 84 neurons in the fully connected layer F6, and both the number of connections and the number of trainable parameters are 10164.
e) The output layer is composed of 10 Euclidean radial basis function units.

Figure 2. The connection Table of C3 layer in LeNet-5
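To make the layer arrangement listed above concrete, the following is a minimal sketch of a LeNet-5-style network written in Python with PyTorch. It is an illustration only, not the paper's implementation: the original RBF output layer is replaced by an ordinary linear layer and the hand-designed C3 connection table by full connections.

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """A LeNet-5-style network: C1, S2, C3, S4, C5, F6 and a 10-way output."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.c1 = nn.Conv2d(1, 6, kernel_size=5)     # 1@32x32 -> 6@28x28
        self.s2 = nn.AvgPool2d(2)                    # 6@28x28 -> 6@14x14
        self.c3 = nn.Conv2d(6, 16, kernel_size=5)    # -> 16@10x10 (full connections here)
        self.s4 = nn.AvgPool2d(2)                    # -> 16@5x5
        self.c5 = nn.Conv2d(16, 120, kernel_size=5)  # -> 120@1x1
        self.f6 = nn.Linear(120, 84)                 # 84 neurons, as in the text
        self.out = nn.Linear(84, num_classes)        # stands in for the 10 RBF units
        self.act = nn.Tanh()

    def forward(self, x):                            # x: (batch, 1, 32, 32)
        x = self.s2(self.act(self.c1(x)))
        x = self.s4(self.act(self.c3(x)))
        x = self.act(self.c5(x)).flatten(1)
        x = self.act(self.f6(x))
        return self.out(x)

# Quick shape check on a dummy 32x32 grayscale image
print(LeNet5()(torch.zeros(1, 1, 32, 32)).shape)     # torch.Size([1, 10])
```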

3 The CNN model

3.1 The network structure model

The convolutional neural network in this paper is based on MatConvNet, a MATLAB toolbox for implementing CNNs for computer vision applications. It is efficient and can run and learn state-of-the-art CNNs, and many pre-trained CNNs for image classification, segmentation, face recognition and text detection are available. Compared to Caffe [19], this toolkit is more compact and efficient, and it can be used to run and learn some of the popular convolutional neural networks.
Figure 3 shows the composition of the convolution network model used in this paper; the operational formulas and operating steps of the model are labeled between every two layers. It is similar to the model in Figure 1, but the processing of the layers is slightly different. In the figure, layers C1 to C5 are the convolution layers and layers fc1 to fc3 are the fully connected layers. Picture sizes in this convolutional neural network are required to be consistent, so the original picture size is uniformly adjusted to 224×224 before the pictures are entered into the network. Each channel is taken as a feature map and processed individually. The picture then enters the convolution layers. After the convolution operations of C1 and C2, equations (2), (3) and (5) are performed; after the convolutions of C3 and C4, equation (2) is performed; after the convolution of C5, equations (2) and (5) are performed. Finally, there are three fully connected layers: equation (5) is performed after the fully connected layers fc1 and fc2, and after the fully connected layer fc3 a softmax classifier is applied. In this way the trained picture characteristics are obtained. For the sake of the cosine similarity measurement method applied in this paper, the extracted image features are normalized by equation (4).

Figure 3. The network structure model
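The overall pipeline (pre-trained CNN, fixed 224×224 input, feature extraction, l2 normalization) can be sketched as follows. This is not the authors' MatConvNet code; it is a minimal Python illustration that assumes a recent torchvision with a pre-trained AlexNet as a stand-in for the pre-trained model, and the helper name `extract_feature` is ours.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# A pre-trained ImageNet CNN used only as a fixed feature extractor
# (a stand-in for the pre-trained MatConvNet model described in the paper).
cnn = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),            # the paper resizes every image to 224x224
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_feature(path: str) -> torch.Tensor:
    """Return an l2-normalized feature vector for one image (cf. equation (4))."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        f = cnn.features(x)                      # convolutional part (C1-C5)
        f = cnn.avgpool(f).flatten(1)
        f = cnn.classifier[:5](f)                # up to the second fully connected layer
    return torch.nn.functional.normalize(f, p=2, dim=1).squeeze(0)
```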

3.2 The Formula definition

In a CNN, the role of the convolution layer is to extract features from local areas; each filter corresponds to one feature extractor. Figure 4 shows a common two-dimensional convolution layer.

Figure 4. Two-dimensional convolution layer

Given an image $x_{ij}$ and a filter $f_{uv}$ (here a mean filter), the output feature map $y_{ij}$ of the convolution used in this paper can be represented as:

$$y_{ij} = \sum_{u=1}^{m} \sum_{v=1}^{n} f_{uv} \cdot x_{i-u+1,\, j-v+1} \qquad (1)$$

where $1 \le i \le M$, $1 \le j \le N$, $1 \le u \le m$, $1 \le v \le n$, $m \ll M$ and $n \ll N$.


Biological neurons have certain characteristics, such as unilateral suppression, a wide excitation border and sparse activation. In order to enhance the expressive ability of the network, it is necessary to introduce a continuous nonlinear activation function to process the input signal x; in this paper the rectifier function is used:

$$\mathrm{rectifier}(x) = \max(0, x) \qquad (2)$$

Compared with the traditional sigmoid function, the rectifier function gives better sparsity and avoids the sigmoid saturation problem, in which the gradients obtained by gradient descent in the front layers of a deep network are too small, so that the front of the network is left almost randomly transformed and only the last few layers perform the real classification. Moreover, the whole network then only needs comparison, addition and multiplication operations, so the computation is also more efficient.
During network training the parameters of each layer are updated constantly, and these updates change the distribution of the input to the next layer. To overcome this problem, a normalization function can be added after the convolution layers. This improves the efficiency of learning: the training samples are mean-normalized, i.e. the mean computed on the training samples is stored and subtracted, instead of accommodating small-scale dimensions as before. In this paper two linear normalizations, zero-mean normalization and l2 normalization, are formulated as follows:

$$z = \frac{x - \mu}{\sigma} \qquad (3)$$

$$\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x^T x} \qquad (4)$$

where μ is the mean of the original dataset and σ is its standard deviation.

Although a convolution layer can significantly reduce the number of connections, the number of neurons in each feature map is not significantly reduced. If a classifier follows directly, its input dimension is still very high and over-fitting easily appears. To solve this problem, a pooling (subsampling) operation is added after the convolution layers. A subsampling layer greatly reduces the dimensionality of the features and avoids over-fitting. In this paper the max-pooling operation is used:

$$\mathrm{pool}_{\max}(R_k) = \max_{i \in R_k} a_i \qquad (5)$$

where $R_k$ is one of the regions into which the feature map is divided. A large number of experiments show that the max-pooling operation not only reduces the number of dimensions but is also robust to small displacements.
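As an illustration of equations (2)-(5), here is a minimal NumPy sketch (the function names are ours, not from the paper):

```python
import numpy as np

def rectifier(x):
    """Equation (2): ReLU activation."""
    return np.maximum(0.0, x)

def zero_mean(x, mu, sigma):
    """Equation (3): zero-mean normalization with a stored dataset mean/std."""
    return (x - mu) / sigma

def l2_normalize(x):
    """Equation (4): scale a feature vector to unit l2 norm."""
    return x / np.sqrt(np.sum(x ** 2))

def pool_max(feature_map, k=2):
    """Equation (5): non-overlapping k x k max pooling over a 2-D feature map."""
    h, w = feature_map.shape
    h, w = h - h % k, w - w % k                      # drop any ragged border
    blocks = feature_map[:h, :w].reshape(h // k, k, w // k, k)
    return blocks.max(axis=(1, 3))

# Example: pool a 4x4 map down to 2x2
print(pool_max(np.arange(16, dtype=float).reshape(4, 4)))
```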
To rank the search results, two kinds of measurement methods were considered in this paper: the Euclidean metric and the cosine metric. Euclidean distance reflects the absolute difference of numerical characteristics between individuals and is more often used when the analysis concerns differences in magnitude. Cosine distance distinguishes differences in direction and is insensitive to absolute values; it is more often used to evaluate content, to distinguish similarity and differences of user interest, and it corrects the problem that the measurement standard is not unified between users. Therefore, in this paper, after normalization of the image features, the results returned for a query image are sorted by the cosine similarity measure.
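A small sketch of this ranking step follows (an illustration only; it assumes the l2-normalized feature vectors from the extraction step above, so cosine similarity reduces to a dot product, and the function name `rank_by_cosine` is ours):

```python
import numpy as np

def rank_by_cosine(query_feature, database_features, top_k=36):
    """Return indices of the top_k database images most similar to the query.

    query_feature:     l2-normalized vector of shape (d,)
    database_features: l2-normalized matrix of shape (n_images, d)
    """
    similarities = database_features @ query_feature   # cosine similarity per image
    order = np.argsort(-similarities)                   # descending similarity
    return order[:top_k], similarities[order[:top_k]]

# Toy usage with random unit vectors
rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 4096))
db /= np.linalg.norm(db, axis=1, keepdims=True)
q = db[42]                                              # the query itself is in the database
idx, sims = rank_by_cosine(q, db)
print(idx[0], sims[0])                                  # best match is the query itself (similarity ≈ 1)
```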
Image retrieval techniques typically use precision, the recall rate and the mean average precision (MAP) as evaluation indices. In the following formulas, A is the number of relevant images in the search results, B is the number of irrelevant images in the search results, C is the number of relevant images that were not retrieved, P is the precision and R is the recall rate:

$$\mathrm{precision} = \frac{A}{A + B} \qquad (6)$$

$$\mathrm{recall} = \frac{A}{A + C} \qquad (7)$$

$$\mathrm{MAP} = \int_0^1 P(R)\, dR \qquad (8)$$
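As an illustration of how these indices can be computed for one ranked result list (a sketch under the assumption that relevance is known for every returned image; average precision is approximated here by averaging precision at each relevant hit rather than by the integral in (8)):

```python
def precision_recall_ap(ranked_relevance, n_relevant_total):
    """ranked_relevance: list of 0/1 flags for the returned images, in rank order.
    n_relevant_total:    number of relevant images in the whole database (A + C).
    Returns (precision, recall, average precision) for this query."""
    A = sum(ranked_relevance)                      # relevant images retrieved
    B = len(ranked_relevance) - A                  # irrelevant images retrieved
    precision = A / (A + B) if ranked_relevance else 0.0
    recall = A / n_relevant_total
    # Average precision: mean of precision@k over the ranks k of relevant hits.
    hits, ap = 0, 0.0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            ap += hits / k
    ap = ap / n_relevant_total if n_relevant_total else 0.0
    return precision, recall, ap

# Example: 5 returned images, 3 of them relevant, 100 relevant images exist in total
print(precision_recall_ap([1, 0, 1, 1, 0], 100))   # (0.6, 0.03, ...)
```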

4 Experimental Results and Analysis

4.1 Image database

In this paper, the Corel1K database and the Corel database are used. The Corel1K image set contains 1000 pictures in 10 categories of 100 pictures each: horses, mountains, beaches, tribal people, dinosaurs, elephants, buses, flowers, food and buildings. The Corel image set contains 10000 pictures in 100 categories of 100 pictures each; images in the same class carry the same semantic information, and the categories include flowers, birds, beaches, etc.

4.2 Experiment results

To measure the overall effect of the proposed algorithm, this paper retrieves the Top 100, that is, the first 100 returned images are ranked. The mean average precision (MAP) is calculated for the CNN features and for three traditional visual features provided in previous literature: the hue-saturation-value (HSV) color feature, the gray-level co-occurrence matrix (GLCM) feature and the scale-invariant feature transform (SIFT) feature. The results are listed in Table 1, Table 2, Figure 5 and Figure 6:

Table 1. The comparison on map of four algorithms of 10 top 100 in Corel1K picture set

visual features HSV GLCM SIFT CNN

MAP 0.43 0.39 0.53 0.79

Table 2. The comparison on map of four algorithms of 100 top 100 in Corel picture set

visual features HSV GLCM SIFT CNN

MAP 0.21 0.19 0.40 0.70

Figure 5. Precision-Recall Rate of Four Algorithm in Corel1k Picture Set

Figure 6. Precision-Recall Rate of Four Algorithm in Corel Picture Set



From Table 1 it can be concluded that, in the Corel1k database, the average precision of the retrieval algorithm based on the convolutional neural network is 0.36, 0.40 and 0.26 higher than that of the three traditional retrieval algorithms, respectively. From Table 2 it can be seen that when the image library is expanded, the MAP of the four algorithms drops, by 0.22, 0.20, 0.13 and 0.09 respectively. So it can be concluded that as the image database grows, the corresponding MAP is reduced. To further verify this conclusion, the precision-recall curves obtained experimentally on the Corel database can be referred to.
Figure 5 shows the precision-recall curves of the four kinds of visual features. It can be seen that the area between the convolutional neural network's precision-recall curve and the coordinate axes is larger than that of the other three traditional methods. In an ordinary information retrieval system, the larger the area between the curve and the axes, the higher the accuracy of the system; in the most ideal system the area should be as close to 1 as possible, and in any system it should be larger than 0. Figure 6 shows that the area under the convolutional neural network's precision-recall curve is again much larger than that of the other three traditional methods. Compared with Figure 5, however, the curves are flatter, which indicates a smaller area; this illustrates from another angle that the average retrieval accuracy declines as the picture library grows. So it can be concluded from the experiments that, in general, the convolutional neural network algorithm based on deep learning is more accurate than the traditional image retrieval algorithms.
To compare the convolutional neural network algorithm directly with the other three algorithms, Figure 7 shows the top 36 images retrieved from Corel1K by the four algorithms for two target images.
Panels (a), (b), (c) and (d) correspond to the retrieval results of the four visual features, namely CNN, SIFT, GLCM and HSV; the unrelated images are marked by a red border. From the figure we can conclude that the retrieval effect of the convolutional neural network is obviously better than that of the other three algorithms.
Correspondingly, panels (e), (f), (g) and (h) again correspond to the retrieval results of the CNN, SIFT, GLCM and HSV features. The figure shows directly that the convolutional neural network's retrieval results are again obviously superior to those of the remaining three algorithms, which further verifies the conclusions of this paper and illustrates that the image retrieval algorithm based on the convolutional neural network is superior to the conventional ones.

Figure 7. The results of the 4 algorithms return top36 in Corel1K

5 Conclusions

This paper presents an image retrieval algorithm based on a convolutional neural network model and compares it with three traditional image retrieval algorithms. The experimental results show that the pre-trained convolutional neural network model has a good effect on image retrieval and is better than the traditional methods. In this paper, cosine similarity was combined with the model to improve the application of convolutional neural networks in image retrieval. However, the proposed method still needs improvement: because the convolutional neural network model used here was pre-trained, the image features extracted from different photo galleries may not give good retrieval results. Further research will be carried out to improve the accuracy of image retrieval.

References
[1] Do, Thanh-Toan, Kijak, Ewa, Furon, Teddy, et al. Deluding image recognition in sift-based cbir
systems[J]. Acm Multimedia in Forensics Security & Intelligence, 2010, 312(1153):7-12.
[2] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN Database: large-scale scene
recognition from Abbey to Zoo. CVPR, 2010.
[3] Kundu M K, Chowdhury M, Bulò S R. A graph-based relevance feedback mechanism in content-
based image retrieval[J]. Knowledge-Based Systems, 2015, 73:254-264.
[4] Lin K, Yang H F, Hsiao J H, et al. Deep learning of binary hash codes for fast image retrieval[C]//
IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2015:27-35.
[5] Zhao F, Huang Y, Wang L, et al. Deep semantic ranking based hashing for multi-label image
retrieval[C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer
Society, 2015:1556-1564.
[6] Zhu Z, Zhao C, Hou Y. Texture Image Retrieval Based on NSCT, GLCM, and LBP[J]. Journal of
Convergence Information Technology, 2011, 6(11):418-425.
[7] Lécun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J].
Proceedings of the IEEE, 1998, 86(11):2278-2324.
[8] Krizhevsky A, Sutskever I, Hinton G E. ImageNet Classification with Deep Convolutional Neural
Networks[J]. Advances in Neural Information Processing Systems, 2012, 25(2):2012.
[9] Gkioxari G, Girshick R, Malik J. Contextual Action Recognition with R*CNN[C]// IEEE International
Conference on Computer Vision. IEEE, 2015:1080-1088.
[10] Tong, Simon, Chang, et al. Support vector machine active learning for image retrieval[J]. 2015.
[11] Liu F, Lin G, Shen C. Indexing of CNN Features for Large Scale Image Search[J]. Pattern
Recognition, 2015, 48(10):2983-2992.
[12] Friedlander T, Mayo A E, Tlusty T, et al. Evolution of Bow-Tie Architectures in Biology[J]. Plos
Computational Biology, 2015, 11(3).
[13] Lin K, Yang H F, Hsiao J H, et al. Deep learning of binary hash codes for fast image retrieval[J].
2015:27-35.
[14] Bala A, Kaur T. Local texton XOR patterns: A new feature descriptor for content-based image
retrieval[J]. Engineering Science & Technology an International Journal, 2015, 19(1):101-112.
[15] Wenbing Chen, Qizhou Li, Keshav Dahal, ROI image retrieval based on multiple features of
mean sift and expectatiomaximisation Original Research Article Digital Signal Processing,
Volume 40, May 2015, Pages 117-130.

[16] Guo J M, Prasetyo H, Chen J H. Content-Based Image Retrieval Using Error Diffusion Block
Truncation Coding Features[J]. IEEE Transactions on Circuits & Systems for Video Technology,
2015, 25(3):466-481.
[17] Gilbert C D, Wiesel T N. Receptive field dynamics in adult primary visual cortex[J]. Nature, 1992,
356(6365): 150-152.
[18] Shuguang Liu, Chongxun Zhen, Mingyuan Liu. Back propagation algorithm and its
i-mprovement in feedforward neural network[J]. Computer Science, 1996, 23(1): 76-79.
[19] Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional Architecture for Fast Feature Embedding[J]. Eprint Arxiv, 2014:675-678.
Bing ZHOU*, Juan DENG
A Cross-domain Optimal Path Computation
Abstract: This article introduces a hierarchical PCE architecture that computes optimal end-to-end paths in a cross-domain network environment. The architecture uses a typical top-level PCE that communicates and cooperates with every sub-PCE in the sub-domains, and completes the establishment of an end-to-end path across domains without a domain sequence being given in advance. Finally, the article analyses the merits of this optimal path computation architecture and gives directions for improvement after comparing it with existing computational models.

Keywords: optimal path computation, PCE, cross-domain, multi-domain networks

1 Introduction

Many standard documents [3, 4] define requirements and instructions for the capability of computing end-to-end routes for Label Switched Paths (LSPs) in cross-domain Multi-Protocol Label Switching Traffic Engineering (MPLS-TE) and Generalized Multi-Protocol Label Switching (GMPLS) networks. To meet the needs of these computations, the Path Computation Element (PCE) architecture was proposed and has been used to establish and control cross-domain MPLS-TE LSPs and GMPLS networks.
According to the demands of management, geographic location and switching environment, a network can be divided into multiple domains connected through boundary nodes. Since a single path computation node cannot obtain the information of all links in a multi-domain network environment, a key problem to be solved is how to determine the end-to-end traffic engineering path. In a single-domain environment, the method of computing a path between nodes with a PCE is relatively simple and direct. But computing a path whose head and tail nodes lie in different domains requires the collaboration of several PCEs to finish the end-to-end path computation. Many standard documents [1, 2] put forward relevant solutions, including per-domain path computation and the computation approach based on the Backward Recursive Path Computation (BRPC) [5] mechanism. However, the application of these methods relies on the end-to-end domain sequence being known. In fact, to ensure a cross-domain optimal end-to-end path, one critical step is the determination of the domain sequence and another is how to compute

*Corresponding author: Bing ZHOU, City College Wuhan University of science and technology, Hubei
Wuhan, 430000, E-mail: 7676614@qq.com
Juan DENG, City College Wuhan University of science and technology, Hubei Wuhan, 430000

these paths. But the techniques and methods for determining the domain sequence are not explicitly given in the relevant standard literature.
Based on the above description, this article discusses and introduces techniques and methods for establishing the optimal end-to-end path without needing to obtain the domain sequence in advance. The technique shows how to extend the existing PCE architecture so that it can be used to calculate and choose the best domain sequence while generating the optimal end-to-end path within that domain sequence.

2 Hierarchical PCE

In the hierarchical PCE architecture, a top-level parent PCE maintains a domain topology view that contains all sub-domains (represented as nodes) and the connection relationships between domains (topology links). The parent PCE does not hold any information internal to the sub-domains: because of domain information confidentiality and the scale problems that flooding would bring, the parent PCE has no sub-domain resource availability or connectivity information. The parent PCE does hold the TE attributes of the inter-domain connection links, which are stored in its own topology view. For domain connections that do not go through an IGP area link, an appropriate mechanism is needed in the parent PCE topology view to abstract some virtual links, whose cost can be regarded as unlimited or as a minimum (0).
Each sub-domain contains at least one PCE that can be used to compute paths through that domain; these PCEs are called sub-PCEs and connect to the parent PCE. Each sub-PCE identifies all of its domain's neighbors at the domain level; each sub-PCE only knows its own domain topology, and the topology of other domains is not visible to it. A sub-PCE does not know the inter-domain connectivity across the whole cross-domain network (the topology visible to the parent PCE), but only knows its own link connectivity with adjacent domains.
The parent PCE's domain topology view can be constructed by configuration or by accepting information reported from the sub-PCEs. Note that this topology view does not contain any intra-domain information of the sub-domains.
When a cross-domain path needs to be computed, the PCE of the first domain sends a computation request to the parent PCE through the PCE protocol (PCEP, [RFC5440]), and the parent PCE, based on the domain topology and the inter-domain link state, selects a set of candidate domain paths. The parent PCE then sends computation requests to the sub-domain PCEs along each candidate domain path. These requests can be sent in parallel or serially, depending on the implementation.
Each sub-PCE computes a set of candidate path segments through its own domain and sends the results back to the parent PCE. The parent PCE uses these data to select the appropriate path segments, stitches them together, and produces the optimal end-to-end cross-domain path. This path is then sent to the sub-PCE that issued the first computation request and passed to the PCC that initiated the request.
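To make the parent/sub-PCE division of labour concrete, the following is a minimal Python sketch (our own illustration, not part of the PCE standards): the parent PCE enumerates candidate domain sequences over its abstract domain graph, a hypothetical `sub_pce_segment_cost` callback stands in for each sub-PCE's intra-domain computation, and the parent keeps the cheapest stitched combination.

```python
def domain_sequences(domain_graph, src_domain, dst_domain, path=None):
    """Enumerate loop-free candidate domain sequences over the abstract topology."""
    path = path or [src_domain]
    if path[-1] == dst_domain:
        yield list(path)
        return
    for nxt in domain_graph.get(path[-1], []):
        if nxt not in path:
            yield from domain_sequences(domain_graph, src_domain, dst_domain, path + [nxt])

def optimal_cross_domain_path(domain_graph, src_domain, dst_domain, sub_pce_segment_cost):
    """Parent-PCE stitching: ask each sub-PCE for its segment cost and keep the best total."""
    best_cost, best_seq = float("inf"), None
    for seq in domain_sequences(domain_graph, src_domain, dst_domain):
        # Sum of intra-domain segment costs reported by the sub-PCEs
        # (inter-domain link costs are folded into the segment costs here for brevity).
        cost = sum(sub_pce_segment_cost(d) for d in seq)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq, best_cost

# Toy abstract topology loosely based on Figure 1 (domain adjacencies only;
# the segment costs are invented for illustration).
domain_graph = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
segment_costs = {1: 2, 2: 5, 3: 2, 4: 3}
print(optimal_cross_domain_path(domain_graph, 1, 3, segment_costs.get))
# -> ([1, 4, 3], 7): the cheaper of the two candidate domain sequences
```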

3 Hierarchical PCE example

The hierarchical domain topology is described below with an example.

BN21 BN31
BN11 BN23

PCE1 PCE2
S PCE3 D

BN22 BN32
BN24
Domain 1 BN12 Domain 3
BN13 Domain 2 BN33

BN41 PCE4 BN42

PCE5
Domain 4

Domain 5

Figure 1. Level domain topology examples

Figure 1 shows the interconnection of the four domains; the outermost layer is a top-level domain, and each domain contains a PCE. Specifically:
–– Domain 1 is the first (source) domain; sub-PCE1 is responsible for computing paths within it. Its neighbor domains are domain 2 and domain 4, and it contains the head node LSR (S) and three boundary nodes (BN11, BN12 and BN13);
–– Domain 2 has sub-PCE2 responsible for its intra-domain computations; its neighbor domains are domain 1 and domain 3, and it contains four boundary nodes (BN21, BN22, BN23 and BN24);
–– Domain 3 is the last (destination) domain; sub-PCE3 is responsible for computing paths within it. Its neighbor domains are domain 2 and domain 4, and it contains the tail node LSR (D) and three boundary nodes (BN31, BN32 and BN33);
–– Domain 4 has sub-PCE4 responsible for its intra-domain computation; its neighbor domains are domain 1 and domain 3, and it contains two border nodes (BN41 and BN42).

All of these domains are located inside the top-level domain 5, whose parent PCE is PCE5.

Domain 5 PCE5

D1 D2 D3

D4

Figure 2. Parent PCE visual abstraction domain topology view

Figure 2 shows the domain topology view visible to the parent PCE, a topological abstraction. Through this view, PCE5 sees the connectivity between domains but cannot know the topology information inside the sub-domains.

3.1 Hierarchical PCE Initial Information Exchange

Based on the topology of Figure 1, the initial information exchange that builds the domain topology in the hierarchical PCE architecture is as follows:
–– Configure the address of the parent PCE, PCE5, on PCE1;
–– Parent PCE5 establishes a communication link with sub-PCE1;
–– By listening to the interior gateway protocol, PCE1 obtains the inter-domain links between its own domain and its neighbor domains;
–– PCE1 reports the connection information between its domain and its neighbor domains to PCE5;
–– If the resource availability of an inter-domain link of PCE1's domain changes, it needs to be reported to PCE5;
–– The other domains are processed as in steps 1 to 5; finally PCE5 uses the information reported by the PCEs to generate the domain topology shown in Figure 2.

3.2 Hierarchical PCE Architecture End to End Path Calculations

Based on the topology presented in Figure 1, a typical end-to-end cross-domain path computation in the hierarchical PCE architecture, triggered by a computation request from a source PCC, proceeds as follows. It is assumed that the architecture has completed the initial communication process described above, so that the communication links between the parent PCE and the sub-PCEs are established and the domain topology has been built.
–– The source PCC, usually the head node LSR, sends a request to the sub-PCE of the first domain to compute a path from node S to node D;
–– The first-domain PCE1 determines whether the tail node is in its own domain; here it is not;
–– PCE1 sends a path computation request to the parent PCE5;
–– The parent PCE determines that the tail node is in domain 3;
–– The parent PCE, according to its visible cross-domain topology information, computes the domain paths: it excludes inter-domain links determined to be unavailable and enumerates all available domain paths. In this example BN12-BN22 is claimed to be unavailable, so a total of three domain paths are calculated;
–– The parent PCE sends computation requests for boundary-to-boundary intra-domain paths to all sub-domain PCEs on the candidate domain paths;
–– The parent PCE sends the first domain a request to compute the paths from the source to the domain boundaries;
–– The parent PCE sends the last domain a request to compute the paths from the domain boundaries to the destination;
–– After the parent PCE has collected all the computation results returned by the sub-PCEs, it combines them with the inter-domain links, the computation request and the locally configured routing policies, and splices them into complete end-to-end cross-domain paths;
–– From the paths obtained by the above splicing, the parent PCE selects the optimal path that meets the end-to-end computation constraints and sends the result to the first-domain PCE1;
–– PCE1 returns the path computation result to the PCC.

4 Performance analysis

Two cross-domain path computation schemes are compared: BPCA (Basic Path Computation Algorithm) and OPCA (Optimal Path Computation Algorithm). BPCA computes the shortest path for a service across domains with the Dijkstra algorithm, whereas OPCA computes the optimal path through the cooperation of the top PCE and the sub-PCEs. As shown in Figure 3, as the traffic load increases, the blocking probability of OPCA is clearly better than that of BPCA for cross-domain services, and the average hop count of OPCA is lower than that of BPCA. This shows that OPCA achieves better performance for cross-domain services.

Figure 3. Performance of the two path computation schemes for cross-domain service: (1) blocking probability; (2) average hops

5 Conclusion

This article describes a hierarchical PCE architecture for calculating optimal paths in a cross-domain network environment. The architecture uses a typical top-level PCE and sub-PCEs coordinating the sub-domains of the network; through mutual communication and collaboration, and without knowing the cross-domain domain sequence in advance, it completes the establishment of a cross-domain end-to-end path. The article thus provides a way to perform cross-domain optimal path computation correctly and efficiently. The architecture can be applied to a variety of network environments and has many advantages compared with existing PCE-based computation methods. There is still room for further thought and research on efficient computation, the additional maintenance overhead and the risk of a single point of failure.

References
[1] [RFC5541] Roux, J., Vasseur, J., and Y. Lee, “Encoding of Objective Functions in the Path
Computation Element Communication Protocol (PCEP)”, RFC5541, December 2012.
[2] [RFC5520] Bradford, R., Vasseur, J.P., and Farrel, A., "Preserving Topology Confidentiality in Inter-Domain Path Computation Using a Key-Based Mechanism", RFC5520, April 2015.
[3] [RFC5152] Vasseur, JP., Ayyangar, A., and R. Zhang, "A Per-Domain Path Computation Method for Establishing Inter-Domain Traffic Engineering Label Switched Paths", RFC5152.
[4] BRADFORD R, VASSEUR J.P., FARREL A. RFC5520, Preserving Topology Confidentiality in Inter-Domain Path Computation Using a Key-Based Mechanism [Z]. Internet Society, 2009.
[5] [RFC5441] Vasseur, J.P., Ed., “A Backward Recursive PCE-based Computation (BRPC) procedure
to compute shortest inter-domain Traffic Engineering Label Switched Paths”, RFC5441, April
2014.
Feng LIU, Huan LI, Zhu-juan MA, Er-zhou ZHU*
Collaborative Filtering Recommendation Algorithm
based on Item Similarity Learning
Abstract: The collaborative filtering recommendation algorithm is a classic and widely used recommendation algorithm in business. The performance of item-based collaborative filtering depends on the computation of item similarities. However, data sparsity and the cold-start problem of new items usually have a huge effect on the item similarity computation. Aiming at these problems, this paper proposes a collaborative filtering recommendation algorithm based on item similarity learning. Specifically, by evaluating attribute similarity, the similarity matrix of all items is first computed. Secondly, in order to avoid the cold-start problem caused by new items, the top K most similar items of each item are selected. Thirdly, by receiving the score vectors of the K adjacent items of each item, an RBF neural network learns the item similarity training model, and the new score vector of each item is generated. Lastly, by putting the new score vectors of the K adjacent items of each item in the test data set into the training model, the predicted score vectors of all items are derived. Several comparative experiments show that the proposed algorithm is not only able to solve the cold-start problem of new item recommendation, but also performs well on sparse data and achieves better recommendation results.

Keywords: Item Similarity Learning; RBF Neural Network; Collaborative Filtering; Recommendation Algorithm

1 Introduction

With the rapid development of mobile technology and the Internet, users are faced with many choices. Which book is worth reading? Which restaurant has more delicious food? Which way of travelling costs less? Which movie is more exciting? How to provide high-quality recommendations to users in the age of big data is the main challenge for recommendation systems. Recommendation systems have already been widely used in many fields, such as friend recommendation on Sina Micro-Blog, Douban movie recommendation, Amazon book recommendation and Net Ease cloud music

*Corresponding author: Er-zhou ZHU, School of Computer Science and Technology, Anhui University,
Hefei 230601, Anhui China, E-mail: ezzhu@ahu.edu.cn
Feng LIU, Huan LI, Zhu-juan MA, School of Computer Science and Technology, Anhui University, Hefei
230601, Anhui China

recommendation. Recommendation systems have already penetrated our society, from business to people's daily life.
In recent years, scholars have proposed many kinds of recommendation
algorithms, mainly divided into three categories: collaborative filtering
recommendation (CF), content-based recommendation, and hybrid recommendation
[1]. Collaborative filtering algorithm is one of the most popular methods. Collaborative
filtering algorithms are mainly divided into three classes: collaborative filtering based
on users, collaborative filtering based on items and collaborative filtering based on
the model [2]. User-based and item-based collaborative filtering algorithms focus on users' behavior data: they find the neighbors of a user (or item) with a similarity measurement method, and then predict a user's assessment of an item from the weighted sum of the rating data of these neighboring users (items). Methods based on user behavior need neither the content information of items nor domain knowledge, and their forecast accuracy keeps increasing over time. But they also have some shortcomings: the new-user cold-start problem, the new-item cold-start problem, and the data sparsity problem.
In order to solve these problems, scholars have put forward model-based collaborative filtering algorithms, mainly including the matrix decomposition model [3], the latent semantic model [4], the Bayesian network model [5] and probabilistic factor models [6]. But these rely on the established models and are more complex to implement. The matrix factorization model can alleviate the data sparsity problem by reducing the dimension of the score matrix, but this also leads to loss of information and a decline in recommendation accuracy. Furthermore, some researchers use matrix filling technology: references [7] and [14] use the learning ability of a BP neural network to fill in the missing scores of the score matrix. This method can effectively solve the data sparsity problem, but the BP neural network suffers from slow learning and is easily trapped in local minima. Reference [8] learns the item similarity by reducing the squared prediction error, an optimization problem solved by stochastic gradient descent; but the learning speed of this kind of method is very slow, and because the attributes of the items themselves are not considered in the item similarity calculation, the method cannot reach satisfactory results.
Aiming at these problems, this paper proposes a collaborative filtering
recommendation algorithm based on the RBF neural network. Under this algorithm, the
initial neighbor set of each item, based on the similarity measure of item attributes,
is calculated first. Then the RBF neural network is used to learn the similarity
of the item with each item in its neighbor set, and satisfactory similarities are
finally obtained. Experimental results have shown
that the proposed algorithm is not only able to solve the new-item-cold-start problem,
but also shows better performance when dealing with sparse data, and finally obtains
more accurate results.

2 Related Work

2.1 Description of Collaborative Filtering Recommendation Algorithm Based on Item

Item-based collaborative filtering recommendation assumes that a user will like items
that are similar to the items he liked before; therefore, item-based collaborative filtering
mainly lies in the calculation of similarity between items [1].

2.1.1 Establishing the user-item rating matrix


User rating data is stored in a two-dimensional matrix Rm×n, where m represents the number
of users and n the number of items. U={u1,u2,u3,…,um} represents the collection of m users,
and I={i1,i2,i3,…,in} represents the collection of n items. The model of a user-item rating matrix is
shown in Table 1.

Table 1. User-item rating table Rm×n

User\Item   i1       ...    ij       ...    in
u1          r1,1     ...    r1,j     ...    r1,n
...         ...      ...    ...      ...    ...
ui          ri,1     ...    ri,j     ...    ri,n
...         ...      ...    ...      ...    ...
um          rm,1     ...    rm,j     ...    rm,n

In Table 1, ri,j indicates the user ui’s score on the item ij, the score reflects the user ui’s
preferences for the item ij. When the item is not scored by the user, we define ri,j =0.

2.1.2 Methods of item similarity measurement


There are four main methods of similarity measurement: Adjusted cosine similarity,
Pearson similarity, Jaccard similarity and item attribute similarity. U(i) represents the
collection of users who score the item i, r*,i means the average score of the item i,
sim(i,j) is the similarity between the item i and item j.
a) Adjusted cosine similarity
Adjusted cosine similarity measure method is used to calculate the similarity between
the item i and item j, as shown in (1):

sim(i, j) = Σ_{u∈U(i)∩U(j)} (r_{u,i} − r_{*,i})(r_{u,j} − r_{*,j}) / ( √(Σ_{u∈U(i)} (r_{u,i} − r_{*,i})²) · √(Σ_{u∈U(j)} (r_{u,j} − r_{*,j})²) )    (1)

b) Pearson similarity

Pearson similarity measure method is used to calculate the similarity between item i
and item j, as shown in (2):


sim(i, j) = Σ_{u∈U(i)∩U(j)} (r_{u,i} − r_{*,i})(r_{u,j} − r_{*,j}) / ( √(Σ_{u∈U(i)∩U(j)} (r_{u,i} − r_{*,i})²) · √(Σ_{u∈U(i)∩U(j)} (r_{u,j} − r_{*,j})²) )    (2)

c) Jaccard similarity
The Jaccard similarity measure does not consider the rating data; instead, it computes the ratio
of the number of users who score both items to the number of users who score either of the
two items [12]. The similarity between item i and item j is shown in (3):
Ui  U j
sim(i, j ) =
Ui  U j
(3)

d) Item attribute similarity


Different items can be classified into different categories according to the attributes
themselves, such as films can be divided into romance, action, thriller, adventure,
fantasy, comedy and so on; books can be divided into categories of youth, children’s
books, life, humanities and Social Sciences, economics and management, science
and technology and so on. An item may also belong to several categories, such as the
movie “The Professional” belongs to the drama, action movies and crime films. So we
define the item category collection C={C1,C2,…,Ck}, where each Ck is a category; the category
set of item i is denoted by Ci (Ci ⊆ C). The similarity between item i and item j can be calculated as
(4):

Ci  C j Ci  C j
sim
= (i, j ) ×
Ci  C j C
(4)

In this formula, |Ci∩Cj|/|Ci∪Cj| is the proportion of the intersection of the category
sets of item i and item j in their union, and |Ci∩Cj|/|C| is the proportion of that
intersection in the entire item category set [9]. Compared with Adjusted cosine similarity
and Pearson correlation similarity, item attribute similarity does not consider the rating
data; only the attributes of the items are considered. The rating data is instead taken into
account during the subsequent learning process, which can greatly improve the accuracy
of item similarity computation.
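To make Eq. (4) concrete, the following is a minimal Python sketch; the function name and the toy category sets are illustrative assumptions, not part of the paper.

def item_attribute_similarity(cat_i, cat_j, all_categories):
    # sim(i, j) = |Ci ∩ Cj| / |Ci ∪ Cj| * |Ci ∩ Cj| / |C|, Eq. (4)
    cat_i, cat_j = set(cat_i), set(cat_j)
    inter = len(cat_i & cat_j)
    union = len(cat_i | cat_j)
    if union == 0 or len(all_categories) == 0:
        return 0.0
    return (inter / union) * (inter / len(all_categories))

# Example: a movie tagged {drama, action, crime} versus one tagged {action, crime}.
C = {"romance", "action", "thriller", "adventure", "fantasy", "comedy", "drama", "crime"}
print(item_attribute_similarity({"drama", "action", "crime"}, {"action", "crime"}, C))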

2.1.3 Producing collaborative filtering recommendation results


According to the above item similarity measures, we calculate the
similarity between each pair of items, which is stored in a two-dimensional symmetric

matrix simn×n. Then the first K most similar items of each item are taken out to
constitute the neighborhood set NK(i). I(u) represents the set of items scored by user u, and
NK(i)∩I(u) represents the first K most similar items of item i that have been
scored by user u. The score of user u for item i is predicted by (5):

r_{u,i} = Σ_{j∈N_K(i)∩I(u)} sim(i, j) × r_{u,j}    (5)
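A minimal sketch of the prediction rule in Eq. (5); the dictionaries used as inputs are illustrative, not data from the paper.

def predict_score(user_ratings, neighbours, sim):
    # r_{u,i} = sum over j in N_K(i) ∩ I(u) of sim(i, j) * r_{u,j}, Eq. (5)
    # user_ratings: dict item -> rating given by user u (I(u))
    # neighbours:   the K most similar items of the target item i (N_K(i))
    # sim:          dict item -> similarity to the target item i
    return sum(sim[j] * user_ratings[j]
               for j in neighbours if j in user_ratings)

# Example: two of the three neighbours have been rated by the user.
print(predict_score({"m1": 4, "m3": 5}, ["m1", "m2", "m3"], {"m1": 0.6, "m2": 0.3, "m3": 0.2}))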

2.2 Radial Basis Function Neural Network

The RBF (Radial Basis Function) neural network is a local-approximation feed-forward network
with strong self-learning and nonlinear fitting abilities [10].
The RBF neural network is a three-layer perceptron with an input layer, a hidden layer (middle
layer) and an output layer. Its structure is shown in Figure 1. X = [x1, x2, …, xn]^T is the
input vector of the network; each hidden layer node has a center vector Cj; the hidden
layer unit function is called a radial basis function, f_j = f_j(‖X − C_j‖), (j = 1, 2, …, m), where ‖·‖ is
the Euclidean norm; and ω_j (j = 1, 2, …, m) are the connection weights between the hidden layer units
and the output layer units. The final output of the network is given by (6):

y = Σ_{j=1}^{m} ω_j × f_j    (6)


Figure 1. Structure of RBF neural network


The RBF neural network has a unique best-approximation property, which fundamentally
solves the local-optimum problem of the BP neural network. As a consequence, its convergence
is fast and its learning speed is thousands of times faster than that of the BP neural network.
The RBF neural network requires less time when dealing with large data sets and obtains
more accurate results [10]. Therefore, a collaborative filtering recommendation algorithm
based on item similarity learning combined with the RBF neural network is proposed in this paper.

3 Collaborative filtering recommendation algorithm based on item similarity learning

3.1 Algorithm Analysis

Since traditional similarity calculation methods rely on the rating data, the sparsity of the
rating data and the new-item-cold-start problem lead to inaccurate item similarity calculation.
To solve this problem, the algorithm proposed in this paper first calculates the first K most
similar items of each item as its initial neighbor set based on the item attribute similarity
measure, which does not use the rating data but only considers the attributes of the item.
As for item similarity learning, reference [8] studies item similarity by minimizing the squared
prediction error, an optimization problem solved by stochastic gradient descent. But this kind
of learning is slow and the recommendation effect is not ideal. In this paper, a collaborative
filtering recommendation algorithm based on the RBF neural network is proposed for this
problem. In the algorithm, the RBF neural network is used to learn the similarity between
each item and the items in its neighbor set.
A user's rating for an item reflects the user's preference for that item. Because of the
uncertainty of user preferences, a user's scoring of an item can be regarded as following a finite
probability distribution (a Gaussian distribution). Literature [8] does not consider
the distribution characteristics of the rating data, which makes its similarity learning
inaccurate. The hidden layer unit function of the RBF neural network is also called
the radial basis function; here the Gaussian function is adopted as the radial basis function (7):
 
f_j = exp( −‖x − c_j‖² / (2 b_j²) ),  j = 1, 2, …, m    (7)

Among them, bj is the width of the Gaussian Basis Function; m is the number of the
hidden layer nodes.
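To illustrate Eqs. (6) and (7), the following is a minimal NumPy sketch of an RBF network with Gaussian hidden units. The centre choice (centres = training samples) and the least-squares fit of the output weights are simplifying assumptions for illustration, not the paper's exact training procedure.

import numpy as np

def rbf_hidden(X, centers, width):
    # f_j(x) = exp(-||x - c_j||^2 / (2 * width^2)) for every sample/centre pair, Eq. (7)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def rbf_fit(X, y, width=1.0):
    # fit the output weights w of y = sum_j w_j * f_j (Eq. (6)) by least squares
    centers = X.copy()
    H = rbf_hidden(X, centers, width)
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centers, weights

def rbf_predict(X, centers, weights, width=1.0):
    return rbf_hidden(X, centers, width) @ weights

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X_train = rng.uniform(1, 5, size=(20, 3))   # e.g. ratings on 3 neighbour items
y_train = X_train.mean(axis=1)              # a made-up target to fit
centers, weights = rbf_fit(X_train, y_train, width=2.0)
print(rbf_predict(X_train[:2], centers, weights, width=2.0))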

3.2 Algorithm description

The steps of the collaborative filtering recommendation algorithm based on item similarity
learning can be described as follows:
a) Calculate the similarity matrix Simn×n of all items according to (4); it becomes Sim(n+1)×(n+1)
when a new item is added.
b) Select the top K most similar items of the target item ix as its neighbor set NK(ix) (0<x<n).
c) The rating vector P(ix) of the target item ix acts as the desired output, and the
rating vectors x1,…,xK of the K adjacent items of the target item ix are input into
the RBF neural network for learning item similarity. Finally the training model
net(ix) is obtained.
d) The rating vectors x1′,…,xK′ of the K adjacent items of the target item in the test
dataset are input into the training model net(ix), and the predicted rating
vector P′(ix) of the target item is output. The weights ωi (0<i<K) of the output layer are
the similarities between the target item and its K adjacent items.
e) The K items are sorted by similarity from large to small and are then recommended to
the user (a code sketch of this workflow is given after Figure 2).

Figure 2 outlines the workflow of the algorithm.


Figure 2. Workflow of the collaborative filtering recommendation algorithm based on item similarity learning
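As a concrete illustration of steps a)-e), the following is a minimal sketch for a single target item. It reuses the rbf_fit and rbf_predict helpers from the sketch in Section 3.1; the interpretation of the "score vectors" as per-user training pairs, and the names R_train, R_test and attr_sim, are assumptions for illustration, not the authors' exact implementation.

import numpy as np

def recommend_for_item(R_train, R_test, attr_sim, target, K=10, width=2.0):
    # a)-b) neighbour set N_K(i_x): the K most attribute-similar items (Eq. (4) matrix)
    order = np.argsort(attr_sim[target])[::-1]
    neighbours = [j for j in order if j != target][:K]

    # c) training pairs: users who rated the target; input = their ratings on
    #    the neighbours, expected output = their rating of the target item
    rated = R_train[:, target] > 0
    X_train = R_train[np.ix_(np.where(rated)[0], neighbours)]
    y_train = R_train[rated, target]
    centers, weights = rbf_fit(X_train, y_train, width)   # training model net(i_x)

    # d) feed the neighbours' ratings from the test set into the model to obtain
    #    the predicted rating vector P'(i_x); step e) would sort and recommend
    X_test = R_test[:, neighbours]
    return rbf_predict(X_test, centers, weights, width)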

4 Experimental results and analysis

4.1 Dataset

In the experiments, three datasets are adopted: MovieLens100k, MovieLens1M and
MovieLens10M [13]. Each user scores at least 20 films and the scores range from 1 to 5;
a higher score indicates a stronger preference of the user for the film. The MovieLens100k
dataset contains one hundred thousand ratings from 943 users on 1682 films, with a data
density of 6.3%; the MovieLens1M dataset contains one million ratings from 6040 users on
3952 films, with a density of 4.19%; the MovieLens10M dataset contains ten million ratings
from 72000 users on 10000 films, with a density of 1.39%.
The three datasets are relatively sparse and are thus suitable for the proposed algorithm.
In the experiments, five pairs of training and test sets, split from the original data in an
80%/20% ratio, are used for cross-validation, and the average of the five experiments is
taken as the final result.

4.2 Metrics

In the experiments, MAE (Mean Absolute Error), Precision and ROC (Receiver Operating
Characteristic) are used as the evaluation metrics. MAE measures the accuracy
of the prediction by calculating the degree of deviation between the predicted
score and the actual score. The relationship between MAE and precision is inverse:
the smaller the MAE value, the higher the recommendation accuracy. The formula is
shown as (8).

MAE = ( |
ri ,j ∈T
pi ,j - ri ,j |) / N (8)
(8)

Among them, p_{i,j} is the predicted score of an item, r_{i,j} is its actual score, T is the test set,
and N is the number of elements in the test set.
Precision is the ratio of the number of cases where the predicted score equals the actual score
in the test set to the total number of ratings in the test set; the calculation formula is shown
in (9), where |T| is the total number of ratings in the test set and |R| is the number of cases
where the predicted score equals the actual score. The higher the precision, the higher the
quality of the recommendation.

Precision = |R| / |T|    (9)

R = { r_{i,j} : r_{i,j} ∈ T, p_{i,j} = r_{i,j} }    (10)
R = rR ri ,j n∈ T, pi ,j = ri ,j 
OC:  i 1
(11)
(10)
R = ri ,j : ri ,j ∈ T, p
i ,j
v=i r  (10)
(10)
i ,j i ,j
i 1
n


n  ui
1, pi ,RjOC i  n r i ,j  4
u 4 and
i 1
(11)
ui   (12)
C0, ot her
RO  (11)
i 1
n
wi se vi
 v i
i 1


1, pi , j  4
i 1

i 1,p ,j  4 and r i ,j  4


v (13)
ui   i  0, ot her wi se (12)
330   Collaborative Filtering Recommendation Algorithm

ROC sensitivity refers to the proportion of randomly selected "good" items in the recommendation
list of the system. The ROC sensitivity [11] value ranges from 0 to 1; the greater the
value, the better the performance of the system. The formula is shown as (11).
ROC = Σ_{i=1}^{n} u_i / Σ_{i=1}^{n} v_i    (11)

u_i = 1, if p_{i,j} ≥ 4 and r_{i,j} ≥ 4;  u_i = 0, otherwise    (12)

v_i = 1, if p_{i,j} ≥ 4;  v_i = 0, otherwise    (13)
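A minimal sketch of the three metrics in Eqs. (8)-(13); the aligned lists of predicted and actual scores are illustrative inputs.

def mae(preds, actuals):
    return sum(abs(p - r) for p, r in zip(preds, actuals)) / len(actuals)            # Eq. (8)

def precision(preds, actuals):
    equal = sum(1 for p, r in zip(preds, actuals) if p == r)                         # |R| in Eq. (10)
    return equal / len(actuals)                                                      # Eq. (9)

def roc_sensitivity(preds, actuals, threshold=4):
    u = sum(1 for p, r in zip(preds, actuals) if p >= threshold and r >= threshold)  # Eq. (12)
    v = sum(1 for p, r in zip(preds, actuals) if p >= threshold)                     # Eq. (13)
    return u / v if v else 0.0                                                       # Eq. (11)

print(mae([4, 3, 5], [4, 2, 5]), precision([4, 3, 5], [4, 2, 5]), roc_sensitivity([4, 3, 5], [4, 2, 5]))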

4.3 Experimental results and analysis

In this paper, we design four groups of experiments. First, the traditional collaborative
filtering recommendation algorithm is compared across the item similarity measurement
methods (Adjusted cosine similarity, Pearson similarity, Jaccard similarity and item
attribute similarity). The proposed collaborative filtering algorithm is likewise compared
across these item similarity measures. Then experiments are run on the three datasets with
different degrees of sparsity to verify the efficiency of the proposed algorithm in dealing
with sparse data. Finally, some classical algorithms are compared with the proposed algorithm.
In this paper, abbreviations are used to represent the corresponding algorithms, as shown
in Table 2.

4.3.1 Traditional collaborative filtering recommendation algorithm


The traditional collaborative filtering algorithm is compared across the item similarity
measurement methods (Adjusted cosine similarity, Pearson similarity, Jaccard similarity and
item attribute similarity). The results are shown in Figure 3.
From Figure 3, we can see that MAE reaches a minimum value of 0.7654 for Adjusted
cosine based ICF, and that the MAE value decreases as K increases for Pearson
based ICF. For Jaccard based ICF, the MAE value first decreases and then increases,
reaching a minimum of 0.7538 when K is 100. The MAE value for IAS based ICF follows
the same pattern, reaching a minimum of 0.7326 when K is 300. From
the overall point of view, the MAE values for Jaccard based ICF and IAS based ICF are
smaller, while the MAE values for Adjusted cosine based ICF and Pearson based ICF are
relatively higher. For IAS based ICF, the MAE value increases with K when 300<K<500,
but tends to be stable when K>500. This is because, when K is large, the neighbor set
contains many items that do not have a high degree of similarity, resulting in a larger
error of the predicted score.

Table 2. Algorithm Abbreviation Table

Algorithm                                                                                        Abbreviated representation
Collaborative filtering algorithm based on adjusted cosine similarity                            Adjusted cosine based ICF
Collaborative filtering algorithm based on Pearson similarity                                    Pearson based ICF
Collaborative filtering algorithm based on Jaccard similarity                                    Jaccard based ICF
Collaborative filtering algorithm based on item attribute similarity                             IAS based ICF
Item similarity learning collaborative filtering algorithm based on adjusted cosine similarity   Adjusted cosine based ISL-CF
Item similarity learning collaborative filtering algorithm based on Pearson similarity           Pearson based ISL-CF
Item similarity learning collaborative filtering algorithm based on Jaccard similarity           Jaccard based ISL-CF
Item similarity learning collaborative filtering algorithm based on item attribute similarity    IAS based ISL-CF

Figure 3. Comparison of the MAE values of different item similarity measurement methods for the
traditional collaborative filtering recommendation algorithm

4.3.2 Collaborative filtering recommendation algorithm based on item similarity learning

The item similarity learning collaborative filtering algorithm is compared across the
item similarity measurement methods (Adjusted cosine similarity, Pearson similarity,
Jaccard similarity and item attribute similarity). The measured data are shown in Figure 4.
From Figure 4, we can see that the MAE value is reduced to a certain extent after
learning item similarity for all four methods. Among them, Pearson based ISL-CF
shows the largest reduction, down 8 percent. The MAE values of Jaccard based ISL-CF and
IAS based ISL-CF are smaller, and the MAE value of IAS based ISL-CF is the smallest,
at 0.7006. This is because item attribute similarity and Jaccard similarity do not
consider the rating data and only consider the attributes of the item; the rating data is
then taken into account in the learning process, which improves the accuracy of item
similarity computation.

Figure 4. Comparison of the MAE values of different item similarity measurement methods for the
proposed collaborative filtering recommendation algorithm

4.3.3 Experiments on datasets with different sparse degrees


Three datasets are adopted: MovieLens 100k, MovieLens 1M and MovieLens 10M.
Their data densities are 6.3%, 4.19% and 1.39% respectively, so their degrees of sparsity
are 93.7%, 95.81% and 98.61%. The item similarity learning collaborative filtering algorithm
based on item attribute similarity is run on the three datasets.
The experimental results are shown in Figure 5. From Figure 5 we can see that the
MAE value gradually increases as the degree of sparsity of the data increases. This shows
that the lower the degree of sparsity, the more abundant the rating information and the
more accurate the item similarity computation, eventually leading to a smaller error between
the predicted and actual scores. When the degree of sparsity increases, the speed of
reaching the smallest MAE value becomes
slower, that is, the corresponding K value becomes larger. This is because the higher the degree
of sparsity, the more rating information the item similarity computation needs,
resulting in a bigger neighborhood set. Moreover, when the degree of sparsity increases,
the rate of decline of MAE becomes faster. This shows that the item similarity learning
collaborative filtering algorithm based on item attribute similarity performs well
in dealing with sparse datasets, demonstrating the reliability of the algorithm.

Figure 5. Comparison chart of the MAE values of IAS based ISL-CF on three datasets

4.3.4 Compared with other classical algorithms


There are many classical recommendation algorithms, such as the matrix
decomposition (SVD) algorithm, the SVD++ algorithm and the SlopeOne algorithm. This
experiment compares the item similarity learning collaborative filtering algorithm based
on item attribute similarity with these classical algorithms on the MovieLens1M dataset.
For each of the four algorithms of this paper, the minimum MAE value is taken.
Figure 6 shows that, compared with those classical algorithms, the item similarity learning
collaborative filtering algorithm based on item attribute similarity obtains a smaller MAE
value, a higher precision and a higher ROC. The results show that the proposed algorithm
has higher accuracy and a better recommendation effect.

Figure 6. Comparison chart of different recommendation algorithms



5 Conclusion

Aiming at resolving the problems of sparse data and the cold-start of new items, this
paper proposed a collaborative filtering recommendation algorithm based on item
similarity learning. By calculating the initial neighbor sets of all items based on
the item attribute similarity measure, the proposed algorithm first solves the
new-item-cold-start problem. Then the RBF neural network is used to learn the
similarity between each item and the items in its adjacent set; by doing this, the data
sparsity problem is resolved and satisfactory similarities are obtained.
Although the improved algorithm achieves a good recommendation effect,
we intend to pursue further research in several aspects. Firstly, for users who are newly
added or have not scored any items, we will consider user tags and user browsing records
in the recommendation to avoid the new-user-cold-start problem. Secondly, we will optimize
the item similarity calculation to recommend more accurate results. Another interesting
topic is to implement the algorithm in a parallel environment.

Acknowledgment: This paper is supported by the National Natural Science Foundation of China (No. 61300169) and the Natural Science Foundation of the Education Department of Anhui Province (No. KJ2016A257).

References
[1] ZHU Yangyong and SUN Jing, "Recommender system: up to now," Journal of Frontiers of
Computer Science and Technology, vol.9, no.5, pp. 513-525, 2015.
[2] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl, “Item-based collaborative filtering
recommendation algorithms,” in Proceedings of the 10th international conference on World
Wide Web, Hong Kong, China, 2001, pp. 285-295.
[3] Y. Koren, R. Bell, and C. Volinsky, “Matrix factorization techniques for recommender systems,”
Computer, vol.42, no.8, pp.30-37, 2009.
[4] Hofmann T, “Latent semantic models for collaborative filtering,” ACM Transactions on
Information Systems, vol.22, no.1, pp. 89-115, 2004.
[5] X. Su and T. M. Khoshgoftaar, “Collaborative filtering for multi-class data using belief nets
algorithms,” in Tools with Artificial Intelligence, 2006. ICTAI’ 06. 18th IEEE International
Conference on. IEEE, 2006, pp. 497-504.
[6] AJB Chaney, DM Blei, and T Eliassi-Rad, “A Probabilistic Model for Using Social Networks
in Personalized Item Recommendation,” in Proceedings of the 9th ACM Conference on
Recommender Systems, Vienna, Austria, 2015, pp. 43-50.
[7] Zhang Lei, Chen Junliang, Meng Xiangwu, Shen Xiaoyan and Duan Kun, “BP Neural Networks-
Based Collaborative Filtering Recommendation Algorithm,” Journal of Beijing University of Posts
and Telecommunications, vol.32, no.6, pp. 0042-05, 2009.
[8] Feng Xie, Zhen Chen, Jiaxing Shang, Wenliang Huang and Jun Li, “Item Similarity Learning
Methods for Collaborative,” in Proceedings of the 29th IEEE International Conference on
Advanced Information Networking and Applications, Gwangju, Korea, 2015, pp. 896-903.

[9] Wang Peng, Wang Jingjing, and Yu Nenghai, “A Kernel and User-Based Collaborative Filtering
Recommendation Algorithm,” Journal of Computer Research and Development, vol.50, no.7, pp.
1444-1451, 2013.
[10] Liu Jinkun, Adaptive control of RBF neural network. Beijing: Tsinghua Press, 2014, pp. 227-280.
[11] Yigit M, Bilgin B E, and Karahoca A, “Extended topology based recomendation system for
unidirectional social networks,” Expert Systems with Applications, vol.42, no.7, pp. 3653–3661,
2015.
[12] Ren Kankan and Qian Xuezhong, “Research on User Similarity Measure Method in Collaborative
Filtering Algorithm,” Computer Engineering, vol.41, no.8, pp.18, 2015.
[13] GroupLens, MovieLens Data Sets. [Online]. Available: http://www.Grouplens.org.
[14] Zhang Feng and Chang Huiyou, “Employing BP Neural Networks to Alleviate the Sparsity Issue
in Collaborative Filtering Recommendation Algorithms,” Journal of Computer Research and
Development, vol.43, no.4, pp. 667-672, 2006.
Huo-wen JIANG*, Hai-ying MA, Xin-ai XU
The Graph Merge-Clustering Method based on Link
Density
Abstract: Clustering is an important method for analyzing and detecting data, and graph
clustering is a very important kind of feature-pattern clustering. To achieve good clustering
of undirected and unweighted graphs, this paper presents a merge-clustering
method based on the link density between nodes. We first define inter-cluster
edges, link-cluster edges and intra-cluster edges and give their computation
formulas. We then define a vital parameter indicating the link density between
nodes, and further give a merge-clustering method based on this link density. The
graph after clustering covers most of the information of the original graph, and
experiments show that the algorithm achieves high clustering quality.

Keywords: Graph clustering; link density; cluster link ratio; merge clustering

1 Introduction

Graphs are used extensively to model all kinds of social network structures. A graph maps the
entities of a real-world network to nodes, and the relationships
between entities to edges. In practical applications, most data objects
with interactions or mutual relations can be seen as a graph model in some sense.
Graph clustering partitions all the nodes of a graph into clusters (also known as classes,
or collections) based on some similarity principle, which can be deduced by analyzing
the intrinsic relationships between data objects using graph theory. In general, the
similarity among nodes within a cluster is high, while the similarity between nodes from
different clusters is small. At present, graph clustering has a wide range of applications.
For example, in literature co-citation analysis, graph hierarchical clustering can show
users successive layers of huge co-citation relationships in limited screen space,
helping users access valuable information. In the visualization of network topology,
graph clustering can avoid roughness at the level of interconnected autonomous systems
and excessive intricacy at the routing level by

*Corresponding author: Huo-wen JIANG, College of Mathematics & Computer Science, Jiangxi
Science & Technology Normal University, Nanchang, China, E-mail: jhw_604@163.com
Hai-ying MA, College of Computer Science and Technology, Nantong University, Nantong,China
Xin-ai XU, College of Mathematics & Computer Science, Nanchang Normal University, Nanchang,
China

using graph clustering; hence it is more conducive to the prediction of prefix growth
models and routing traffic analysis, and helps people design better protocols. In
the analysis of social networks, graph clustering helps to discover small worlds and
small groups [1], and thus helps people understand the characteristics of group
structure, and so on. There are many different graph clustering methods, of which
the more representative ones are: Markov clustering [2,3], spectral clustering [4], clustering
based on density [5], clustering based on search probability of similarity [6], and the
distributed clustering algorithm based on adjacent points of k layers [7]. In most existing
graph clustering methods, the similarity between two nodes is defined according to
the topological structure of their neighborhoods or their attribute values. This
paper also proposes a graph merge-clustering method on the basis of an investigation of
graph clustering problems. In our method, the link tightness among nodes is defined as
the ratio of the counts of inter-cluster and link-cluster edges, and all the nodes
of the original graph are clustered according to it into clusters consisting of at least
k nodes.

2 Related preliminaries about graph clustering

In general, a graph can be represented by a tuple G=(V,E), where V={v1,v2,…,vn} is the
set of nodes and E ⊆ V×V is the set of edges; |V| and |E| denote the numbers of
nodes and edges respectively. In a graph, every edge e∈E corresponds to a pair of nodes
(vi,vj), i,j∈V. If the node pairs (vi,vj) and (vj,vi) represent the same edge, the edge is
undirected; otherwise it is directed. When every edge in a graph is assigned
a weight value, the graph is called weighted; otherwise it is unweighted. The
graphs investigated in this paper are undirected and unweighted, and are assumed
to be simple graphs, in which there are no multiple edges and no loops, where
multiple edges refer to two or more edges between the same pair of nodes and
a loop refers to an edge linking a node to itself. Assume G′=(Vi,Ei) (i=1,2,…,m) is
the result of clustering graph G; it is subject to ∪_{i=1}^{m} V_i = V(G) and V_i ∩ V_j = ∅ (∀i,j ∈
{1,2,…,m}, i≠j). Obviously, this corresponds to a partition of all the nodes of the
original graph G. Here a sub-graph (Vi,Ei) is seen as a cluster Ci (i∈{1,2,…,m}). We
give several definitions as follows:

Def. 1 (inter-cluster edge): ∀Ci ∈ G′, an inter-cluster edge of Ci is an edge whose two
endpoints both lie in the same cluster Vi.

Def. 2 (link-cluster edge): ∀Ci ∈ G′, a link-cluster edge of Ci is an edge of which only one
endpoint lies in the cluster Vi and the other does not.

Def. 3 (intra-cluster edge): ∀Ci, Cj ∈ G′ (i≠j), an intra-cluster edge between Ci and Cj
is an edge whose two endpoints lie in Vi and Vj respectively.

Assume that the number of inter-cluster edges of Ci is denoted by I_Ci, the number of
link-cluster edges of Ci is denoted by X_Ci, the number of intra-cluster edges between
Ci and Cj is denoted by d(Ci, Cj), the degree of vertex v is denoted by deg(v), and the new
cluster obtained by merging Ci and Cj is denoted by Ci ∪ Cj. We can derive the
following formulas:

I_Ci = |E_Ci| ;
X_Ci = Σ_{v∈Ci} deg(v) − 2·I_Ci ;
I_{Ci∪Cj} = I_Ci + I_Cj + d(Ci, Cj) ;
X_{Ci∪Cj} = X_Ci + X_Cj − 2·d(Ci, Cj) .

Based on the above formulas, we further define a vital parameter:

LR(Ci) = I_Ci / X_Ci .

In general, graph clustering partitions nodes with relatively strong relationships, together
with the edges linking them, into a cluster that forms a sub-graph. It is thus expected
that nodes in the same sub-graph have higher connectivity, while the connectivity
between nodes in different sub-graphs is relatively low. It can be seen that LR(C)
comprehensively reflects the internal link tightness of a cluster and its connectivity
to the other clusters. If the LR(C) value of a cluster is high, the cluster has high internal
link tightness and relatively low external link tightness. So, at each merging step of
clustering, the two clusters Ci and Cj with the maximal value of LR(Ci ∪ Cj) are selected
and merged into a new cluster.
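To make the parameter concrete, the following is a minimal Python sketch that computes I_C, X_C and LR(C) for one cluster of an undirected, unweighted graph; the edge list is an illustrative assumption, not data from the paper.

def link_ratio(cluster, edges):
    cluster = set(cluster)
    inter = sum(1 for u, v in edges if u in cluster and v in cluster)     # I_C
    link = sum(1 for u, v in edges if (u in cluster) != (v in cluster))   # X_C
    return inter / link if link else float("inf")                          # LR(C) = I_C / X_C

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(link_ratio({1, 2, 3}, edges))   # 3 internal edges, 1 link edge -> 3.0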

3 The graph clustering algorithm based on link density

Moussiades et al. [8] proposed a graph clustering method based on the inter-connection
ratio, built around the idea of maximizing this parameter: the algorithm partitions the
graph according to the inter-connection ratio, merging the most closely related clusters
into a new cluster to obtain a good partition. This paper improves the method presented
by Moussiades et al. and proposes the Graph Clustering Algorithm based on Link Density,
called GCA-LD for short. Assume that S denotes the set of clusters containing fewer than
k nodes and that, at the beginning, every node is a singleton cluster. The main idea of the
algorithm is as follows: randomly select a cluster Ci from S, compute LR(Ci ∪ C) for each
other cluster C in S, select the cluster Cj with the maximal LR(Ci ∪ C) and merge it with Ci
into a new cluster Ci ∪ Cj, and repeat this merging procedure until all clusters contain at
least k nodes. The algorithm GCA-LD is given below in pseudo-code:

Algorithm 1: GCA-LD(G, k)
Input: graph G, a threshold value k;
Output: a set of clusters, each of which consists of at least k nodes.
1. Initialize every node vi as a singleton cluster Ci = {vi};
2. While there exists a cluster of size < k do
3.   Randomly choose a cluster C of size < k;
4.   Evaluate LR(C∪Ci) for every other cluster Ci of size < k;
5.   Find the cluster Ct with the maximum LR(C∪Ci) value;
6.   Merge C and Ct into one cluster;
7. End while
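Below is a minimal Python sketch of GCA-LD on an undirected, unweighted graph given as an edge list. The merge loop follows Algorithm 1; the tie-breaking, the fallback when no other small cluster remains, and the toy graph are simplifying assumptions, not the authors' implementation.

import random

def lr(cluster, edges):
    inter = sum(1 for u, v in edges if u in cluster and v in cluster)     # I_C
    link = sum(1 for u, v in edges if (u in cluster) != (v in cluster))   # X_C
    return inter / link if link else float("inf")

def gca_ld(nodes, edges, k):
    clusters = [{v} for v in nodes]                       # every node starts as a cluster
    while any(len(c) < k for c in clusters):
        small = [c for c in clusters if len(c) < k]
        c = random.choice(small)                          # step 3: pick a small cluster
        candidates = [d for d in clusters if d is not c and len(d) < k]   # step 4
        if not candidates:                                # no other small cluster left (assumption)
            candidates = [d for d in clusters if d is not c]
        if not candidates:                                # c is the only cluster left
            break
        best = max(candidates, key=lambda d: lr(c | d, edges))            # step 5
        clusters.remove(c)
        clusters.remove(best)
        clusters.append(c | best)                         # step 6: merge C and C_t
    return clusters

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
print(gca_ld(range(1, 7), edges, k=3))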

4 The experiment and analysis

To demonstrate that our algorithm is feasible, in this section we conduct experiments to analyze
the practical efficiency of GCA-LD and compare it with SCAN [5]. The data
used in our experiments come from the Political Blogs data set [9] (http://www-
personal.umich.edu/∼mejn/netdata/), a widely used data source for research on
graph clustering algorithms. The density of the clustering is introduced as a parameter for
evaluating clustering effectiveness; it is defined by the following formula:

density({V_i}_{i=1}^{m}) = Σ_{i=1}^{m} |{ (v_p, v_q) | v_p, v_q ∈ V_i, (v_p, v_q) ∈ E }| / |E| .
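A minimal Python sketch of this density measure; the edge list and clusters are illustrative inputs.

def clustering_density(clusters, edges):
    # count the edges whose two endpoints fall inside the same cluster
    internal = sum(1 for u, v in edges
                   if any(u in c and v in c for c in clusters))
    return internal / len(edges)

edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(clustering_density([{1, 2, 3}, {4, 5}], edges))   # 4 of 5 edges fall inside a cluster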

The experiments are carried out on the following hardware: an Intel Pentium dual-core
E2140 @ 1.60 GHz CPU and 1 GB of DDR memory. The algorithm is implemented in
Microsoft Visual C++ 7.0. In view of the fact that randomly picking an initial cluster may
lead to different results, each experiment is run 10 times and the final result of each
experiment is the average of the 10 outcomes.
We conducted two sets of experiments. The first experiment reveals the
effectiveness of the clustering density and how the density changes with k; the result
is shown in Figure 1. It can be seen from Figure 1 that the clustering density increases
with k. This can be explained as follows: the number of clusters decreases as k
increases, so the relationships between clusters are reduced, meaning that the total
number of intra-cluster edges decreases and accordingly the number of inter-cluster
edges increases, hence the clustering density becomes larger. Figure 1 also shows that
the density of GCA-LD is larger, though the densities of the two algorithms are close overall.
This reflects that our algorithm achieves better clustering quality. The second experiment
evaluates the runtime of our algorithm; Figure 2 shows the result. As
shown in Figure 2, the runtime increases with k for both SCAN and GCA-LD, and
the increase is more obvious for GCA-LD. This reflects that cluster size affects
the runtime of SCAN only slightly, while the runtime of GCA-LD grows noticeably, since
much more merging computation needs to be done as k increases.

Figure 1. Clustering density comparison

Figure 2. Runtime comparison

5 Conclusions

This paper investigates a graph clustering method based on the link density
between nodes. The method first treats each node as an initial cluster; at each step it
selects two clusters with fewer than k nodes to merge by the rule of maximizing the ratio of
inter-cluster edges to link-cluster edges, until every cluster contains at
least k nodes. According to existing work [10], two graphs may have distinct clusterings
even when they have (asymptotically) the same degree distribution, which is confirmed by
this investigation. Moreover, the algorithm proposed in this paper can satisfy the usual
needs of graph clustering. For some large-scale complex networks that require very high
quality graph clustering, it is necessary to further improve and explore this method.

Acknowledgement: This work was supported by the National Natural Science
Foundation of China under grants No. 71561013 and 61402244; the science & technology
research project of the Jiangxi Provincial Education Department under grant No.
GJJ151255; and the science & technology research project of Nanchang Normal
University under grant No. JGKT-15-27.

References
[1] Z. W. Jia, J. Cui, H. J. Yu. Graph-clustering method based on the Dissimilarity. Journal of ShangXi
agriculture university(natural science edition)2009,29(3):284-288.
[2] V. Satuluri, S. Parthasarathy. Scalable graph clustering using stochastic flows: applications
to community discovery.Proceedings of the 15th ACM SIGKDD international conference on
Knowledge discovery and data mining. ACM, 2009: 737-746.
[3] S. V. Dongen. Graph clustering by flow simulation. Utrecht: University of Utrecht, 2000.
[4] J. Shi, J. Malik. Normalized cuts and image segmentation. IEEE Transactions on pattern analysis
and machine intelligence, 2000, 22(8): 888-905.
[5] X. Xu, N. Yuruk, Z. Feng, et al. Scan: a structural clustering algorithm for networks. Proceedings
of the 13th ACM SIGKDD international conference on knowledge discovery and data mining.
ACM, 2007: 824-833.
[6] M. Kathy, S. Ambuj. Scalable discovery of best clusters on large graphs. The 36th International
conference on very large data bases, Singapore, 2010: 693 -702.
[7] H. C. Wang, J. Ma. Study of efficient clustering algorithm on large graphs. Journal of Chinese
computer systems. 2013,34(6):1417-1423.
[8] L. Moussiades, A. Vakali. Clustering dense graph: A web site graph paradigm. Information
Processing and Management, 2010,46(3):247-267.
[9] Y. Wu, Z. N. Zhong, W. Xiong, et al. An efficient method for attributed graph clustering. Chinese
Journal of computers. 2013,36(8):1704-1713.
[10] Y. L. Shang. distinct clusterings and characteristic path lengths in dynamic
small-world networks with identical limit degree distribution. Journal of Statistical
Physics.2012,149(3):505-518.
Qing-yun QIU, Jun-yong LUO, Mei-juan YIN*
Person Name Disambiguation by Distinguishing
the Importance of Features based on Topological
Distance
Abstract: Nowadays, finding information about people on the internet is more and more
popular. However, the results may be relevant to many namesakes, especially given
the explosive growth of internet data in the era of big data. To address the challenge
caused by name ambiguity in internet data, this paper proposes a framework of
name disambiguation for textual data. In the framework, we extract instances
of the people entity profile from documents based on Topological distance; a multi-level
clustering algorithm is then developed to cluster the instances of people entity
profile, in which a new measure, Closeness, is introduced to weigh the importance of
NE-type features for clustering, and each final cluster of people entity profiles represents
a unique people entity. Extensive experiments show that our method significantly
outperforms baseline approaches. Furthermore, our method can contribute greatly to
extracting people attributes and building a knowledge base of people in the era of big data.

Keywords: name disambiguation; people entity profile; connectivity strength; hierarchical clustering

1 Introduction

In the information age, people tend to look for information about a particular person
on the internet. However, due to the increasing volume of internet data, finding the
relevant information about a particular person becomes more and more difficult.
People information is often scattered across a variety of data sources, and names are
heavily ambiguous. For example, when Baidu Encyclopedia is queried with the person
name "Jun LI", there are up to 200 people with the queried name, including
famous actors, officials, professors and ordinary people. In order to find the information
about a particular person, it is necessary to conduct name disambiguation on
internet data.
Numerous approaches have been proposed for name disambiguation, which in general
fall into three categories. In [1-4], the authors developed clustering
techniques over rich extracted biographic features, such as gender, nationality,
*Corresponding author: Mei-juan YIN, State key Laboratory of Mathematical Engineering And
Advanced Computing, ZhengZhou,450000, China, E-mail: raindot_ymj@163.com
Qing-yun QIU, Jun-yong LUO, State key Laboratory of Mathematical Engineering And Advanced
Computing, ZhengZhou,450000, China
origin, date of birth, family relationships, address, title, etc. The approaches in
[5-7] and [8] apply hierarchical clustering over the similarity between two namesakes
weighted by the vector space model. In [9], Lili Jiang turned documents into a graph
based on the co-occurrence of features, and then graph clustering and partitioning were
used to realize name disambiguation.
However, when making use of named entity features to disambiguate people, none of the
above methods directly distinguishes the importance of individual features with respect to
the person. To solve this problem, we propose a framework of name disambiguation
for textual data. Different from the methods above, we compute the connectivity
strength between two namesakes by distinguishing the importance of named
entity features. As for the clustering algorithm, considering the diversity of features
and the fact that features are distributed unevenly in different documents, a three-
stage clustering is developed to cluster the documents.
The rest of this paper is organized as follows. In Section II, we describe the overall
framework and the methods used in each step. Section III presents the experimental
results. We conclude our paper and discuss some future work in Section IV.

2 Methodological framework

This paper discusses a method of personal name disambiguation for textual data.
A collection of documents about a person name collected from all kinds of data sources
is denoted by D={d1,d2,......dm}. Assuming that the namesakes within the same document
correspond to the same person entity, our approach to name disambiguation is
to group the documents into different clusters, such that each resulting cluster
corresponds to one people entity.
Figure 1 depicts the framework of this method. Based on the people entity profile,
we extract the instances of people entity profile from the original documents, and then
a clustering algorithm is developed to cluster the instances of people entity profile
to achieve the name disambiguation results. In this section, we describe the
framework in detail.


Figure 1. The method framework



2.1 People Entity Profile

In order to distinguish different namesakes in documents, we first need to create a model
for the document. In this paper, we propose a model called the people entity profile,
in which the features that contribute to name disambiguation are defined. Since
different people entities usually have different attributes, expressed in the form
of various types of named entities and bags of words, the people entity profile consists
of named entity features (NE-type features) and bag-of-words features (BOW-type
features).
Furthermore, NE-type features are divided into common named entities, network
identifications and extended named entities. The details of each are as follows:
–– Common named entity: person names, locations, and organizations, which can be
recognized by natural language processing tools.
–– Network identification: a person's network identifiers such as email address,
phone number and other accounts registered on network platforms.
–– Extended named entity: a person's works and honorary titles, which are marked by
special punctuation, such as double quotation marks and book title marks in Chinese text.

With the help of the people entity model, a document can be represented as a people
entity profile = {E, K}, where E is the NE-type feature set and K is the BOW-type
feature set.

2.2 Extracting Instances of People Entity Profile Based on the Topological distance

2.2.1 Text preprocessing


Firstly, we turn the documents into uniform strings through preprocessing, including
removing noisy tags and irrelevant characters. Further, to reduce the computational
cost of name disambiguation, it is necessary to limit the context range around each
occurrence of the given person name. Traditional methods based on single
characters easily produce semantic debris. To solve this problem, we first carry out
text segmentation, POS tagging and named entity recognition with natural
language processing tools, while taking advantage of regular-expression matching to recognize
extended named entities. Thus, the document is decomposed into a series of terms with
independent semantics. We now define the topological list.
Definition 1 (topological list): A topological list is composed of a series of linearly
adjacent nodes which correspond to the terms with independent semantics generated by
document preprocessing.
The distance between two adjacent elements in the topological list is defined as 1.
The process of generating a topological list is shown in Figure 2 below.

(In Figure 2, the example sentence "Yong JIN comes from Hong Kong, China" is processed by LTP and regular expressions into a topological list.)
Figure 2. Topological list

2.2.2 Extracting people entity profile from documents


In the following, we first give some definitions, then introduce how to extract the
instances of people entity profile from documents.
Definition 2 (Topological distance): The Topological distance between a feature f and
the given name pn, denoted by TD(f, pn), is defined as the smallest distance at which
they appear in the topological list.
The people entity profile is then extracted to represent a document based on
Topological distance, containing NE-type features and BOW-type features. Assume
that the window thresholds of NE-type features and BOW-type features are denoted
by WE and WK respectively, and that fei and fkj indicate an NE-type element and a BOW-type
element within the document respectively. The NE-type set E and BOW-type set K
defined in the people entity profile are extracted according to formulas (2) and (3) below:

E = { fe_i | TD(fe_i, pn) < WE, i = 1, 2, … }    (2)

K = { fk_j | TD(fk_j, pn) < WK, j = 1, 2, … }    (3)
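A minimal Python sketch of Eqs. (2) and (3): keep the features whose topological distance to the queried name is below the window threshold. The token list and the feature lists are illustrative assumptions.

def topological_distance(tokens, feature, name):
    # smallest index gap between a feature token and the queried name (Definition 2)
    f_pos = [i for i, t in enumerate(tokens) if t == feature]
    n_pos = [i for i, t in enumerate(tokens) if t == name]
    return min(abs(i - j) for i in f_pos for j in n_pos)

def extract_profile(tokens, ne_features, bow_features, name, we, wk):
    E = [f for f in ne_features if topological_distance(tokens, f, name) < we]    # Eq. (2)
    K = [f for f in bow_features if topological_distance(tokens, f, name) < wk]   # Eq. (3)
    return E, K

tokens = ["Yong JIN", "comes", "from", "Hong Kong", ",", "China"]
print(extract_profile(tokens, ["Hong Kong", "China"], ["comes"], "Yong JIN", we=5, wk=2))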

2.3 Three-stage Clustering
Ž‘•‡‡••ሺ ’ሻൌ ሺͷሻ
After extracting the instances of people entity profile, the original document set is
changed into the set of instances of the people entity model, denoted by PEPD =
{pep1,pep2,...,pepM}, where pepi is made up of an NE-type set and a BOW-type set. According
to the different contributions of features to name disambiguation, a three-stage
clustering algorithm is developed to cluster the set of instances of people entity profile.

2.3.1 Clustering based on personal titles

A complete personal title consists of an organization name and a title keyword. In
the example of "the professor of Tsinghua University", "Tsinghua University" is an
organization name, and "professor" is a title keyword. In general, a person with the
same organization name and title keyword is unique. Therefore, the first-stage clustering
is based on personal titles within the people entity profile.

2.3.2 Clustering based on NE-type features


Considering that the many named entities of various types around the queried
name, including personal names, locations, organizations, etc., are all likely to be
closely associated with the queried name, we do not distinguish between the
various types of named entities. We consider that there is a high possibility that two
queried names correspond to the same person if they share more important named
entities. Next, we define the Connectivity Strength between two namesakes to
weigh this possibility.
Connectivity Strength: the vector space model (VSM) is usually used to calculate
the similarity between objects, but when a large number of irrelevant named entities
surround the queried name, the VSM may yield a small similarity between
namesakes, making it difficult to achieve a good name disambiguation effect. To
this end, we weigh the connectivity strength between two namesakes by measuring
the connectivity strength between the named entity sets Ei and Ej within their people entity
profiles. If the connectivity strength exceeds a certain threshold, the two namesakes are
considered to belong to the same person entity. The Connectivity Strength between Ei
and Ej is defined in formula (4) below.
and Ej is defined in formula (4) below.

CS(Ei, Ej) = Σ_{m=1}^{k} Closeness(fe_m, pn)    (4)

where fe_1, …, fe_k are the common entities between Ei and Ej, and Closeness(fe_m, pn) is the
closeness between fe_m and pn. This paper proposes a new measure Closeness(fe_m, pn),
which considers that the smaller the Topological distance between a named entity and the
queried name, the closer they are. To avoid some irrelevant named entities
near the queried name being mistaken for important features, we also take the number of
documents where the entity appears into consideration, so that the smaller the ratio
of the sum of TD(fe, pn) over multiple documents to the number of instances of people
entity profile containing the entity, the closer the entity is related to the queried name.
Closeness(fe, pn) is defined in formula (5), where N is the number of instances
of people entity profile that contain the entity fe and TD(fe, pn)_i is the Topological
distance between fe and pn in the i-th document.

Closeness(fe, pn) = N / Σ_{i=1}^{N} TD(fe, pn)_i    (5)

Combining (4) and (5), based on the results of the first-stage clustering, a single-link
hierarchical clustering is employed using NE-type features. The optimal threshold λ1
is acquired through training data.
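A minimal Python sketch of Eqs. (4) and (5) as reconstructed above; the dictionary mapping each shared entity to its per-document topological distances, and the toy entity sets, are illustrative assumptions.

def closeness(distances):
    # Closeness(fe, pn) = N / sum_i TD(fe, pn)_i over the N documents containing fe, Eq. (5)
    return len(distances) / sum(distances)

def connectivity_strength(entities_i, entities_j, td_per_doc):
    # CS(Ei, Ej) = sum of Closeness over the entities shared by Ei and Ej, Eq. (4)
    shared = set(entities_i) & set(entities_j)
    return sum(closeness(td_per_doc[fe]) for fe in shared)

td_per_doc = {"Tsinghua University": [2, 4, 3], "Beijing": [10, 8]}
print(connectivity_strength({"Tsinghua University", "Beijing"},
                            {"Tsinghua University", "professor"}, td_per_doc))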
2.3.3 Clustering based on BOW-type features

BOW-type features around a personal name can reflect many aspects of a person to
some extent. Therefore, the more similar two BOW-type feature sets are, the more likely
two namesakes correspond to the same person. Since the method based on vector

space model constructed with TF*IDF weights has an excellent effect on document
similarity, we adopt it to weigh the similarity of two namesakes. Lastly, a
single-link hierarchical clustering is carried out using the BOW-type features. The optimal
threshold λ2 is acquired through training data.
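A minimal sketch of the third-stage idea: weigh BOW-type features with TF*IDF, compare two profiles with cosine similarity, and link them single-link style when the similarity exceeds λ2. The toy documents are illustrative assumptions, not the authors' data or exact implementation.

import math
from collections import Counter

def tfidf_vectors(docs):
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    return [{w: tf * math.log(n / df[w]) for w, tf in Counter(doc).items()} for doc in docs]

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [["professor", "university", "paper"],
        ["university", "professor", "lecture"],
        ["actor", "movie", "award"]]
vecs = tfidf_vectors(docs)
for i in range(len(docs)):
    for j in range(i + 1, len(docs)):
        # pairs whose similarity exceeds lambda_2 would be merged single-link style
        print(i, j, round(cosine(vecs[i], vecs[j]), 3))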

3 Experimental Study

3.1 Data Sets

In this paper, we evaluate our framework using three different data sets. The first
dataset is Sogou news, which is labeled by the Sogou corporation. The other two datasets,
Baidu Encyclopedia and web pages, are private datasets labeled by the authors. The details
of the datasets are shown in Table 1 below. Extensive experiments were then conducted
on a mixture of the three data sets to verify the effectiveness of the method.

Table 1. The number of documents about three kinds of corpus

Name/corpus type Sogou news Baidu encyclopedia Pages Sum

Jing Li 583 87 193 863

Jun Li 222 192 181 595

Ming Li 391 157 189 737

Li Li 71 44 187 302

Wei Zhang 246 178 189 613

Yan Zhang 60 47 193 300

Ping Wang 225 138 111 474

Lei Wang 770 97 140 1007

Wei Wang 304 161 188 653

Yong Wang 249 162 189 600

Li Wang 89 64 187 340

3.2 Evaluation Measures

1. The relation between the sizes of WE and WK and the name disambiguation performance.
2. The paper adopts Purity and Inverse Purity to assess the accuracy rate, recall rate
and comprehensive evaluation of the method, denoted by Pur, InvP and F value.
The details are as follows:
Accuracy:

Pur = Σ_i ( |S_i| / n ) · max_j ( |S_i ∩ R_j| / |S_i| )    (6)

Recall:

InvP = Σ_j ( |R_j| / n ) · max_i ( |R_j ∩ S_i| / |R_j| )    (7)

F value:

F = 1 / ( α / Pur + (1 − α) / InvP )    (8)

where n is the total number of documents.

We set α = 0.5, taking F_{α=0.5} as a comprehensive assessment of precision and recall,
where S = {S1,S2,…} is the clustering generated by the experiment and R = {R1,R2,…} is the
standard clustering labeled by manual work (a small sketch of these measures is given after this list).
3. The contribution of personal title, NE-type features and BOW-type features to
name disambiguation.
4. The superiority of the method compared with baseline method.
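A minimal Python sketch of the purity-based measures in Eqs. (6)-(8) as reconstructed above; S is the system clustering and R the manually labeled clustering, each a list of document-id sets, and the toy data are illustrative assumptions.

def purity(S, R):
    # sum over clusters of the best overlap with a reference cluster, normalised by n
    n = sum(len(s) for s in S)
    return sum(max(len(s & r) for r in R) for s in S) / n

def f_measure(S, R, alpha=0.5):
    pur, invp = purity(S, R), purity(R, S)   # inverse purity swaps the two roles
    return 1.0 / (alpha / pur + (1 - alpha) / invp)

S = [{1, 2, 3}, {4, 5}]
R = [{1, 2}, {3, 4, 5}]
print(purity(S, R), purity(R, S), f_measure(S, R))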

3.3 Analysis of Results

Firstly, we need to select proper sizes for WE and WK. To this end, we select five
person names to explore the relation between the F value and WE as well as WK. The
results are shown in Figure 3 and Figure 4 below. We can see that, as WE and
WK increase, the F value of name disambiguation increases with some fluctuation, but
once they exceed a certain threshold, the effect stays at a stable value.

Figure 3. The results of relation between F value and WE



Figure 4. The results of relation between F value and WK

Based on the results in Figure 3 and Figure 4, we selected WE = 500 and WK = 100 as
the window sizes. Through a large number of experiments on the Jing Li training set, with λ1 = 0.10
and λ2 = 0.50, we obtain an accuracy rate of 93.6%, a recall rate of 88.3% and an F value
of 90.9%. Next, 10 other person names were used as the test set to assess the effectiveness
of the proposed method. The experimental results are shown in Table 2 below:

Table 2. The results of accuracy rate,recall rate, and f value of the method

Name Pur (%) InvP (%) F (%)

Jun Li 89.2 83.3 86.2


Ming Li 91.3 88.2 89.7
Li Li 100 94.4 97.1
Wei Zhang 94.7 82.9 88.4
Yan Zhang 90.0 81.7 85.6
Lei Wang 98.7 85.2 91.5
Wei Wang 88.2 86.8 87.5
Yong Wang 96.8 86.7 91.5
Ping Wang 98.2 88.0 92.8
Li Wang 96.6 88.7 92.5

Extensive results on multiple names show the proposed framework can obtain a
stable and excellent effect on name disambiguation.

3.3.1 Feature analysis


F values were recorded at each stage of clustering, and the results are shown in
Figure 5 below.

Figure 5. The results of feature analysis

As the results indicate, the personal title, NE-type features and BOW-type features all
make a contribution to name disambiguation, showing that the people entity profile
proposed in this paper is effective.
We used TF-IDF[5] as the baseline method. TF-IDF employs the vector space
model in clustering based on the feature frequency and inverse document frequency.
The experimental results are presented in Figure 6.

Figure 6. Comparison results of two methods

As the results show, the proposed method makes an improvement over the method
TF*IDF in name disambiguation.

4 Conclusion and future work

This paper proposes a novel framework to disambiguate personal names in textual


data from internet. The people entity profile is devised to model the document, and
Topological distance model is defined to extract instances of people entity profile and
weight the importance of NE-type features. Lastly, a three-stage clustering algorithm
is developed to cluster all documents for each person entity, which effectively
solves the problem of name disambiguation for textual data. Specially, the method
takes features weight into account to meature the importance of NE-type features.
Furthermore, an extensive performance study was performed using several datasets
and the proposed approach outperforms the basic methods.
In the future, we have two main research interests. One is the extraction of
features related to people. As we know, most natural language processing tools, such as the
ICTCLAS tool used in this paper, were trained on news corpora, yet internet data is noisier.
Thus, specific extraction models are needed for more accurate feature extraction from the
variety of textual data on the internet. Secondly, in the process of weighing the connectivity
strength between two namesakes using NE-type features, the same entity can appear in
diverse forms in different documents; for example, "BeiDa" and "Peking University"
correspond to the same organization entity, which makes the computed connectivity strength
between two namesakes inaccurate.

References
[1] Anna Lisa Gentile, Ziqi Zhang, Lei Xia, and José Iria. “Graph-based semantic relatedness for
named entity disambiguation”, 2009.
[2] Li Li, “Chinese personal name disambiguation based on attribute information”, 2012.
[3] Man LAN, Yuzhe ZHANG, Yue LU, Jian Su, and Chew Lim TAN, “Which Who are They? People
attribute extraction and disambiguation in the web search results”, 2010.
[4] Feifei ZHANG, Zonghai LI, Xiaohui ZHOU, and Xiaoge Li, “Cross-document Chinese personal
name entity disambiguation based on hierarchical clustering”, Computer Engineering &
Applications, 2015.
[5] YuChuan Wei, Ming-Shun Lin, and Hsin-Hsi Chen, “Name disambiguation in person information
mining”,2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main
Conference Proceedings)(WI’06), pp.378-381, 2009.
[6] Huizhen Wang, Haibo Ding, Yingchao Shi, Ji Ma, Xiao Zhou, and Jingbo Zhu, “A multi-stage
clustering framework for Chinese personal name disambiguation”, 2010.
[7] Masaki Ikeda, Shingo Ono, Issei Sato, Minoru Yoshida, and Hiroshi Nakagawa, “Person name
disambiguation on the web by two-stage clustering”, 2009.
[8] LANG Jun, QIN Bing, SONG Wei, LIU Long, LIU Ting, LI Sheng, “Person name disambiguation of
searching results using social network”, Chinese Journal of Computers, 32(7):1365-1374,2009.
[9] Lili Jiang, Jianyong Wang, Ning An, Shengyuan Wang, Jian Zhan, and Lian Li, “Grape-A
graph-based framework for disambiguating people appearances in web search”, 2009 Ninth
IEEE International Conference on Data Mining, pp.199-208, 2009.
Bo HU, Yu-kun JIN, Jun LIU, Ai-jun FAN, Hong-bo MA*, Chong CHEN
A Security Technology Solution for Power Interactive
Software Based on WeChat
Abstract: This paper introduces a security scheme for instant-messaging interaction software for electric power customers based on the WeChat platform service, covering system security, network security and data security. To strengthen the security of information interaction between the inner and outer networks, we use data network storage, application deployment, isolation equipment between the inner and outer networks, information encryption methods, and so on. Under the premise of guaranteed safety, an interactive service channel is constructed with the help of mobile Internet technology, providing customers with a service experience of diverse information, rapid response, easy operation, and flexible interaction, and improving the speed and scope of electricity information transmission so as to improve user satisfaction with the electric power enterprise's services.

Keywords: WeChat; power system; security technology; network

1 Introduction

WeChat is a mobile phone application with high penetration. It is cross-platform and cross-network, and it can send voice, text, video, and images free of charge. In terms of information interaction, it is economical, convenient, fast, intuitive, and easy to share. The instant-messaging interaction software for electric power customers based on the WeChat platform service provides power charge enquiry, power outage notices, electricity bills, electricity purchasing, business consulting, and other functions, and fully considers convenience, timeliness, operability, and interactive information security requirements in the design of interactive customer service. Users only need to install the WeChat client and follow the electric power enterprise's service account, without having to install other programs. Voice input is supported, which simplifies client text input. An integrated intelligent robot and knowledge base respond quickly to user needs. Data network storage, application deployment, isolation equipment between the inner and outer networks, and information encryption methods are used to strengthen the security of information interaction between the inner and outer networks. Based on the mature interface of the WeChat

*Corresponding author: Hong-bo MA, NARI Group Corporation (State Grid Electric Power Research
Institute), Beijing Kedong Electric Power Control System Co, Ltd, Beijing, China,
E-mail: mahongbo2@sgepri.sgcc.com.cn
Bo HU, Yu-kun JIN, Jun LIU, State Grid Liaoning electric power supply Co., Ltd., Anshan, Liaoning, China
Ai-jun FAN, Chong CHEN, NARI Group Corporation (State Grid Electric Power Research Institute),
Beijing Kedong Electric Power Control System Co, Ltd, Beijing, China

public platform, the integration with the WeChat platform is realized, making full use of existing secondary development methods such as intelligent recognition and robot technology to shorten the software development cycle and to reduce development and maintenance costs.
Relying on its scientific and technological project, the Anshan power supply company spearheaded the development of the instant-communication interactive services software and the construction of an interactive service channel based on WeChat with the help of mobile Internet technology, providing customers with a service experience of diverse information, rapid response, easy operation, and flexible interaction, improving the speed and scope of electricity information transmission, and improving user satisfaction with the electric power enterprise's services.
This article explains the security technology from three aspects:
1. System security design.
2. Network security design.
3. Data security design.

2 Related work

Network isolation and intrusion detection are two security technologies commonly used in network security. When network resources communicate with the Internet [1], secure access technology is needed to protect internal data security; it provides security for the whole internal network and its network connections.

2.1 Network isolation

Network isolation technology realizes information exchange and resource sharing among two or more computers or networks while they remain disconnected; that is, it keeps two networks physically isolated [2] while still allowing safe data exchange between them. Its purpose is to block harmful attacks, complete the secure exchange of data between networks while the untrusted network remains outside, and ensure that information inside the trusted network will not leak [3].

2.2 Intrusion detection

Intrusion detection is the detection of intrusion behavior. By collecting and analyzing network behavior, security logs, audit data, and other information available on the network, it obtains key information about a computer system and checks whether the network or system shows violations of the security policy or signs of being attacked [4].

2.3 Security access technology

A border access platform is an IT network system that provides Internet access for the internal network; it provides security for the whole internal network and its network connections. Internal users can access the Internet, and external personnel can access part of the internal network resources.

3 Security design

Topology of the network is as follows:

Figure 1. Network Topology

Figure 1 shows the topological logic diagram of the physical devices. The software needs to transmit data and information across the Internet, the intranet, and the outer network. Clients use mobile devices to connect via the Internet, cross the firewall, and connect to the WeChat server cluster deployed in the information network. For data operations, the software accesses the information network database for data queries and uses the isolation device between the information intranet and the outside information network, ensuring information security. Throughout this process, a variety of methods such as security isolation, intrusion detection, and secure access are used to ensure that the system is not exposed to internal or external security threats and to maintain the security of the entire software system, the network, and the data.

3.1 Security architecture design

The security architecture follows the overall information security strategy of the State Grid Corporation (SGC) and meets the requirements of information security, while also paying attention to operation security, safety management, and safety protection measures to prevent the spread of security risks [5].
The safety system design covers three aspects: system security, network security, and data security, and uses five types of specific methods (protection, detection, response, recovery, and improvement) to carry out the safety measures, as shown in Figure 2.

Figure 2. Security architecture



3.2 Network security design

According to the network structure characteristics of the WeChat-based interactive services software, the network layer security measures are designed in accordance with requirements such as structure security, access control, security audit, boundary integrity checking, intrusion prevention, and network equipment protection; the overall protective structure is shown in Figure 3.

Figure 3. Network security protection architecture

3.2.1 Network isolation measures


Firewall security technology is used to achieve security isolation between different networks [6]. According to the safety isolation requirements of the WeChat interactive services, a security access platform is used between the mobile Internet and the information network, and a network security isolation device realizes safety isolation for information exchanged between the intranet and the outside information network.

Network equipment safety measures


Access control means (such as controlling access to the Console, AUX, Telnet, and SNMP interfaces of the equipment) are used to enhance the security of the network devices themselves. The specific measures include:
a) Prevent illegal access through the Console
Set a non-privileged password for lower-privilege network management personnel; set a privileged password so that only with the privileged password can one log in to routers and switches and modify the device configuration; use password encryption to avoid transmitting the password in plaintext over the network and to avoid displaying the password in configuration files; set a timeout limit to prevent trespassing when the manager leaves the terminal configuration window open; place equipment in locked cabinets, with switches and routers locked inside, to keep unauthorized personnel away from them.
b) Prevent illegal Telnet access
Set a password on the Telnet virtual terminal ports so that only the correct password allows login to the network equipment; set ACLs to control the addresses and subnets from which Telnet is allowed; set up access control server (ACS) authentication to control Telnet login operations on the network equipment.
c) Prevent illegal SNMP access
Set access-list and access-class rules to limit the IP addresses of network management equipment, allowing SNMP access only from the specified network management workstations; set no service tcp-small-servers, no service finger, and access lists to restrict TCP access to the Cisco network equipment; set the SNMP management password (community string) so that only a management station with the legitimate password can manage the equipment.
d) Prevent illegal remote login through the AUX port
Set a password on the AUX port and ensure that no phone line is connected to the AUX port; a physical connection is established only when remote maintenance of the router is required.
e) Other protective measures
Modify the login banner configuration to hide the true information of the router system and prevent information leakage; use the enable secret configuration to encrypt the privileged password; configure the timeout function on the vty and console lines to increase system access security; configure vty port access lists; configure secure vty access methods such as SSH; configure user authentication methods; configure AAA to increase user access security; and configure AAA accounting of commands to enhance system access security.

3.2.2 Network intrusion detection


Network intrusion detection provides real-time intrusion detection and corresponding protective measures; it can discover illegal access, block network connections, detect internal unauthorized access, and so on, and it can also find more covert attacks.
The specific measures of network intrusion detection are as follows:
a) Real-time intrusion detection: comprehensively detect possible intrusion behaviors; identify, block, and weaken attack behaviors in a timely manner; record them in detail; and generate intrusion detection reports and warnings;
b) Carry out multiple levels of scanning as needed, configuring multiple scans according to specific requirements on time, breadth, and granularity;
c) Detection and scanning must not affect normal network connection services or network efficiency;
d) The detection signature library should be comprehensive and updated in a timely manner;
e) The security detection strategy, detection intensity, and degree of risk management can be set by the user, who can choose the corresponding detection strategy according to different needs [7].

3.3 System security design

System security mainly concerns the security reinforcement of the system server environment, with specific security measures such as firewalls and intrusion detection.

3.3.1 Software and hardware environment security reinforcement


The software and hardware environment of the WeChat interactive services software (servers, databases, middleware, etc.) must meet the safety requirements of the State Grid Corporation at the time of purchase, and must undergo security configuration and security reinforcement before being put into operation. During operation, access control and security audit are under unified management.

3.3.2 Database security reinforcement


Database security reinforcement is achieved using security zone settings and access control measures.
The application server and database server are deployed in the State Grid Corporation information intranet environment, and the WeChat application can obtain data services only by accessing the application server through the safety isolation device, which prevents the WeChat application from bypassing the application services and accessing the database directly.
A database server resource management strategy is used to prevent excessive use of server resources by certain users for specific business operations. The multiple service modules of the system should correspond to multiple database users. To achieve the resource management function, the security features of the database software grant each module's database users access privileges to the business data tables, views, stored procedures, and other database objects.

3.4 Data security design

Business data is the basis of the normal operation of the system. The confidentiality, integrity, and availability of business data must be ensured in the process of transmission, processing, and storage.

Data security measures mainly include database login authentication, data


transmission encryption and data access control.

3.4.1 Database login authentication


To log in to the database, the user provides username and password credentials; the database management system authenticates the login, and only after passing authentication can the user access the information in the database.
Authentication is the process of identifying and verifying a client, where the client may be an end user, a service, a process, or a computer; a client that passes authentication is known as a principal. Authentication can occur across multiple layers of an application. WeChat users are first authenticated by the front-end application, mainly by username and password; then the user's requests pass through the security isolation devices, and the background application validates the user's permissions.

3.4.2 Data transmission encryption


To ensure the confidentiality and security of the WeChat application data, the RSA and DES encryption algorithms are used to encrypt data transmitted from the Internet to the outside information network, while data transmitted between the outside information network and the information intranet is in plaintext [8]. For the safety of the transmission channel, a dedicated CDMA/GPRS channel is used from the Internet to the outside information network, combined with HTTPS (Secure Hypertext Transfer Protocol) for interactive data transmission, and the safety isolation device is used for data transmission between the outside information network and the information intranet.
Encryption procedure is as follows:

Figure 4. Data encryption process
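As a minimal sketch of the hybrid RSA/DES scheme described above (not the system's actual implementation), the following Python code, assuming the PyCryptodome library, wraps a random DES session key with RSA and uses the DES key to encrypt the payload transmitted between the Internet and the information network; all names are illustrative.

from Crypto.PublicKey import RSA
from Crypto.Cipher import DES, PKCS1_OAEP
from Crypto.Random import get_random_bytes
from Crypto.Util.Padding import pad, unpad

def encrypt_message(plaintext: bytes, receiver_pubkey: RSA.RsaKey):
    des_key = get_random_bytes(8)                      # DES uses a 64-bit key
    iv = get_random_bytes(8)                           # DES block size is 8 bytes
    des = DES.new(des_key, DES.MODE_CBC, iv)
    ciphertext = des.encrypt(pad(plaintext, DES.block_size))
    # Protect the DES session key with the receiver's RSA public key
    wrapped_key = PKCS1_OAEP.new(receiver_pubkey).encrypt(des_key)
    return wrapped_key, iv, ciphertext

def decrypt_message(wrapped_key, iv, ciphertext, receiver_privkey: RSA.RsaKey):
    des_key = PKCS1_OAEP.new(receiver_privkey).decrypt(wrapped_key)
    des = DES.new(des_key, DES.MODE_CBC, iv)
    return unpad(des.decrypt(ciphertext), DES.block_size)

# Example usage
key_pair = RSA.generate(2048)
wk, iv, ct = encrypt_message(b"meter reading: 1234 kWh", key_pair.publickey())
print(decrypt_message(wk, iv, ct, key_pair))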



4 Security test

4.1 Network security test

Test purpose: The ability to isolate illegal access

Test method: 1. Attempt illegal access to inner information network equipment through the console
2. Attempt illegal access to inner information network equipment through Telnet
3. Attempt illegal access to inner information network equipment through SNMP
4. Attempt illegal login to inner information network equipment through the AUX port

Test result: 1. Illegal access through the console failed
2. Illegal access through Telnet failed
3. Illegal access through SNMP failed
4. Illegal access through the AUX port failed

4.2 System security test

Test purpose: System access control and security audit capability, and database management functions

Test method: 1. Access the system normally and illegally, and check whether illegal access is isolated and whether normal access is audited
2. Attempt to access data belonging to other business modules in the database

Test result: 1. The system isolates illegal access and audits normal access
2. Users can only access the data of their specific business module and cannot access the data of other business modules

4.3 Data security test

Test purpose: Database identity authentication capability and data transmission encryption

Test method: 1. Attempt to log in to the database system by brute force
2. Intercept data packets and try to parse the data to obtain sensitive information

Test result: 1. The database identity authentication could not be broken
2. The encrypted data could not be parsed and no sensitive data could be obtained

5 Conclusion

This paper provides a security architecture for power system software. From the points of view of system security, network security, and data security, security isolation, intrusion detection, and other security technologies are used to protect the information transmission channels among the Internet, the outside information network, and the information intranet, and to ensure that intranet data will not be leaked or otherwise compromised. Although the scheme is designed for the power industry, the methods and technologies used can also serve as a reference for other industries.

References
[1] Tang J G, Zhang S J, Jiang J. Research on Network Security Issues and Security Model[J]. Applied
Mechanics & Materials, 2014, 519-520:128-131.
[2] Ruan J, Zhang P, Ding H B. Network Security-Related Technology Research and Implementation[J]. Applied Mechanics & Materials, 2013, 433-435:1720-1723.
[3] Zhang D G, Wu Y, Zhang W B, et al. The Design of a Physical Network Isolation System[J]. Applied
Mechanics & Materials, 2014, 687-691:2192-2195.
[4] Wu K, Zhang T, Li W. Research and Design of Security Defense Model in Power Grid Enterprise
Information System[C]// International Conference on Multimedia Technology. IEEE, 2010:1-4.
[5] Xiaoli G, Hui W. Research on the Network Security Situation Awareness Model for the Electric
Power Industry Internal and Boundary Network[J]. Journal of Applied Sciences, 2013, 13(16).
[6] Yang Luming, Xiao Xiao. Network security and firewall technology[J]. Computer And Information Technology, 2004, 03.
[7] Zhang Mingqing, Wang Jindong, Han Jihong. Information system security technology strategy research. Computer Application Research, 2001, 05.
[8] Xia Xiaozhu, Chen Hui. Power grid construction safety management and control based on the WeChat platform. China New Technology And New Products, 2016, 21.
Miao FAN*, Jia-min MAO, Jao-gui DING, Wei-feng LI
Two-microphones Speech Separation Using
Generalized Gaussian Mixture Model
Abstract: In this paper we present a novel spatial speech separation scheme by using
two microphones. The technique utilizes the estimation of interaural time difference
(ITD) statistics for the separation of mixed speech sources. The novelties of the
paper consist in the use of Generalized Gaussian Mixture Model (GGMM) for speech
separation frame by frame and cross-correlation coefficient for distributed parameter
selection. These are done frame-by-frame, which provides a dynamically changing
time-frequency masking. The proposed model can be extended to audio enhancement.
Our objective quality evaluation experiments demonstrate the effectiveness of the
proposed methods and show significant quality improvements over the conventional
ICA and dual ITD based methods.

Keywords: interaural time difference (ITD) statistics; Generalized Gaussian Mixture


Model; cross-correlation coefficient; time-frequency mask.

1 Introduction

Despite great progress in recent decades, the performance of sound separation systems is still not good enough compared to the human auditory system. For instance, the 'cocktail party' effect demonstrates the ability of human listeners to segregate the voice of a particular speaker from other spatial sources, such as unrelated speakers, background music, and environmental noise. A common example of the well-known 'cocktail party' problem is the situation in which the voices of two speakers overlap. How to solve the 'cocktail party' problem and obtain the enhanced voice of a particular speaker by machine has attracted serious attention from researchers.
Speech can be separated in the time domain or the frequency domain. A time domain algorithm can obtain more independent separated signals, but it requires a large amount of computation and has poor convergence. A frequency domain algorithm usually uses the STFT (short-time Fourier transform) to transform the speech into the frequency domain; it changes the ordering of the original signal, requires power normalization, and is more complex than time domain algorithms.

*Corresponding author: Miao FAN, Department of Electrical Engineering, Graduate School at


Shenzhen, Tsinghua University, Shenzhen, China, E-mail: fanm14@mails.tsinghua.edu.cn
Jia-min MAO, Jao-gui DING, Wei-feng LI, Department of Electrical Engineering, Graduate School at
Shenzhen, Tsinghua University, Shenzhen, China

Depending on the number of microphones, speech separation can be divided into single-channel, dual-channel, and multi-channel speech separation. The dual-channel model of the speech signal is the most similar to the human auditory system, so dual-channel speech separation algorithms are more suitable for solving practical problems. In this paper, we use the ITD between two microphones to separate speech, which is one of the dual-channel speech separation methods.
In [1], Chanwoo Kim et al. present a new algorithm which selects a fixed ITD
threshold across the whole utterance by minimizing the correlation of nonlinearity
power from the masked and non-masked spectral regions. Instead of a fixed
threshold, in [2], the authors employed a statistical modeling of angle distributions
together with a channel weighting to determine which signal components belong to
the target signal and which components are part of the background noise. In [3], the
author presents the Laplace mixture model to fit the distribution of the ITD statistics.
In this paper we present a new ITD-statistics-based technique capable of separating speech signals captured by two microphones. Moreover, there are two further contributions: 1) we employ the Generalized Gaussian Mixture Model for estimating the ITD statistics, which provides different PDFs of the target and interfering sources at a given frame; and 2) we use the correlation coefficient to select the best distribution. The framework of our approach is illustrated in Figure 1.

Figure 1. Block diagram of the proposed approach. STFT: Short-Time Fourier Transform, ITD: Interaural Time Difference, GGMM: Generalized Gaussian Mixture Model, IFFT: Inverse Fast Fourier Transform, OLA: Overlap-Add.

This paper is organized as follows. In Section 1, we introduce the background and


some common algorithms of speech separation. Then we build the time difference
model in Section 2. In Section 3 and Section 4, our approach is described in detail. In
Section 5, we use the experimental results to demonstrate the effectiveness of the proposed method. Finally, we draw our conclusions in Section 6.

2 Time Difference Model

We suppose that there are I (I = 2) sources in a sonic environment. The signals from the two microphones are defined respectively as:

(1)

where the two coefficients denote the weights of the recordings of the i-th source at the left and right microphones, respectively, and the delay term is the time delay of arrival (TDOA) of the i-th source between the two microphones. By the short-time Fourier transform (STFT), the signals can be expressed as:

(2)

where m is the frame index and the angular frequency of bin k is 2πk/K. Here k and K are the frequency index and the total number of frequency bins, respectively. As the weighting coefficients do not affect the phase difference, the time delay measured at a particular time-frequency point [m, k] can be expressed as:

(3)

where ∠· denotes the phase of a signal and r is an integer that keeps the phase difference within [−π, π].
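As a hedged illustration of Eq. (3) (the equation itself is not reproduced above), the per-bin delay can be estimated from the phase difference of the two microphones' STFTs as in the following NumPy sketch; array names and shapes are assumptions, not the paper's notation.

import numpy as np

def itd_statistics(X_left, X_right, n_fft, fs):
    """X_left, X_right: complex STFTs of shape (frames, n_fft // 2 + 1)."""
    k = np.arange(1, X_left.shape[1])            # skip k = 0 (zero frequency)
    omega_k = 2.0 * np.pi * k / n_fft            # angular frequency per sample
    # angle() of the cross term wraps the phase difference into [-pi, pi]
    phase_diff = np.angle(X_left[:, 1:] * np.conj(X_right[:, 1:]))
    tau_samples = phase_diff / omega_k           # delay in samples per [m, k]
    return tau_samples / fs                      # delay in seconds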

3 Proposed approach

Several mixture models have been used to obtain the distribution characteristics, e.g. the Gaussian Mixture Model (GMM) [4] and the Laplacian Mixture Model (LMM). As both the Gaussian distribution and the Laplace distribution are special cases of the generalized Gaussian distribution, we can utilize the Generalized Gaussian Mixture Model (GGMM) to obtain a more precise fitting. In our GGMM, τ[m, k] is modeled as:

(4)

where the model is specified by the set of Generalized Gaussian Mixture Model parameters (the mixture weights, means, scale parameters, and the shape parameter λ).

More specifically, the component densities take the generalized Gaussian form with the parameters defined above. The E-step [5] calculates the conditional expectation of the log-likelihood of the complete data:

(7)

where the posterior responsibility at each time-frequency point [m, k] can be obtained by the Bayesian principle:

(8)

The M-step includes updating the collection of all unknown parameters. We have

(9)

(10)

(11)

Normally, the parameter λ can be updated following [6] using the Newton-Raphson method. Considering that λ = 1 and λ = 2 correspond to the Laplace and Gaussian distributions respectively, here we optimize λ over the whole speech signal instead of each frame.
The algorithm of GGMM is summarized in Algorithm 1. The detailed information
about the selection of λ will be described in the following section.

Algorithm 1 GGMM

1) Initialize the parameter estimates.
2) Update the parameter values using the update Eqs. (9), (10), and (11).
3) Repeat 2) until the log-likelihood function (7) converges.
4) Select the optimal λ using (18).
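Since the component densities of Eqs. (5)-(8) are not reproduced above, the following sketch only illustrates the E-step responsibilities of a generalized Gaussian mixture under the standard density form; the symbol names (w, mu, alpha, lam) are illustrative, not the paper's notation.

import numpy as np
from scipy.special import gamma

def gg_pdf(x, mu, alpha, lam):
    # Generalized Gaussian density; lam = 1 gives Laplace, lam = 2 Gaussian-like
    coef = lam / (2.0 * alpha * gamma(1.0 / lam))
    return coef * np.exp(-(np.abs(x - mu) / alpha) ** lam)

def e_step(tau, w, mu, alpha, lam):
    """tau: ITD samples; w, mu, alpha: arrays of length I (mixture size)."""
    lik = np.stack([w_i * gg_pdf(tau, m_i, a_i, lam)
                    for w_i, m_i, a_i in zip(w, mu, alpha)])
    # Normalize over components to obtain the posterior responsibilities
    return lik / np.maximum(lik.sum(axis=0, keepdims=True), 1e-12)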

4 Source Separation

After obtaining the probabilistic fittings of the ITD, we adopt the masking method to separate the target and interfering sources, in which determining the masking weights at a given time-frequency point is very important. In our studies, we employ the Likelihood Ratio Criterion (LRC) to generate a binary mask. Two hypotheses H0 and H1, which indicate whether a source plays a dominant role in the mixture, can be described as:
H0: the target is dominant
H1: the interference is dominant
The LRC suggests the following decision rule in the GGMM:

(12)

where the superscript G indicates the likelihood term associated with the GGMM, and we use the subscripts i, κ to represent different sources. Let

(13)

be the mask indicator function of source i for time-frequency point [m, k]. Then the
speech of source i can be separated as:

(14)

where [m, k] is defined as:

(15)
Motivated by [1], here we perform an exhaustive search to find the optimal λ using the
cross-correlation coefficient,

(16)

where the quantities in (16) are defined as:

(17)

with the constant set to 0.3. The optimal λ is then obtained by minimizing the cross-correlation coefficient:

(18)

Finally, we obtain the separated speech waveforms using the Inverse Fast Fourier Transform (IFFT) and overlap-add (OLA).
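A minimal sketch of this masking-and-reconstruction step (assuming SciPy and illustrative variable names): the likelihood-ratio decision of Eq. (12) yields a binary mask over the time-frequency points, which is applied to one channel's STFT before inverse transformation with overlap-add.

import numpy as np
from scipy.signal import stft, istft

def separate_target(x_ref, fs, loglik_target, loglik_interf,
                    nperseg=1024, noverlap=768):
    """loglik_* must have the same (frequency, frame) shape as the STFT below."""
    # STFT of one reference channel, 1024-sample window with 75% overlap
    _, _, X = stft(x_ref, fs=fs, nperseg=nperseg, noverlap=noverlap)
    # Binary mask: keep the bins where the target hypothesis dominates
    mask = (loglik_target >= loglik_interf).astype(float)
    # Apply the mask and reconstruct the waveform via inverse STFT / overlap-add
    _, y = istft(X * mask, fs=fs, nperseg=nperseg, noverlap=noverlap)
    return y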

Figure 2. The illustration of microphone and source placement.

5 Experimental Evaluations

Dual-channel distorted speech signals were used to evaluate our proposed algorithm. The source signals (100 sentences) were 2 s long recordings obtained by concatenating sentences randomly drawn from the TIMIT database [7] at a 16 kHz sampling rate. The experiments were conducted in simulated reverberant environments in which the target speaker is masked by a single interfering speaker. The distance between the two microphones is 2.05 cm. Reverberation was simulated using the Room Impulse Response (RIR) open-source software package [8] based on the image method. In the experiments in this section, we assumed room dimensions of 6 × 4 × 2.5, with the microphones located at the center of the room. The reverberation time is about 0.1 s. For all speakers, the distance between the measurement positions (speaker locations) and the center of the microphones is 1.5 m. We generated 100 mixtures for each of the two environments (S1 is the target source, and S2 and S3 are the interfering sources, respectively). We evaluate the quality of the recovered speech using the Perceptual Evaluation of Speech Quality (PESQ). In addition, the performance is measured by the composite objective measures Csig, Cbak and Covl [9].
With an overlap of 75%, we set the window length to 1024 samples. In order to obtain enough information about τ, statistics are gathered over every 2N + 1 frames; namely, when we obtain τ[m] for frame m, the statistics are accumulated over its neighbouring frames. Here we set N to 2. Furthermore, we impose the constraint that the angle between the target source and the interfering source is larger than 3°. When initializing the update equations, K-means can be used to obtain rough initial values of µ.
We compare our approach with other existing dual-channel speech separation approaches. For convenience, the compared approaches are referred to as GMM, LMM [3], AUTO [1] and SMAD [2] in our experiments. Figure 3 and Figure 4 show the separation results for S1. We do not assume that the probability density functions (PDFs) of the spectrum are Gaussian or non-Gaussian. The results of AUTO are better than those of GMM, LMM, and SMAD. AUTO uses a fixed threshold obtained by minimizing the correlation coefficient, and the scope of this threshold is limited, because if one separated signal contains both speakers and the other contains almost nothing, the correlation coefficient will also be small. The nearer the two sources, the more likely this situation is. As a result, the overall performance of AUTO is good, while its performance for S1 in the S1S2 situation is poor. The statistics-based methods avoid this drawback of AUTO.
The results also indicate that the performance of SMAD for S1 is poor in both situations. Unlike traditional ITD-based models, SMAD is based on statistical angles, which requires the two microphones to be very close together, a condition our database does not strictly meet. As illustrated in [3], LMM performs better than GMM, which indicates that the value of the shape parameter affects the separation performance. Our proposed method performs better than the other methods in both the S1S2 and S1S3 situations.

Figure 3. The results of S1S2.



Figure 4. The results of S1S3.

6 Conclusions

In this paper we have proposed a novel source separation method which can calculate a suitable threshold to separate mixed speech signals. Our method, for the first time, employs the Generalized Gaussian Mixture Model (GGMM) to estimate the statistical information about the ITD in each frame. Using the GGMM, a rough expression of the probability density function (PDF) of the ITD can be obtained. The accurate expression of the PDF is then obtained by using the correlation coefficient, and a masking filter is calculated based on the so-obtained probability distributions. Objective evaluations on two-source speech separation demonstrated the effectiveness of our proposed methods in terms of the Perceptual Evaluation of Speech Quality (PESQ) score, Csig, Cbak, and Covl.

References
[1] C. Kim, R. M. Stern, K. Eom, and J. Lee, “Automatic selection of thresholds for signal separation
algorithms based on interaural delay,” in INTERSPEECH, 2010, pp. 729–732.
[2] C. Kim, C. Khawand, and R. M. Stern, “Two-microphone source separation algorithm based
on statistical modeling of angle distributions,” in IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP). IEEE, 2012, pp. 4629–4632.

[3] M. Cobos, J. Lopez, and D. Martinez, “Two-microphone multi-speaker localization based on a


laplacian mixture model,” Digital Signal Processing, vol. 21, no. 1, pp. 66– 76, 2011.
[4] G. Xuan, W. Zhang, and P. Chai, “EM algorithms of Gaussian mixture model and hidden Markov model,” in Image Processing, 2001. Proceedings. 2001 International Conference on. IEEE, 2001, vol. 1, pp. 145–148.
[5] G. McLachlan and T. Krishnan, The EM algorithm and extensions, vol. 382, John Wiley & Sons,
2007.
[6] M. S. Allili, “Wavelet modeling using finite mixtures of generalized gaussian distributions:
Application to texture discrimination and retrieval,” Image Processing, IEEE Transactions on,
vol. 21, no. 4, pp. 1452–1464, 2012.
[7] V. Zue, S. Seneff, and J. Glass, “Speech database development at MIT: TIMIT and beyond,” Speech Communication, vol. 9, no. 4, pp. 351–356, 1990.
[8] S. T. Neely and J. Allen, “Invertibility of a room impulse response,” The Journal of the Acoustical
Society of America, vol. 66, no. 1, pp. 165–169, 1979.
[9] Y. Hu and P. C. Loizou, “Evaluation of objective quality measures for speech enhancement,”
Audio, Speech, and Language Processing, IEEE Transactions on, vol. 16, no. 1, pp. 229–238,
2008.
Zhi-qiang LI, Sai CHEN*, Wei ZHU, Han-wu CHEN
A Common Algorithm of Construction a New Quantum
Logic Gate for Exact Minimization of Quantum
Circuits
Abstract: Since non-permutative quantum gates have more complex rules than permutative quantum gates, their direct use should be avoided in an efficient synthesis algorithm because they are very hard to synthesize. The key method is to use quantum gates to create new permutative quantum gates that replace non-permutative quantum gates. In this paper, we propose an algorithm that uses CNOT and non-permutative quantum gates to construct a new optimal quantum logic gate library automatically. Our method, based on the idea of exhaustion, finds all combinations of quantum logic gates with lower quantum cost, no matter how many quantum lines there are.

Keywords: automatically; non-permutative quantum gates; exhaustive

1 Introduction

Cascades and combinations of quantum logic gates are the basic elements of reversible quantum logic circuits, from which a quantum computer is constructed. According to the characteristics of the input and the output, quantum gates can be divided into non-permutative quantum gates and permutative quantum gates. In a quantum logic circuit, if the input is logical, the output must be logical, and vice versa. But even when the input and output are all logical, the internal circuit allows the superposition of quantum information and quantum entanglement, that is, non-permutative values. If only permutative quantum logic gates are used in a quantum logic circuit, the synthesis algorithms are like classic reversible logic synthesis algorithms. If, in addition, non-permutative quantum gates are used, the superposition of information makes the synthesis process more complicated and lowers its performance. Non-permutative quantum gates, such as those in the NCV quantum gate library (including NOT gates [1], controlled-NOT gates and controlled-square-root-of-NOT gates [2]), are used to construct new permutative quantum gates and gate libraries.

*Corresponding author: Sai CHEN, College of Information Engineering, Yangzhou University Yangzhou,
China, E-mail: cs386978@126.com
Zhi-qiang LI, Wei ZHU, College of Information Engineering, Yangzhou University, Yangzhou, China
Han-wu CHEN, School of Computer Science and Engineering, Southeast University,Nanjing, China

To reduce the cost of the circuits, excellent combination and optimization techniques are key. The essence of constructing lower-cost quantum logic gates is reversible logic synthesis [3]. With further research on reversible circuits, many synthesis methods for reversible circuits have appeared [4-8]. The ultimate goal of quantum reversible logic synthesis algorithms is to efficiently construct optimum quantum logic circuits and to automatically design reversible quantum logic circuits with lower cost. However, these methods were generally designed to optimize entire logic circuits, and research on constructing the foundational quantum logic gates is scarce. The optimization of the quantum logic gates directly affects the optimization of the entire quantum logic circuit. If the quantum logic gates can be optimized automatically, a synthesis algorithm using these gates will have better performance and minimum cost, which plays an important role in the optimization of the entire circuit.
A lot of research has been done and many quantum circuit synthesis algorithms have been put forward, most of them 3-qubit synthesis algorithms based on permutative quantum logic gates [9-13]. However, algorithms based on non-permutative quantum gates are few; there are currently several synthesis algorithms based on NCV gates [14-17]. Although a variety of 4-qubit algorithms have been proposed, these algorithms are still based on permutative quantum logic gates. In [18], new quantum logic gates are constructed by using the NCV gate library to compose four types of Peres-like gates. But this method is not universal, and can only be used when the number of lines is small. If the number of quantum lines increases, for example beyond five, this method cannot be realized.
Therefore, in this paper we propose a universal algorithm to generate optimal new quantum logic gates with any number of lines automatically. Here optimal means that the new quantum logic gates cannot be further decomposed into several cascaded quantum logic gates of equivalent quantum cost.

2 Background

A quantum gate is the basic unit of quantum information processing, and cascades of quantum gates constitute a quantum circuit. A quantum circuit is reversible. In quantum computation, a quantum gate corresponds to a unitary transformation.
It is well known that the operation of each gate in an n-line reversible or quantum circuit can be represented by a square matrix of dimension $2^n$. The matrix of the NOT gate is $N = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ represents the identity circuit.
Figure 1 shows the NOT gate, the controlled-NOT gate and the Toffoli gate [19]. They are all typical permutative quantum gates.

Figure 1. The permutative quantum gates: (a) the NOT gate, (b) the controlled-NOT gate, and (c) the Toffoli gate.

The controlled-square-root-of-NOT gates comprise the controlled-V (CV) gate and the controlled-V+ (CV+) gate, as shown in Figure 2. They are both typical non-permutative quantum gates.

Figure 2. Basic quantum algebra rules for CV/CV† gates.

If the input and the output are both logical, the gate must be a permutative quantum gate. If the input is not logical and the output is logical, the gate must not be a permutative quantum gate, and vice versa. If neither the input nor the output is logical, it cannot be decided whether the gate is a permutative quantum gate. For example, for the controlled-NOT gate in Figure 1 (b), set $x_2$ to 1 and $x_1$ to $v_0 = \frac{1}{2}\begin{pmatrix} 1+i \\ 1-i \end{pmatrix}$; then $x_1 \oplus x_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \cdot \frac{1}{2}\begin{pmatrix} 1+i \\ 1-i \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1-i \\ 1+i \end{pmatrix}$. We can clearly see that the result is not logical. If the gate is replaced by a controlled-V gate, with $x_2$ set to 1 and $x_1$ set to $\frac{|0\rangle + |1\rangle}{\sqrt{2}}$, the result is $\frac{1}{2}\begin{pmatrix} 1+i & 1-i \\ 1-i & 1+i \end{pmatrix} \cdot \frac{|0\rangle + |1\rangle}{\sqrt{2}}$, which is clearly also not logical.

3 The common algorithm for realizing a new quantum logic gate

In [18], new quantum logic gates were constructed by using the NCV gate library to compose four types of Peres-like gates. These new quantum logic gates, together with the NOT gates and controlled-NOT gates, constituted the new quantum logic gate library (NCP4). The NCP4 gate library can construct optimal 3-qubit quantum logic circuits equivalent to those constructed with the NCV gate library. That means the functions of the two methods are the same, but the construction methods are different; we can also say that the two gate libraries are equivalent when synthesizing 3-qubit logic circuits. Figure 4 shows a new quantum logic gate constructed by hand when there are five quantum lines. This construction uses 11 CNOT gates in total. Table 1 lists all kinds of inputs. We can clearly see that on each line only eight U gates are used, so we can be sure that the U gates are controlled-Kth-root-of-NOT gates with K = 8.

Table 1. All kinds of inputs in fig. 4

X3X2X1X0 U1 U2 U3 U4 U5 U6 U7 U8 U9 U10 U11 U12 U13 U14 U15

0000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0001 1 0 0 0 0 0 1 0 0 1 1 1 1 1 1
0010 0 1 0 0 0 1 0 1 1 0 0 1 1 1 1
0011 1 1 0 0 0 1 1 1 1 1 1 0 0 0 0
0100 0 0 1 0 1 0 0 1 1 1 1 0 0 1 1
0101 1 0 1 0 1 0 1 1 1 0 0 1 1 0 0
0110 0 1 1 0 1 1 0 0 0 1 1 1 1 0 0
0111 1 1 1 0 1 1 1 0 0 0 0 0 0 1 1
1000 0 0 0 1 1 1 1 0 1 0 1 0 1 0 1
1001 1 0 0 1 1 1 0 0 1 1 0 1 0 1 0
1010 0 1 0 1 1 0 1 1 0 0 1 1 0 1 0
1011 1 1 0 1 1 0 0 1 0 1 0 0 1 0 1
1100 0 0 1 1 0 1 1 1 0 1 0 0 1 0 1
1101 1 0 1 1 0 1 0 1 0 0 1 1 0 0 1
1110 0 1 1 1 0 0 1 0 1 1 0 1 0 0 1
1111 1 1 1 1 0 0 0 0 1 0 1 0 1 0 1

Figure 4. The new quantum logic gate with four control lines, constructed by hand.

Because the circuits have few lines and follow certain rules, we can construct them by hand. But when the number of lines of the new quantum logic gates increases, this method becomes difficult to implement, and a common algorithm is needed.
In [20], new permutative quantum gates were constructed from non-permutative gates by using controlled-Kth-root-of-NOT gates. The article first absorbs the idea of the Gray code, in which any two adjacent codes in a set of binary numbers differ by only one bit, to construct new permutative quantum gates using controlled-NOT gates and controlled-Kth-root-of-NOT gates. It was then found that these quantum logic gates have a recursive structure inside; based on this, a recursive construction was presented that directly and efficiently constructs the same new quantum logic gates without using the Gray code. Figure 5 shows the new quantum logic gate constructed by the recursive construction when there are five quantum lines. This construction uses 14 CNOT gates in total.
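As a small illustration of the Gray-code property used in [20] (not code from that paper), an n-bit Gray code in which consecutive entries differ by exactly one bit can be generated as follows.

def gray_code(n: int):
    # The i-th Gray code is i XOR (i >> 1); neighbouring codes differ in one bit
    return [i ^ (i >> 1) for i in range(2 ** n)]

# Example: gray_code(3) -> [0, 1, 3, 2, 6, 7, 5, 4]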

Figure 5. The new quantum logic gate with four control lines was constructed by the recursive
construction.

Thus it can be seen that when there are five quantum lines, the manual construction uses 3 fewer CNOT gates than the recursive construction. This is because the manual method obtains x3, x2, x1 and x0 directly, whereas in the recursive method only x3 is obtained directly and x2, x1 and x0 are obtained by XOR of quantum bits. Consequently, when constructing new quantum logic gates with n+1 quantum lines, the manual construction uses n-1 fewer CNOT gates than the recursive construction.
It can also be proved that the recursive method can construct a new quantum logic gate with any number of quantum lines, but its quantum gate cost is not minimal. Although the quantum gate cost of the manual construction is minimal, that method works only when the number of quantum lines is less than six. For this reason, similar to [18], we propose a common exhaustive algorithm to find all potential new quantum logic gates. These new logic gates can then efficiently synthesize optimum reversible logic circuits.
For example, to construct a new quantum logic gate with n+1 quantum lines, the first step is to construct the quantum gate library for n+1 quantum lines; here we do not consider the case where the target is above the control. In the second to fourth steps, the identity circuit is pushed onto a stack and the stack pointer points to the bottom. In step five, as long as the stack is not empty, the program loops. Steps six and seven assign the circuit on the stack to c1 and set the flag bok, which indicates whether the circuit adds additional gates, to false. In steps eight to twenty-four, the program loops over the quantum gate library: the current circuit c1 is cascaded with a gate from the library and the flag bok is set to true. The program then determines whether the output function of the new circuit c2 has already appeared before. If it has, bok is set to false and the program leaves this cycle, re-cascading the next gate from the quantum gate library from step nine and repeating steps eleven to sixteen. If it has not appeared, the program executes steps seventeen to twenty-four: if bok is true and the gate count of the circuit is $2^n - n - 1$, which is the minimum number of gates needed to construct the desired circuit, then the circuit on the stack is the one we want and the program exits the loop. In steps twenty-six to thirty, if there is no next gate in the library, the program removes the gate preceding the current gate from the stack and cascades the gate after it, and then starts again from step five.

Algorithm A common construction for a new quantum logic gate

Input: n+1
Output: optimal novel quantum logic gates based on CNOT gates

1: Lib[n(n-1)/2][2];
2: itop = 0;
3: stack[itop] = identity circuit;
4: idx_gate = 0;
5: while itop >= 0 do
6:     c1 = stack[itop];
7:     bok = false;
8:     for i = idx_gate to n(n-1)/2 - 1 do
9:         c2 = c1 cascaded with the i-th quantum gate in Lib;
10:        bok = true;
11:        for j = 0 to itop do
12:            if the function of c2 == the function of stack[j] then
13:                bok = false;
14:                break;
15:            end if
16:        end for
17:        if bok is true then
18:            stack[++itop] = Lib[i];
19:            if itop == 2^n - n - 1 then
20:                idx_gate = 0;
21:                return stack[1], stack[2], ..., stack[itop];
22:            end if
23:            break;
24:        end if
25:    end for
26:    if bok is false then
27:        c3 = stack[itop--];
28:        idx_gate = the index of c3's next gate in Lib;
29:    end if
30: end while
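The listing above can be read as a depth-first search with pruning of repeated circuit functions. The following Python sketch shows the same idea in a generic form; apply_gate, the gate library, and the function representation are placeholders for the paper's CNOT / controlled-Kth-root-of-NOT semantics and are not part of the original algorithm.

def search_new_gate(library, n, identity, apply_gate):
    # Target length from the paper: 2^n - n - 1 gates for n control lines
    target_len = 2 ** n - n - 1

    def dfs(circuit, functions):
        if len(circuit) == target_len:
            return list(circuit)                     # found a candidate cascade
        for gate in library:
            f = apply_gate(functions[-1], gate)      # function of the extended prefix
            if f in functions:                       # prune prefixes whose function repeats
                continue
            result = dfs(circuit + [gate], functions + [f])
            if result is not None:
                return result
        return None

    return dfs([], [identity])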

4 The results and analysis of experiments

We do experiments to compare the manual method, recursive method and the


algorithm proposed in this paper. The results are shown in Table 2.

Table 2. The results of the experiment

Number of quantum lines    CNOT gates (our method)    CNOT gates (recursive method)    CNOT gates (manual method)

4    4    6    4
5    11    14    11
6    26    30    No solution
7    57    62    No solution
8    120    126    No solution
9    247    254    No solution

From Table 2 we can clearly see that when there are n+1 quantum lines, our algorithm always uses n-1 fewer CNOT gates than the recursive construction, and that the manual method cannot be realized when the number of quantum lines is greater than five.

Acknowledgment: The work was supported by the National Natural Science


Foundation of China under Grant 61070240 and Grant 61170321; and the Specialized
Research Fund for the Doctoral Program of Higher Education under Grant
20110092110024. The authors would like to thank Portland State University Quantum
Logic Group for having useful discussions.

References
[1] Z. Li, H. Chen, and X. Song, A Novel Hash-based Algorithm for Reversible LogicCircuits
Synthesis, Journal of Computational Information Systems,2012, 8(11): 4485–4493.
[2] A. Barenco, C. Bennett, R. Cleve, D. DiVinchenzo, M. Margolus,P. Shor, T. Sleator, J. Smolin,
and H. Weinfurter,Elementary gates for quantum computation, Physical Review A, 1995,
52(5):3457-3467.
[3] V VShende,A K Prasad,I L Markov,et al.Synthesis of reversible logic circuits[J].IEEE Trans on
Circuits and Systems-I,2003,22(6):723-729.
[4] SongX, YangG, PerkowskiM, WangY. Algebraic characteristics of reversible gates[J]. Theory of
Computing Systems, 2006,39(2):311-319.
[5] W Q Li, H W Chen,Z Q Li.Application of semi-template in reversible logic circuit[A].Proceedings
of the 11th International Conference on CSCWD[C].Melbourne,Australia,2007:155-161.
[6] Zhiqiang Li, Hanwu Chen, Baowen Xu, WenjieLiu,Xiaoyu Song, XilinXue, Fast algorithm for
4-qubit reversible logic circuits synthesis, IEEE World Congress on Computational Intelligence
(WCCI2008), Hong Kong, 2008:2202-2207.

[7] Wan S,Chen H, Cao R.A novel transformation-based algorithm for reversible logic synthesis[A].
Proceedings of Advances in Computation and Intelligence[C].2009.70-81.
[8] Zhiqiang Li, Chen, Hanwu; Song, Xiaoyu, Perkowski, Marek, A Synthesis Algorithm for 4-Bit
Reversible Logic Circuits with Minimum Quantum Cost, ACM Journal on Emerging Technologies
in Computing Systems, 2014, 11(3):1-19.
[9] Miller. D M, Wille R, Sasanian Z. Elementary quantum gate realizations for multiple-control
toffoli gates [C]//Proceedings of 41st IEEE International Symposiumon Multiple-Valued Logic.
Tuusula, finland: IEEE, 2011: 288-293.
[10] Liu Y, Long G L, Sun Y. Analytic on e-bit and CNOT gate constructions of general n-qubit
controlled gates [J]. International Journal of Quantum Information, 2008, 6(3):447-462.
[11] Tsai E, Perkowski M. Synthesis of permutative quantum circuits with toffoli and TISC gates [C]
//Proceeddings of IEEE 42nd International Symposium on Multiple-Valued Logic. Victoria, BC,
Canada: IEEE, 2012: 50-56.
[12] Sasanian Z, Miller D M. Transforming MCT circuits to NCVW circuits [C] //Workshop on
Reversible Computation 2011. Gent Belgium Berlin, Heidelberg: Springer, 2011: 163-174.
[13] Hung W N N, Song X, Yang G, etal. Optimal synthesis of multiple out putBoolean functions
using a set of quantum gates by symbolic reach ability analysis [J]. IEEE, Transactions on CAD,
2006, 25(9):1652-1663.
[14] Yang G W, Hung W N N, Song X, et al. Exact synthesis of 3-qubit quantum circuits from
non-binary quantum gates using multiple-valued logic and group theory, Proceedings of the
conference on Design, Automation and Test in Europe-Volume 1. IEEE Computer Society, 2005:
434-435.
[15] Li Zhiqiang, Chen Hanwu, Liu Wenjie, Liu W, etal. Efficient algorithm for synthesis of optimal
NVC 3-qubit reversible circuits using new quantum logic gate library [J]. Acta Electronica Sinica,
2013, 41(4):690-697.
[16] Maslov D, Miller D M. Comparison of the cost metrics through investigation of the relation
between optimal NCV and optimal NCT three-qubit reversible circuits [J]. IET Computers &
Digital techniques, 2007,1(2):90-104
[17] Yang G W, Song X, Perkowski M, etal.Four-level realization of 3-qubit reversible function [J]. IET
Computers &Digial Techniques, 2007, 1(4):382-388.
[18] Li, Zhiqiang; Song, Xiaoyu; Perkowski, Marek; Chen, Hanwu; Feng, Xiaoxia, Realization of a new
permutative gate library using controlled-kth-root-of-NOT quantum gates for exact minimization
of quantum circuits, International Journal of Quantum Information, 2014, 12(5):2418–2420.
[19] ToffoliT. Reversible computing[J/OL]. Technical Memo MIT-LCS-TM-151, MIT Lab for Comp. Sci.
New York: Springer, 1980.
[20] Li, Zhiqiang, Feng xiaoxia, Chen Hanwu. Realization of Toffoli-like gates using controlled-kth-
root-of-not quantum gates, Journal of Data Acquisition & Processing. 2014, 29(06):975-980.
Qian WANG, Xiao-guang REN*, Li-yang XU, Wen-jing YANG
A Post-Processing Software Tool for the Hybrid
Atomistic-Continuum Coupling Simulation
Abstract: The hybrid atomistic-continuum (HAC) coupling fluid simulation can achieve both simulation accuracy and efficiency. The post-processing procedure of the HAC coupling simulation is quite important in the whole multi-scale simulation process and requires the support of an efficient, user-friendly interface tool. To meet the demand for efficiently analyzing coupling simulation data and visualizing results, in this paper we design and implement a visualized post-processing framework based on the SALOME software platform for the unified HAC coupling simulation. The experimental verification indicates that our post-processing tool can offer efficient, easy-to-use post-processing for the HAC coupling simulation and effectively improve the analysis of the coupling simulation results.

Keywords: hybrid atomistic-continuum; post-processing; SALOME;

1 Introduction

Generally, the accuracy of a fluid simulation is in inverse proportion to the scale of the model abstraction [1]. When the scale of the simulation shrinks to the molecular mean free path, the continuity assumption governed by the Navier–Stokes equations is no longer accurate [2]. In order to simulate physical problems with a large length scale as well as to capture microscopic physical phenomena, the so-called hybrid atomistic-continuum (HAC) multi-scale simulation method has been proposed [3, 4].
The key idea of the geometric-coupling-based HAC method is to split the simulation domain into a continuum region, an overlap region and an atomistic region [5, 6], and to use a continuum solver [7, 8] as well as an atomistic solver [9] to simulate the cases.
The post-processing, which combines and analyzes the output data of each simulation method, turns out to be an important component of the whole simulation software. In macroscopic CFD simulation, the physical property time-step files are

*Corresponding author: Xiao-guang REN, College of Computer, National University of Defense


Technology, State key Laboratory of High Performance Computing, National University of Defense
Technology, Changsha, China, E-mail: renxiaoguang@nudt.edu.cn
Qian WANG, Li-yang XU, Wen-jing YANG, College of Computer, National University of Defense
Technology, State key Laboratory of High Performance Computing, National University of Defense
Technology, Changsha, China

analyzed in the post-processing, while in MD simulation the post-processing resolves the configurations of particles at certain output time steps.
The post-processing tool is a subtle necessity for users of the coupling simulation. Two kinds of simulation methods are involved in one coupling simulation, and each outputs different data and file formats. Efficiently resolving these files helps experts analyze the simulation results and advance their research. The currently available post-processing tools, which are always single-scale, cannot meet the demands of the hybrid atomistic-continuum coupling simulation and other multi-scale couplings [10-12].
In this paper we present a hybrid atomistic-continuum coupling simulation
oriented post-processing tool based on the SALOME platform. First, we analyze
the basic software framework, execution mechanism and development method of
SALOME software. Second, based on these analyses and combining the statistical
methods, we design and implement a user-friendly and effective post-processing tool.
The main contributions are summarized as follows:
–– We design the main framework of the user-friendly post-processing tool under
the visualization demand of the hybrid atomistic-continuum coupling simulation
and the analysis of SALOME.
–– We implement the post-processing tool including the user interfaces and the core
functions, which would simplify the operations and improve the efficiency of the
post-processing.
–– We verify the post-processing tool through the benchmark case. The result
indicates that our post-processing tool is able to offer enough data analysis power
for users.

The rest of the paper is organized as follows. Section II presents the framework of
the post-processing tool. Section III gives the detailed implementation of the tool. Section IV
uses a benchmark case to show the efficiency of our post-processing tool. Section V
concludes the paper and outlines future research directions.

2 Framework of the Post-Processing Tool

In this section, we present the design details of our post-processing tool, including
the data averaging method, the underlying software framework, and the time sequence and classes
of the tool.

2.1 Data statistics from microscopic to macroscopic

Since there are two kinds of simulation methods, each of them outputs result files in a
different format. In order to analyze these in a unified way, we must transform the
microscopic data from the MD simulation into macroscopic data.
In the single-particle MD simulation, particle positions and velocities are
sampled from one sample bin over one sample duration. Because the smallest
unit of the CFD simulation is the mesh cell, we choose sample bins with the same volume
as the cell. The sampling velocity from one bin can be calculated by the following
equation:

\bar{u}_i = \left\langle \frac{1}{N_i} \sum_{k=1}^{N_i} v_k \right\rangle

where v_k is the velocity of the kth particle in bin i, N_i is the number of particles in bin i, and the bracket
represents the temporal average. The most widely used sampling average methods
are SAM and CAM [13]. We use the CAM method in our post-processing tool with the
following equation, where the temporal average is performed over S sampling points:

\bar{u}_i^{\mathrm{CAM}} = \frac{\sum_{s=1}^{S} \sum_{k=1}^{N_i^{s}} v_k^{s}}{\sum_{s=1}^{S} N_i^{s}}

The other microscopic properties can be handled using the same CAM method.
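
The difference between the two averaging schemes can be made concrete with a minimal numpy sketch, assuming per-sample arrays of bin particle counts and per-bin velocity sums (the array names and shapes are illustrative, not the tool's actual data structures):

    import numpy as np

    def sam_velocity(vel_sums, counts):
        """SAM: average the per-sample bin velocities, then average over the S samples.

        vel_sums: (S, n_bins, 3) array, sum of particle velocities per bin per sample
        counts:   (S, n_bins)    array, number of particles per bin per sample
        """
        per_sample = vel_sums / np.maximum(counts, 1)[..., None]  # avoid division by zero
        return per_sample.mean(axis=0)                            # temporal average over S samples

    def cam_velocity(vel_sums, counts):
        """CAM: accumulate velocity sums and particle counts over the S samples first,
        then divide once (the scheme adopted in the post-processing tool)."""
        total_vel = vel_sums.sum(axis=0)                          # (n_bins, 3)
        total_cnt = counts.sum(axis=0)                            # (n_bins,)
        return total_vel / np.maximum(total_cnt, 1)[..., None]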

2.2 Introduction of SALOME software

SALOME is an open-source software package that provides a generic pre-processing and
post-processing platform for numerical simulation [14]. It is based on an open and
flexible architecture made of reusable components and applies CORBA distributed-system
technology in its software architecture [15].

Figure 1. The software framework of SALOME.



As shown in Figure 1, all the modules communicate and exchange data with each other
through the CORBA protocol. The central modules are the KERNEL, GUI, GEOM, SMESH
and VISU modules. All these modules are completely independent and located in the root
folder of SALOME in the form of software packages. The KERNEL module provides an
interface for integration and is the base module. Further details can be found in [14].

2.3 Post-processing time sequence and classes

In this section, we design the time sequence of our post-processing tool. The time
sequence of the post-processing includes one kernel procedure, i.e. manipulating the
coupling output data files. We create the sequence diagram in Figure 2.
Based on the time sequence, we present the associated classes of the post-processing
tool, which include the classes of the key sequences and some assistant classes for
proper functioning. The main configuration process is as follows: we add
a DATASTDL class to the main framework of our tool [16]. The framework calls the
OnDATASTDL() function to initiate the DATASTDL class. The OnDATASTDL() function
calls the setMDinfilepath() function to get the MD output file path, the setCFDinfilepath()
function to get the CFD output file path, and the setoutfilepath() function to set the unified
output file path, and finally uses the MDTOVTK() function to transform the microscopic data
into macroscopic data.

Figure 2. Time sequence of post-processing.



3 Implementation of the Post-Processing Software Platform

In this section, we describe the implementation of our post-processing tool and give its details.

3.1 Description of data file structure

There are two kinds of output files in the post-processing procedure, i.e. the VTK
files of the CFD simulation and the OUT files of the MD simulation. Handling the
transformation of these data is the kernel task of the DATASTDL class. We describe the
structure of these two file types in the following.
–– The OUT files
The OUT files include the configuration of particles, the velocity of the particles, etc.
The file layout of the OUT files is shown in Table 1.

Table 1. Layout of OUT files

Item                    Keyword         Meaning
TIMESTEP                timestep        The time step of the output
NUMBER OF ATOMS         number          Total number of particles in one simulation
BOX BOUNDS xx yy zz     xlo xhi         Configuration of the simulation domain
                        ylo yhi
                        zlo zhi
ATOMS DETAIL MESSAGE    id              ID of particle
                        type            Particle type
                        xs ys zs        Positions of particle
                        vx vy vz        Velocities of particle
                        fx fy fz        Forces of particle
                        q               Charge of particle
                        mux muy muz     Moment of dipole

By default, the values of xs, ys and zs lie between 0 and 1 and are not the actual position
information: the file format maps the simulation box onto a standard cube of
length 1 with (0,0,0) as the origin. The xs, ys and zs are relative positions, which makes it
easier to combine the information with the CFD simulation. The vx, vy and vz are dimensionless
values and need a reasonable conversion to obtain the true values. The other physical
properties also need conversion.
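
A small illustrative conversion is sketched below, assuming the box bounds (xlo, xhi, ...) read from the OUT file header and a scalar velocity unit of the MD unit system; the function and variable names are hypothetical:

    import numpy as np

    def to_physical_positions(xs_ys_zs, bounds):
        """Map reduced coordinates in [0, 1] back to physical positions.

        xs_ys_zs: (N, 3) array of reduced coordinates from the OUT file
        bounds:   ((xlo, xhi), (ylo, yhi), (zlo, zhi)) from the BOX BOUNDS section
        """
        lo = np.array([b[0] for b in bounds])
        hi = np.array([b[1] for b in bounds])
        return lo + xs_ys_zs * (hi - lo)

    def to_physical_velocities(v_reduced, velocity_unit):
        """Rescale dimensionless velocities; velocity_unit depends on the MD unit system."""
        return v_reduced * velocity_unit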

–– The VTK files


The VTK file format is provided by the Visualization Toolkit (VTK) for reading and writing
data and is supported by many other visualization tools. It is the output format of the
CFD simulation results, and the data display module of the SALOME platform, i.e. the
PARAVIEW module, supports it. The unified data transformation process of the post-
processing tool statistically averages the relevant physical properties of the MD
simulation, generates VTK-format data, unifies the VTK files of the CFD and MD
simulations, and outputs the final VTK files to the PARAVIEW module for unified
display. The details of the VTK format can be found in [17].

3.2 Implementation of the user interface

The DATASTDL class is designed for post-processing data manipulation. We describe it
in detail in the following, as shown in Figure 3.

DATASTDL
  attributes: CFDinfilepath, MDinfilepath, outfilepath
  methods:    DATASTDL(), ~DATASTDL(), setMesh(), MDTOVTK(), SHELLTOVTK(),
              setMDinfilepath(), setCFDinfilepath(), setoutfilepath(),
              precalculate(), judge(), getsit()

Figure 3. DATASTDL class

The whole file-processing procedure takes one time step as a treatment cycle. The
choice of time step size affects the accuracy of the simulation and should be set to
a proper value. The data of the two simulation methods have no contact until the output
time point, as shown in Figure 4. The many MD output files and the one CFD output file are
merged into one final file. Generally, the time step of CFD is much larger than that of MD,
e.g., by a factor of 500. As shown in Figure 5, there may be as many as 500 associated MD
files for a single CFD file, and the post-processing tool has to be able to handle all of them.

Figure 4. Single time step data manipulating

Figure 5. Time advancing coupling output

The kernel implementation of the post-processing tool is the MDTOVTK() function.


This function reads the output data from the MD output files, samples and averages the data,
and generates VTK-format data to combine with the CFD output files.

Table 2. The pseudo code of the MDTOVTK() function

MDTOVTK()
 1  declare variables in the function
 2  for each coupling output step (lines 3–15) until finished
 3      generate the file paths
 4      open the CFD and MD files using fstream
 5      initiate the data structures
 6      for each MD output file of this step (lines 7–12)
 7          generate the MD output file name
 8          open the MD file
 9          read the total particle number
10          read the boundary of the simulation domain
11          use precalculate() to set partial variables
12          sample and average
13      end
14      output the associated macroscopic data
15      output VTK files according to cell ID
16  end
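
The pseudo code above outlines the flow of the C++ implementation; the following Python sketch illustrates the same steps for a single coupling output step (the hypothetical reader read_out_file, the uniform cell size and the minimal legacy-VTK layout are assumptions, not the tool's exact implementation):

    import numpy as np

    def md_step_to_vtk(md_paths, out_path, bounds, cell_size):
        """Read a group of MD OUT files, bin and CAM-average particle velocities per cell,
        and write the macroscopic field as an ASCII legacy VTK file."""
        lo = np.array([b[0] for b in bounds])
        hi = np.array([b[1] for b in bounds])
        n_cells = np.ceil((hi - lo) / cell_size).astype(int)
        vel_sum = np.zeros((*n_cells, 3))
        count = np.zeros(n_cells)

        for path in md_paths:                       # loop over the S samples of this CFD step
            pos, vel = read_out_file(path, lo, hi)  # hypothetical reader returning physical units
            idx = np.floor((pos - lo) / cell_size).astype(int)
            idx = np.clip(idx, 0, n_cells - 1)
            for (i, j, k), v in zip(idx, vel):
                vel_sum[i, j, k] += v
                count[i, j, k] += 1

        cam_vel = vel_sum / np.maximum(count, 1)[..., None]   # CAM averaging over all samples

        with open(out_path, "w") as f:              # minimal legacy VTK structured-points file
            f.write("# vtk DataFile Version 3.0\nMD cell-averaged velocity\nASCII\n")
            f.write("DATASET STRUCTURED_POINTS\n")
            f.write("DIMENSIONS %d %d %d\n" % tuple(n_cells))
            f.write("ORIGIN %g %g %g\n" % tuple(lo))
            f.write("SPACING %g %g %g\n" % ((cell_size,) * 3))
            f.write("POINT_DATA %d\n" % count.size)
            f.write("VECTORS velocity float\n")
            for v in cam_vel.reshape(-1, 3):
                f.write("%g %g %g\n" % tuple(v))
        return cam_vel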

4 Experimental Verification

We verify our post-processing tool using the computation of the Couette benchmark
flow [18, 19] and illustrate the results through snapshots of the software, presenting the
functions and modules of the post-processing tool along the way. The decomposition of
the computation domain of the Couette benchmark is depicted in Figure 6.

Figure 6. Decomposition of the computation domain

The third button on the toolbar is the post-processing function that implements the coupling
data manipulation, as shown in Figure 7. When the user clicks the data manipulation
button, the tool first pops up a dialog box for the configuration path of
the MD output files. This dialog box is used to configure the storage path of the MD
simulation result files, as shown in Figure 8.
After configuring this storage path, the user also needs to configure the path that
stores the CFD result files, so that the tool can use it to manage the MD VTK
output files, as shown in Figure 9. The next step is to configure the path of the unified
post-processing output files, which are visualized together with the CFD output files by
Paraview, as shown in Figure 10. When all the file path configuration is finished, the
post-processing module manipulates the data using the different classes. In the bash shell,
detailed messages of the data manipulation are printed to the screen, which facilitates
monitoring, as shown in Figure 11.
In the end, we visualize the unified output data using the Paraview module. Because our
data processing is based on sampling and statistics, a certain amount of error in the data
is allowed. The validity of the data can be judged by whether the observed behavior
is consistent with the real physical phenomenon. In Figure 12, the upper part shows the
results of the CFD simulation, which is smoother than the lower MD part. Since the error
has to be controlled, the sampling area cannot be too small; the statistical area is therefore
relatively large, which makes the display less smooth. The data nevertheless follow a linear distribution,

which meets the analytical solution. The results show that our post-processing tool
can manipulate and visualize the coupling results correctly and efficiently.

Figure 7. Post-Processing configuration interface

Figure 8. Post-Processing configuration interface



Figure 9. Configuration storage path of the CFD output files

Figure 10. Unify post-processing output files path



Figure 11. Message output in the bash shell

Figure 12. Paraview visualization



5 Conclusion

In this paper we present a post-processing tool oriented to the hybrid atomistic-continuum
coupling simulation and based on the SALOME platform. We analyze the basic software
framework, execution mechanism and development method of the SALOME software,
and design and implement a user-friendly and effective post-processing tool. The
experimental results indicate that our post-processing tool is able to offer sufficient
data analysis power for users. In the future, we will merge the pre-processing
tool [16] and the post-processing tool into one software platform, offering an integrated
processing tool for users.

Acknowledgment: The authors declare that there is no conflict of interest regarding


the publication of this paper. The authors would like to thank the National Natural
Science Foundation of China (Grant no. 61303071 and 61120106005) and the Open
fund from State Key Laboratory of High Performance Computing (no. 201303-01,
201503-01 and 201503-02) for funding.

References
[1] T. M. Squires and S. R. Quake, “Microfluidics: Fluid physics at the nanoliter scale,” Reviews of
Modern Physics, vol. 77, pp. 977-1026, Jul 2005.
[2] C.-M. Ho and Y.-C. Tai, “Micro-electro-mechanical-systems (MEMS) and fluid flows,” Annual
Review of Fluid Mechanics, vol. 30, pp. 579-612, 1998.
[3] K. M. Mohamed and A. A. Mohamad, “A review of the development of hybrid atomistic-
continuum methods for dense fluids,” Microfluidics and Nanofluidics, vol. 8, pp. 283-302, Mar
2010.
[4] M. Kalweit and D. Drikakis, “Multiscale simulation strategies and mesoscale modelling of gas
and liquid flows,” IMA journal of applied mathematics, p. hxr048, 2011.
[5] S. T. O’Connell and P. A. Thompson, “Molecular dynamics-continuum hybrid computations: A
tool for studying complex fluid flows,” Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip
Topics, vol. 52, pp. R5792-R5795, Dec 1995.
[6] E. Flekkøy, G. Wagner, and J. Feder, “Hybrid model for combined particle and continuum
dynamics,” EPL (Europhysics Letters), vol. 52, p. 271, 2000.
[7] (2016). OpenFOAM. Available: http://www.openfoam.org/
[8] S. Zou, X. F. Yuan, X. J. Yang, W. Yi, and X. H. Xu, “An integrated lattice Boltzmann and finite
volume method for the simulation of viscoelastic fluid flows,” Journal of Non-Newtonian Fluid
Mechanics, vol. 211, pp. 99-113, Sep 2014.
[9] S. Plimpton, “Fast Parallel Algorithms for Short-Range Molecular Dynamics,” Journal of
Computational Physics, vol. 117, pp. 1-19, 3/1/ 1995.
[10] (2016). Paraview. Available: http://www.paraview.org/
[11] (2016). Ovito. Available: http://www.ovito.org/
[12] X.-H. Xu, X.-W. Guo, Y. Cao, X.-G. Ren, J. Chen, and X.-J. Yang, “Multi-scale simulation of
non-equilibrium phase transitions under shear flow in dilute polymer solutions,” RSC
Advances, vol. 5, pp. 54649-54657, 2015.

[13] M. W. Tysanner and A. L. Garcia, “Measurement bias of fluid velocity in molecular simulations,”
Journal of Computational Physics, vol. 196, pp. 173-183, May 1 2004.
[14] (2016). Salome: The Open Source Integration Platform for Numerical Simulation. Available:
http://www.salome-platform.org/.
[15] M. Henning and S. Vinoski, Advanced CORBA® programming with C++: Pearson Education,
1999.
[16] Q. Wang, W. Yang, F. Li, X. Ren, and Y. Tang, “HACP2: The Pre-processing Software Tool for the
Hybrid Atomistic-Continuum Coupling Simulation,” International Journal of Hybrid Information
Technology, vol. 9, pp. 301-318, 2016.
[17] W. J. Schroeder, L. S. Avila, and W. Hoffman, “Visualizing with VTK: a tutorial,” IEEE Computer
graphics and applications, vol. 20, pp. 20-27, 2000.
[18] E. Flekkøy, S. McNamara, K. Måløy, J. Feder, and G. Wagner, “Hybrid Models: Bridging Particle
and Continuum Scales in Hydrodynamic Flow Simulations,” in Novel Methods in Soft Matter
Simulations. vol. 640, M. Karttunen, A. Lukkarinen, and I. Vattulainen, Eds., ed: Springer Berlin
Heidelberg, 2004, pp. 190-218.
[19] X. B. Nie, S. Y. Chen, W. N. E, and M. O. Robbins, “A continuum and molecular dynamics hybrid
method for micro- and nano-fluid flow,” Journal of Fluid Mechanics, vol. 500, pp. 55-64, Feb 10
2004.
Jun XU*, Xiao-yong LI
Achieve High Availability about Failover in Virtual
Machine Cluster
Abstract: Cloud computing has recently emerged as the next-generation solution for
hosting and delivering services over the internet. Compute power is consumed as a service,
just like electricity, and this brings the need for availability and reliability. In
a traditional environment, physical machines are the basic nodes of a cluster, but in
the era of cloud computing virtualization becomes the fundamental technology, which means
that virtual machines play the basic role in a cloud computing environment. Unlike
physical machines, virtual machines can easily be bulk-copied, migrated, recovered and
so on. Based on these features, failover can be achieved. In this paper, we present three
ways to handle node-failure scenarios: evacuation, fencing and live migration.

Keywords: Cloud Computing; High Availability; Virtual Machine Cluster; Failover

1 Introduction

Cloud computing [1-3] is a type of Internet-based computing that provides shared


computer processing resources and data to computers and other devices on demand. It
is a model for enabling ubiquitous, on-demand access to a shared pool of configurable
computing resources (e.g. computer networks, servers, storage, applications and
services), which can be rapidly provisioned and released with minimal management
effort [4].
Virtualization is a foundational element of cloud computing and helps
deliver on its value. Virtualization is software that manipulates
hardware and makes the cloud more elastic and flexible. There are several kinds of
virtualization technology, such as KVM, Xen, VMware and Hyper-V. KVM and Xen are
open source, and the experiments in the following sections are based on them.
KVM (for Kernel-based Virtual Machine) is a full virtualization solution for Linux
on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It is now a
loadable kernel module in the mainline Linux release.

*Corresponding author: Jun XU, Key Laboratory of Trustworthy Distributed Computing and Service
(BUPT), Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, China,
E-mail: x943401153@163.com
Xiao-yong LI, Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of
Education, Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China. Beijing,
China

Xen supports a form of virtualization known as paravirtualization, in which guests


run a modified operating system. The guests are modified to use a special hypercall
ABI, instead of certain architectural features. It can achieve high performance through
paravirtualization [5].
No matter what kind of virtualization is used in a cloud, the QoS of the cloud
service must be ensured for users, so high availability is a basic requirement.
When speaking of high availability, we usually mean the “Carrier Grade Standard”:
measured availability is above 99.999% [6,7].
Availability = (Service Time−Downtime) * 100/Service Time
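As a trivial worked example of this formula (the 99.999% threshold corresponds to roughly 5.26 minutes of downtime per year):

    def availability(service_time, downtime):
        """Availability in percent, following the formula above."""
        return (service_time - downtime) * 100.0 / service_time

    # e.g. 5 minutes of downtime over one year of service
    print(availability(365 * 24 * 60.0, 5.0))   # ~99.99905, just above the carrier-grade 99.999%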
However, cloud computing is still maturing and there are still many issues. For
instance, issue 1: how to recover a failed node if the host on which the VMs run crashes
suddenly? Issue 2: how to prevent a failed node in a cluster from corrupting the shared
storage data? Issue 3: how to guarantee service quality with extremely little downtime
if a virtual server goes down?
In this paper, three experiments were carried out to address these three issues respectively:
Nova evacuate for issue 1, virtual fencing for issue 2, and live migration for
issue 3. All three methods meet the failover requirement; in this paper,
failover refers to evacuation, fencing and live migration.
The rest of the paper is organized as follows: Section II surveys the related
work on nova evacuate, virtual fencing and live migration. Section III describes the
implementation of each experiment. Section IV includes the evaluation.

2 Related Work

2.1 Evacuate

The OpenStack [8] community has become one of the fastest-growing open source
communities in the world. OpenStack is typically deployed as an infrastructure-as-
a-service (IaaS) platform. Nova is the compute module of OpenStack; it supports KVM,
VMware, Xen and Hyper-V together with Linux container technologies such as LXC.
Evacuate is a Nova API that rebuilds a VM on another host when the VM's
original host is down. The VM instance requires shared storage or a block storage
volume; otherwise, the previous disk cannot be accessed by other compute nodes.
Another requirement is that the VM must be in an operational status, or the operation will end in
failure. Evacuate rebuilds the instance from the original image or volume and
preserves the original configuration, which includes the instance ID, name, UUID, IP
address and so on. This technology is therefore very suitable for host failure recovery.
Masakari [9] is an open source project aimed at virtual machine high availability.
It rescues a KVM-based VM from the following failure events:
–– VM process down – restart the VM (using the nova stop API and nova start API)

–– Provisioning process down – restart the process or change the nova-compute service
status to maintenance mode (using the nova service-disable API)
–– Nova-compute host failure – evacuate all the VMs from the failed host to a reserved
host (using the nova evacuate API)

Figure 1 shows the architecture of Masakari.

Figure 1. The architecture of Masakari

As Figure 1 shows, there are four daemon processes that make the recovery
feasible. Masakari-controller runs on a controller node and processes
failure notifications. Masakari-instancemonitor runs on the compute node and
detects VM process failures. Masakari-processmonitor detects failures of critical control
processes on the nova compute node. Masakari-hostmonitor detects host failures.
The Masakari DB keeps all needed notification information. Once a failure is detected, Masakari
calls the corresponding API and the recovery process starts.

2.2 Fencing

Fencing is the component of cluster project that cuts off access to a resource (hard
disk, etc.) from a node in your cluster if it loses contact with the rest of the nodes in
the cluster [10].
For a high-availability cluster, fencing is the process of isolating a node of
a computer cluster or protecting shared resources when a node appears to be
malfunctioning. There are three methods of fencing in a physical environment: power,
network and configuration. For a VM, powering it off is an easy and efficient way.

Fencing is a useful solution when a cluster is suffering from split-brain status.


Split-brain indicates data or availability inconsistencies originating from the
maintenance of two separate data sets with overlap in scope, either because of servers
in a network design, or a failure condition based on servers not communicating and
synchronizing their data to each other [11].
Figure 2 shows the fencing using virtual machine fencing agent.


Figure 2. Fencing virtual machines

As Figure 2 shows, virtual machine fencing is based on the common low-level
daemon fence_virtd. The host and VMs run the fence_virtd daemon in the
background, and control messages are dispatched via this daemon. The fencing agent plays
the important role of controlling all the VMs that are connected to the same network
environment. Once a VM failure is detected, the agent can turn it off directly.

2.3 Live Migration

Live migration [12] is the movement of a virtual machine from one physical host to
another while it remains continuously powered up. It is often adopted for fault tolerance and
load balancing: we can quickly live-migrate a VM to another host if it is
malfunctioning. There are two main ways to perform the migration, pre-copy and
post-copy, which suit different scenarios. In pre-copy live migration,
memory is transferred first and the VM is then allocated on the destination; however, dirty
memory pages are copied over and over. In post-copy live migration, memory is transferred
only after the VM has been allocated on the destination.
DRBD (Distributed Replicated Block Device) [13] is chosen as the replicated device.
DRBD ensures real-time replication on another host via network synchronization. Figure 3
shows its architecture.

Figure 3. Architecture of DRBD

As we can see from Figure 3, DRBD maintains a real-time replica on another host
and ensures that the primary node can be read and written while the secondary node
is read-only. It cooperates well with Remus, a fault-tolerance feature of the Xen technology.
Remus uses DRBD as its disk device, so we can use Remus to migrate a VM in a very short time
thanks to DRBD's real-time replication.

3 Implementation

3.1 Experiment 1 for evacuate

In this experiment, a three-node cluster was deployed: one node is the controller node and the
other two are nova compute nodes. Figure 4 shows how a failed node's instances
evacuate to another, reserved host. The three members form a virtual
OpenStack cluster in which the controller node detects the status of each compute node while
virtual nodes run on the other two nodes. There is a virtual Node1 on Compute1, which can
evacuate from Compute1 to Compute2 in this scenario, so three nodes are enough for
this experiment.
We use Vagrant (a tool for deploying virtual environments) to deploy the three-
member cluster. The controller node runs the core OpenStack services such as Nova,
Neutron and Cinder, while the two compute nodes run the Nova service controlled by
the controller node. First, OpenStack is deployed to this cluster. Next, Masakari
performs the following steps in order (see the sketch after this list):
–– Add Compute2 as a reserved host in the controller node and disable the nova
scheduler of Compute2, so that Compute2 becomes a reserved host for evacuation.
–– Create a new instance named Node1 on Compute1.
–– Emulate a failure by stopping Compute1 suddenly; the controller node will be
notified of the failure of Compute1 by the hostmonitor service.

–– The controller calls the nova API to evacuate Node1 from Compute1 to Compute2;
the nova scheduler on Compute2 will automatically start, and Node1 will be recreated on Compute2
after a while.
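
A minimal sketch of the recovery step is shown below, assuming the legacy nova CLI is installed and OpenStack credentials are loaded in the environment; Masakari implements this logic in its monitors, so this only illustrates the calls involved (CLI flag spellings vary between novaclient versions):

    import subprocess

    def evacuate_instance(instance, target_host):
        """Rebuild an instance on the reserved host after its original host failed.

        Requires shared storage so the original disk is reachable from target_host.
        """
        # re-enable the scheduler on the reserved compute node (it was disabled earlier)
        subprocess.check_call(["nova", "service-enable", target_host, "nova-compute"])
        # ask nova to rebuild the instance on the reserved host, reusing the shared disk
        subprocess.check_call(["nova", "evacuate", "--on-shared-storage", instance, target_host])

    # emulate what the controller does once hostmonitor reports the failure of Compute1
    evacuate_instance("Node1", "Compute2")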


Figure 4. Three nodes cluster

3.2 Experiment 2 for fencing

In this experiment, two kinds of scenarios were tested. In the first, all VMs including the
fence agent run on one host; in the second, the fence agent is on a different host
from the other VMs. These two scenarios are almost the same except for the network
mode: the VMs on the two hosts must be in the same intranet. The setup steps are:
–– Install the fence_virtd libraries first
–– Generate a key file and distribute it to every host and VM
–– Turn on the multicast querier of the host
–– Configure the firewall policies of the host and the VMs

In this scenario, two physical machines are required as well as at least two virtual
nodes. One virtual node plays the role of the fencing agent, which can perform fencing actions
on the other VMs running on the two physical machines. The experiment can therefore be
carried out with two physical machines and at least two virtual nodes running on
them.

After all these steps are done, the fence_xvm command is available on every node
including the host. Several fencing actions are supported, such as list, on, off, reboot, status
and metadata. The fencing agent thus controls the life cycle of the other VMs.
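
A thin wrapper around the fence_xvm actions listed above makes the agent scriptable; this is a sketch whose option spellings follow the common fence_xvm usage of -o for the action and -H for the target domain, and may differ between versions:

    import subprocess

    def fence_action(domain, action="status"):
        """Run a fence_xvm action (list, on, off, reboot, status, metadata) against a VM."""
        cmd = ["fence_xvm", "-o", action]
        if action != "list":            # the list action does not target a single domain
            cmd += ["-H", domain]
        return subprocess.run(cmd, capture_output=True, text=True)

    # e.g. power off a node that stopped responding, then check its status
    fence_action("vm3", "off")
    print(fence_action("vm3", "status").stdout)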

3.3 Experiment 3 for live migration

In this experiment, a live migration process was demonstrated. We tested two scenarios:
one is live migration without Remus, the other is Remus using DRBD as its disk device.
Xen is adopted as the virtualization technology.
Two physical machines are required in this scenario: one hosts the running VM,
and the other is the destination the VM migrates to, so two physical machines are sufficient.
The following steps should be performed on both hosts:
–– Configure the network bridge and add the hosts in /etc/hosts
–– Compile the Xen source code into the Linux kernel and then restart
–– Install the DRBD utilities on both hosts and set one of them as the primary node (the
other becomes the secondary node automatically)
–– Edit the VM's configuration file and set its disk to DRBD
–– Run the remus command to maintain a shadow copy on the other host

Remus ensures fault tolerance by triggering a migration of the active virtual machine
to a second host at intervals of 200ms. The server buffers all the network connections
in this 200 ms and holds back the packets. Remus maintains a shadow copy; in this
case, DRBD performs the disk replication instead of Remus itself.

4 Evaluation

All three experiments were tested on two hosts. The two host machines are
equipped with Xeon E5-2620 v2 CPU, 40GB of RAM and 300GB of storage volumes.

4.1 Result 1 for evacuate

The experiment environment is CentOS 6.4, and Vagrant is used to build the three-member
cluster automatically. The result is shown below.
As Figure 5 shows, the instance originally on Compute1 finally evacuates to
Compute2 after the create and evacuate actions.

Figure 5. The result of evacuate

4.2 Result 2 for fencing

This experiment is tested on two hosts with a total of 5 VMs running CentOS 6.5.
We choose one of them as the fence agent. The result is shown in
Figure 6.

Figure 6. Fencing result

As Figure 6 shows, we can list the virtual machines in the cluster or run other commands
to operate them. Once a node in the cluster fails to respond to the fence agent, the fence
agent will fence it according to a predefined strategy. Thus, the cluster can stay in a
healthy state without corrupting the shared storage.

4.3 Result 3 for live migration

This experiment is tested on two hosts with the Xen kernel installed.
Once Remus has started up, the VM will exist on both the primary and the secondary.
If you run “xm list” or “xl list” on the secondary, you should see the VM sitting in the
p (paused) state, consuming no CPU unless the primary fails.

Figure 7. The status on secondary node

As shown in Figure 7, the secondary node is running as a shadow copy
consuming no CPU. Once the VM on the primary node fails, Remus will detect the failure
and live-migrate it to the secondary node, where its status will change to the r (running) state.

Table 1. Live migration time

Round            1     2     3     4     5     6
Non-Remus (ms)   3320  3642  3796  3456  3496  3675
Remus-DRBD (ms)  443   584   642   486   568   552

As Table 1 shows, Remus-DRBD is much faster than the non-Remus approach. Remus
plays an important role in fault tolerance: users cannot perceive the live migration process
when Remus-DRBD is on.

5 Conclusion

This paper introduces three kinds of issues in virtual machine clusters and cloud
computing environments and then proposes corresponding tests. All these tests
are successful and help improve availability.
The evacuate test solved the host failure scenario successfully. The fencing test realized
fencing of a node using a virtual fence agent. Meanwhile, the live migration test with Remus-
DRBD achieved a very short migration time, so it could be widely used in fault

tolerance scenarios. In short, these tests give some ideas on how to meet
high availability requirements and achieve the failover purpose.

Acknowledgment: This work was supported by the National Nature Science


Foundation of China (No. 61370069), Fok Ying Tung Education Foundation (No.
132032), Beijing Natural Science Foundation (No. 4162043), and the Cosponsored
Project of Beijing Committee of Education.

References
[1] M. Jammal, T. Singh, A. Shami, R. Asal, and Y. Li, “Software defined networking: State of the art
and research challenges,” Elsevier Computer Networks, vol. 72, pp. 74-98, 2014.
[2] H. Hawilo, A. Shami, M. Mirahmadi, and R. Asal, “NFV: state of the art, challenges, and
implementation in next generation mobile networks (vEPC),” IEEE Network Magazine, vol. 28,
no. 6, pp. 18-26, December2014.
[3] ITU, “Cloud Computing Benefits from Telecommunication and ICT Perspectives,” http://www.
itu.int/dmspub/itu-t/opb/fg/T-FG-CLOUD-2012-P7-PDF-E.pdf, February 2012. [September 14,
2014]
[4] https://en.m.wikipedia.org/wiki/Cloud_computing#Infrastructure_as_a_service_.28IaaS.29
[5] https://en.wikipedia.org/wiki/XenM. Young, The Technical Writer’s Handbook. Mill Valley, CA:
University Science, 1989.
[6] Vargas, E. (2000). “High availability fundamentals.” Sun Blueprints series.
[7] Tam, F. (2009). Service Availability Standards for Carrier-Grade Platforms: Creation and
Deployment in Mobile Networks. Doctoral Thesis. Tampere: University of Tampere
[8] https://en.wikipedia.org/wiki/OpenStack
[9] https://github.com/ntt-sic/masakari
[10] https://fedorahosted.org/cluster/wiki/FAQ/Fencing
[11] https://en.wikipedia.org/wiki/Split-brain_(computing)
[12] https://en.wikipedia.org/wiki/Live_migration
[13] https://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device
Kai-peng MAO, Shi-peng XIE*, Wen-ze SHAO
Automatic Segmentation of Thorax CT Images with
Fully Convolutional Networks
Abstract: Automatic segmentation of thorax CT images plays an important role in
quantitative analysis for large-scale studies of health and disease of thorax organs,
and it is a challenging task because it is related to judging the state of the organs. In this
paper, we propose to use fully convolutional networks for the automatic segmentation
of tissue classes. This method, trained end-to-end, can take input of arbitrary size
and produce correspondingly sized output, which simplifies the CT image
segmentation procedure. Although it does not depend on hand-engineered features, the method can
learn from the training data to recognize the information that is important for the
automatic segmentation. We define the model of fully convolutional networks and
explain its application to spatially dense prediction tasks. The segmentation
architecture was evaluated on 50 CT data sets scanned from 50 different people.
The images of our dataset were segmented into 3 classes: skin, lung and trachea & bronchi,
achieving Dice coefficients of 0.97, 0.96 and 0.67 respectively. The segmentation
results show that the method achieves accurate segmentation for the three classes and is
robust across different thorax CT images.

Keywords: CT image segmentation; fully convolutional networks; CT images; deep learning

1 Introduction

Accurate medical image segmentation is an important prerequisite for judging whether
organs are in a healthy or diseased state. It is widely accepted that the segmentation
of thorax medical images differs from that of color images with RGB channels,
which makes general image segmentation methods unsuitable.
Medical images often have lower tissue contrast and more image
voxels, which makes the segmentation very challenging. In this work, we focus on the
segmentation of CT images, which contain only grayscale intensity values, and segment them
into three different parts: skin, lung and trachea & bronchi.

*Corresponding author: Shi-peng XIE, College of Telecommunications and Information Engineering,


Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China,
E-mail: xie@njupt.edu.cn
Kai-peng MAO, Wen-ze SHAO, College of Telecommunications and Information Engineering, Nanjing
University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China

Segmentation of CT images is a kind of visual recognition task: images are segmented
into parts to facilitate further analysis of the health status of organs. Visual recognition
has always been a hot research area, especially with the transmission of multimedia over
the Internet. Traditional image segmentation methods rely on hand-engineered features,
such as HOG and SIFT, extracted from regional blocks, but these hand-engineered features
lack robustness to the diversity of images. Region proposal methods,
including selective search [7] and edge boxes, greatly reduce the time complexity
of the follow-up operations in image detection, and the candidate windows they produce
are of higher quality than those of the traditional sliding window. After the breakthrough of
convolutional neural networks in image classification, the combination of region proposals
and CNNs increasingly became mainstream in visual recognition. R-CNN [8] and the
successive fast R-CNN [9] and faster R-CNN [10] improve the efficiency and accuracy
of image detection. However, when these models are adapted to the image segmentation task,
they cannot be trained end-to-end.
CNNs have shown great success in several computer vision tasks including object
recognition, object localization and object detection. In the history of the development
of convolutional neural networks there is a famous challenge, ImageNet, which
accelerated the development of deep learning. More and more theories and tricks have been
designed to obtain higher classification accuracy, and the recent ResNet [4] has decreased
the error rate to 4.94%, which means it can identify objects in images more
accurately than humans. In recent years, CNNs have been increasingly used in medical
image processing. Compared with traditional machine learning models, a CNN
does not need manually designed features such as HOG or SIFT; instead, it can
train unique convolution kernels for specific classification problems. The process,
much like a child's cognitive process, automatically extracts information such as
intensity, spatial structure and color from the given data, resulting in no demand for hand-
crafted features. In the field of medical image segmentation, CNNs also have many
applications. Zhang et al. [1] use 2D patches of a single size as input for a CNN to
segment MR brain images into white matter, gray matter and cerebrospinal fluid. De
Brebisson et al. [2] use multiple parallel networks over 2D patches in orthogonal planes
and at different distances to segment T1-weighted MR brain images.
In this paper, we propose to segment thorax CT images into three different tissue
classes, skin, lung and trachea & bronchi, using a ConvNet. An overview of our method
is given in Figure 1. Unlike previous work that takes a superpixel as input and produces the
pixel category for the center of that superpixel, the proposed method trains an end-
to-end model with arbitrarily sized images as inputs and correspondingly segmented
images as outputs. Furthermore, unlike previous fully convolutional networks
[6,11] for Pascal image segmentation that are fine-tuned from AlexNet [3], VGG [13]
and GoogLeNet [17], this method is adapted to the segmentation of less-featured CT
images and proposes a model with fewer layers than FCN [11]. This architecture allows
automatic learning of multi-scale spatial and grayscale features for CT images. The

method contains numerous parameters which are trained from our pre-prepared
training set. In our work, we change the size of each convolution kernel and compare
the Dice coefficients of the segmented labels. We find that segmentation performance
increases when the kernel size increases, which is consistent with the fact that the size of
the convolution kernel in a CNN model determines the amount of context information.
This paper is organized as follows. Section 2 describes the training and validation
data set that were used and detailed description of the method including the
parameter setting and principle behind. Section 3 shows the evaluation experiments
and section 4 shows the results of these experiments. Finally, section 5, the last part,
discusses the results.

2 Data and Method

The method was applied to the segmentation of CT images from 50 CT scans
of 50 different people. Every scan produces an image volume of size 512 × 512 × n (n is 55
± 5), with a 16-bit value at each pixel representing the CT number.
Before the segmentation, the CT window of the different images is adjusted so that
each tissue can be seen clearly and the images satisfy the input requirements of the deep learning
framework caffe [5,15], a popular deep learning framework based on C++ with
good Python and Matlab interfaces. To generate segmentation labels, manual editing
was carefully performed on the source CT images according to the label each pixel belongs
to. This process took days and was a really challenging piece of work. Note that each slice was
segmented as a 512 × 512 × 1 image, which can be set as the input of the model. As a result, we obtained
2905 segmented label images in all.
Deep learning [16] models are a class of machines that can learn a hierarchy
of features by building high-level features from low-level ones. Each convolution
layer has a different receptive field, and higher layers have larger receptive fields since
intermediate pooling layers decrease the size of the feature maps. In this paper, we design
a CNN architecture to segment thorax tissues based on preprocessed CT images. Next,
we take an input image of size 512 × 512 as an example to illustrate the detailed
structure of the entire CNN network. Figure 1 and Table 1 show the detailed architecture.
Overall, the segmentation architecture contains several convolution layers and one
deconvolution layer, which is the key point that keeps the network end-to-end.
In general, the first part of the network contains three convolution layers and three
pooling layers, and after each convolution layer there is a non-linear activation unit
(ReLU layer). The convolution kernels of all convolution layers are connected to all the
feature maps of the previous layer, and all convolution kernels have a stride of 1. As for
the pooling layers, all pooling layers in this paper are max-pooling
with a pooling kernel of 2 × 2 and a stride of 2 pixels. In order to make the network
computation more convenient, we keep the size of the feature map unchanged before
and after each convolution layer. The numbers of output feature maps of the three

convolution layers are 24, 32 and 48 respectively, while the kernel sizes of the three
convolution layers decrease as 9 × 9, 7 × 7 and 5 × 5 respectively. Figure 2 shows some
feature maps and convolution kernels of the trained network after the convolution layers.
The output feature map of the last pooling layer is one eighth the size of
the original image. Next, some changes are made to the typical CNN network.

Figure 1. Detailed architecture of the fully convolutional network. The size of output is the same as
the input.

Table 1. Details of architecture. "Conv." and "Deconv." denote convolution and deconvolution, respectively

                   Layer1  Layer2  Layer3  Layer4  Layer5  Layer6  Layer7  Layer8  Layer9   Layer10  Layer11
Layer type         Conv.   Pool    Conv.   Pool    Conv.   Pool    Conv.   Conv.   Deconv.  Crop     Softmax (Train)
# of feature map   24      24      32      32      48      48      256     4       4        -        -
Kernel size        9×9     2       7×7     2       5×5     2       1×1     1×1     16×16    -        -
Stride             1       2       1       2       1       2       1       1       8        -        -
Pad size           100     0       3       0       2       0       0       0       0        -        -
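
Table 1 can be expressed compactly with caffe's Python net specification; the following is a sketch under the assumption that the training data and labels come from two LMDBs, and the file paths, data layers and crop offset are illustrative rather than the exact training prototxt used in the paper:

    import caffe
    from caffe import layers as L, params as P

    def thorax_fcn(data_lmdb, label_lmdb, batch_size=5):
        n = caffe.NetSpec()
        # input CT slices (512 x 512) and per-pixel labels from two LMDBs (paths are assumptions)
        n.data = L.Data(source=data_lmdb, backend=P.Data.LMDB, batch_size=batch_size)
        n.label = L.Data(source=label_lmdb, backend=P.Data.LMDB, batch_size=batch_size)

        # three conv + ReLU + max-pool stages, following Table 1
        n.conv1 = L.Convolution(n.data, num_output=24, kernel_size=9, stride=1, pad=100)
        n.relu1 = L.ReLU(n.conv1, in_place=True)
        n.pool1 = L.Pooling(n.relu1, pool=P.Pooling.MAX, kernel_size=2, stride=2)
        n.conv2 = L.Convolution(n.pool1, num_output=32, kernel_size=7, stride=1, pad=3)
        n.relu2 = L.ReLU(n.conv2, in_place=True)
        n.pool2 = L.Pooling(n.relu2, pool=P.Pooling.MAX, kernel_size=2, stride=2)
        n.conv3 = L.Convolution(n.pool2, num_output=48, kernel_size=5, stride=1, pad=2)
        n.relu3 = L.ReLU(n.conv3, in_place=True)
        n.pool3 = L.Pooling(n.relu3, pool=P.Pooling.MAX, kernel_size=2, stride=2)

        # 1 x 1 convolutions replacing fully connected layers, then 4-class scoring
        n.fc_conv = L.Convolution(n.pool3, num_output=256, kernel_size=1)
        n.score = L.Convolution(n.fc_conv, num_output=4, kernel_size=1)

        # 8x upsampling back toward the input resolution, cropped to the input size
        n.upscore = L.Deconvolution(n.score,
                                    convolution_param=dict(num_output=4, kernel_size=16, stride=8))
        n.crop = L.Crop(n.upscore, n.data)
        n.loss = L.SoftmaxWithLoss(n.crop, n.label)
        return n.to_proto()

    with open('train.prototxt', 'w') as f:   # hypothetical output path
        f.write(str(thorax_fcn('train_data_lmdb', 'train_label_lmdb')))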



2.1 Changing the Fully Connected Layers

Typical recognition nets, including AlexNet [3], VGG [13] and GoogLeNet [17], take a
fixed-size image as input and produce an output without spatial information.
This is mainly because the fully connected layers in these networks lose the spatial
information of the feature maps. To adapt such models to the dense segmentation task,
some changes have to be made. We change the network by removing the fully connected
layers at the end, which have many parameters, and converting them into
1 × 1 convolutions. One benefit of this operation is that the number of parameters is
reduced, because convolutional layers share parameters, which in turn reduces the need for
training images. These converted layers can also be regarded as convolutions over all input
regions. Another benefit is that we can do patchwise training efficiently. From another point
of view, the fully connected layer discards the location information of the feature map,
which is negligible in an image classification task; however, if we simply dropped these
layers, our network would need extra post-processing to produce segmented images. This is
the intuitive view of the change of the fully connected layers.

2.2 Upsampling Layer

A deconvolution layer is added after the convolution layers to bilinearly upsample the
coarse outputs and obtain pixel-dense outputs. This is the vital layer for producing
images of the same size as the inputs. It is called a deconvolution operation because it
performs the opposite of a convolution operation and is equivalent to the backward
propagation of a convolution layer. Instead of convolving a filter mask with
an image, we try to infer the activations which, when convolved with the filter mask,
would yield the image; the original aim of such layers was to learn a hierarchy of features
in an unsupervised manner. After three pooling layers, our feature maps
are one eighth the size of the original images, so the direct way is to upsample the
feature maps, and deconvolution is the way we can integrate this into our fully convolutional
network to make the segmentation end-to-end. In this sense, the deconvolution can
be viewed as bilinear interpolation. In this paper, we found that embedding the up-sampling
layer in the network was faster and more efficient in training, and the weights
of the deconvolution layer can be kept fixed during training, which also meets the requirement
for up-sampling.
After the deconvolution layer, a crop layer is applied to make the outputs the
same size as the inputs. During the training phase, a softmax layer is added to perform
end-to-end training; however, the softmax layer is dropped during testing, which means
that the output of the crop layer is the segmented image.
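
Since the deconvolution layer can be viewed as bilinear interpolation, its weights are commonly initialized with (and can be fixed to) a bilinear kernel; a standard numpy construction is sketched below (an assumed initialization scheme, not necessarily the exact one used here):

    import numpy as np

    def bilinear_kernel(num_channels, kernel_size):
        """Weights of shape (num_channels, num_channels, k, k) performing bilinear upsampling,
        with each output channel upsampled from the matching input channel only."""
        factor = (kernel_size + 1) // 2
        center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
        og = np.ogrid[:kernel_size, :kernel_size]
        filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
        weights = np.zeros((num_channels, num_channels, kernel_size, kernel_size))
        for c in range(num_channels):
            weights[c, c] = filt
        return weights

    # e.g. the 16 x 16, stride-8 deconvolution of Table 1 with 4 classes
    w = bilinear_kernel(4, 16)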

3 Experiments

3.1 Preparation of Data

In our experiment, all our work is carried out with the popular deep learning
framework caffe [5]. The parameter settings of the CNN were defined in the previous
description. We use images from 40 people for training and 10 people for testing. We
did not do cross-validation, since cross-validation can be computationally expensive.
As a result, we get 2609 images for training and 296 images for testing.
Prior to the segmentation, we calculate the mean image of our training set.
Mean subtraction can be configured through settings in the protobuf file, since this
normalization is necessary to obtain a desirable training model. Then, the training set,
the testing set and their corresponding label images are converted to lmdb or leveldb files, which are
the input backends supported by caffe.

3.2 Setting of Parameters

We train by stochastic gradient descent. We use a batch size of 5, a fixed learning rate
of 0.001, a momentum of 0.9, a weight decay of 0.01, and a doubled learning rate for
biases, because the weights of the model are more sensitive than the biases. We train the
model from scratch, initializing all convolution weight parameters with a Gaussian
of zero mean and 0.01 standard deviation and all bias parameters with 0.

3.3 Metric

In this paper, the quality of the CT image segmentation is measured by the Dice coefficient,
a statistic commonly used to judge the quality of image segmentation.
The Dice coefficient is also known as the quotient of similarity (QS). The QS is
defined as

QS = \frac{2\,|A \cap B|}{|A| + |B|}

where A and B denote the binary segmentation labels of the ground truth and of the
model's result, respectively, for one individual class of a certain
subject, |A| denotes the number of elements in the binary segmentation A, and |A ∩ B|
represents the number of positive elements shared by A and B. The Dice coefficient
ranges between 0 and 1: the greater the value, the more accurate the segmentation,
and vice versa.
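
For reference, a direct numpy implementation of the Dice coefficient for one class is sketched below (label maps are assumed to be integer-coded):

    import numpy as np

    def dice_coefficient(ground_truth, prediction, class_id):
        """Dice coefficient QS = 2|A ∩ B| / (|A| + |B|) for one tissue class."""
        a = (ground_truth == class_id)
        b = (prediction == class_id)
        intersection = np.logical_and(a, b).sum()
        denom = a.sum() + b.sum()
        return 2.0 * intersection / denom if denom > 0 else 1.0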

4 Results

4.1 Training Result of Our Model

With the settings of the last section, we train our model from scratch. Training takes only
about 10 hours because of the limited number of training images. Figure 4 shows the training result of
our model; training can be finished after 10 epochs.

Figure 2. Trained convolution kernels after 10 epochs using 2609 training images. From left to right:
the test image, the 9 × 9 kernels from the first convolution layer, the image after the first
convolution layer, the four images after the second convolution layer, and the sixteen images after
the third convolution layer.

4.2 Comparison of Different CNN Architecture

Firstly, we try different CNN architectures by adding a convolution layer and changing
the convolution kernel sizes of the different convolution layers. Specifically, after the last pooling
layer, we add a convolution layer with kernel size 5 × 5, stride 1 and pad size 2,
and this layer has 64 feature maps. It should be noted that we did not add another pooling layer.
We also try the deep jet [10] in our experiments, which is the architecture that
combines the last feature maps with low-level feature maps. In this experiment, we
first replace the deconvolution layer with two different deconvolution layers: the first
deconvolution layer has a kernel size of 4 × 4 and stride 2, and the second one has a kernel
size of 8 × 8 and stride 4. Then, we add a convolution layer with 1 × 1 kernel size and
4 outputs after the second pooling layer. This convolution layer is fused with the first deconvolution
layer, and the result of this fusion is set as the input of the second deconvolution layer.
Theoretically, this yields a finer segmentation result. The segmentation performance
can be found in Figure 3 and Table 2.

4.3 Comparison with Other Methods

For comparison purposes, we devise segmentation experiments using a previous
method, the linear SVM, as other models generally perform less effectively. The

training data of the SVM model are the same as the training data of this paper. We use
a patchwise approach to train the SVM. Even though patchwise training lacks the efficiency
of fully convolutional training, we have to divide the images into patches to perform
linear SVM segmentation. The results of the SVM are shown in Table 2.

Figure 3. Segmentation result comparison of two different methods. The method designed in this
paper produces the best performance. From left column to right column: the source images, the
manually segmented images, the segmentation result of the method proposed by Long et al. [11], and the
segmentation result of the method proposed in this paper. Best viewed in color.

Table 2. Results of different methods in terms of Dice coefficients

                                        Dice coefficient
Experiments                  Skin    Lung    Trachea & bronchi    Mean
Add a convolution layer      0.95    0.75    0.24                 0.65
Deep jet architecture        0.93    0.88    0.16                 0.66
Classification with SVM      0.74    0.79    0.21                 0.58
Method proposed              0.97    0.96    0.67                 0.87

4.4 Discussion

In the above section, we compared our fully convolutional network model with several
other models that have larger or smaller changes to the convolution kernels, as well as with a
purely traditional model. The results show that our model achieves desirable segmentation
results on the test CT images. The CNN architecture integrates feature
extraction, feature representation and feature classification without post-processing
and can automatically be applied to different tasks, which simplifies the steps of image

segmentation, and the features extracted in the convolution layers of our model can
be adapted to other visual recognition tasks at hand. However, from the Dice coefficients
of the segmentation results above, we can see that our model achieved accurate
segmentations for all tissue classes except trachea & bronchi in the CT images.
This may be due to the lack of sufficiently diverse training images. In our experiment,
the training images were acquired in the same way on the same type of CT scanner, so there is
hardly any difference between them. This can be seen from the Dice coefficients in
Table 2 and Figure 3.
It is known that ConvNets have a large number of parameters and need thousands
or even millions of images to train a good model. ConvNets have most of their parameters
in the fully connected layers, while our model changes these layers to 1 × 1
convolution layers, which reduces the number of parameters greatly, so our model
has about 100 thousand parameters in all. However, we have a limited number of training images,
generated partly by augmentation techniques, and we can hardly tell the difference
between these images. Moreover, among these training images, trachea & bronchi
account for only a small part. As a result, the Dice coefficients of trachea
& bronchi on the test set are lower than those of the other parts. This is the deficiency of our model. In
the future, we will improve our model on the segmentation of trachea & bronchi with
limited training images and improve the generalization of our model.

Figure 4. Average dice coefficient of 269 test images.



5 Conclusion

In this paper, we propose a method for segmenting thorax CT images. This is
achieved by a fully convolutional network composed of several intermediate layers,
including convolution layers, pooling layers and, most importantly, a deconvolution
layer. We compare the results with those of a general machine learning segmentation method,
and the results show that the method proposed in this paper performs well. In conclusion, the
experiments in this paper qualitatively and quantitatively show the superiority of the
model in thorax CT image segmentation.
Of course, there are many areas in which this model can be improved, such as using more
context information for segmentation, or even graph-model-based segmentation that exploits
the entire image information. In future work, we will focus on these aspects
to improve the performance of the model in this article.

Acknowledgment: This work was supported by the National Natural Science


Foundation of China (Grant NO. 11547155 and NO. 61402239), National Natural Science
Foundation of Jiangsu Province (Grant NO. BK20130883, NO. BK2130868) and the
NUPTSF (NO. 214026).

References
[1] W. Zhang, R. Li, H. Deng, L. Wang, W. Lin, S. Ji, and D. Shen, “Deep convolutional neural
networks for multi-modality isointense infant brain image segmentation,” NeuroImage, vol.
108, pp. 214–224, 2015
[2] P. Moeskops, M. A. Viergever, A. M. Mendrik, L. S. de Vries, M. J. N. L. Benders, and I Isgum,
“Automatic segmentation of MR brain images with a convolutional nerural network,” IEEE
Trains. Med. Imag., vol.35, no.5, pp. 1252-1262, 2016.
[3] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, “ImageNet classification with deep
convolutional neural networks,” In NIPS, 2012
[4] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” In CVPR,
2016.
[5] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Gaudarrama, and T. Darrell,
“Caffe: Convolutional architecture for fast feature embedding,” In ACM Multimedia, pages
675-678, 2014.
[6] L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, “Semantic image
segmentation with deep convolutional nets and fully connected crfs,” In ICLR, 2015.
[7] J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders, “Selective search for object
recognition,” IJCV, 2013.
[8] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object
detection and semantic segmentation,” In IEEE CVPR, 2014.
[9] Ross Girshick, “Fast R-CNN,” In arXiv:1504.08083v2, 2015.
[10] S. Ren, K. He, R. Girshick, J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with
Region Proposal Networks,” In arXiv:1506.01497,2015.
[11] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,”
In IEEE CVPR, 2015.

[12] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake,
“Real-time human pose recognition in parts from single depth images,” In IEEE CVPR, 2011.
[13] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image
recognition,” In arXiv:1409.1556, 2014.
[14] J. J. Tompson, A. Jain, Y. LeCun, and C. Bregler, “Joint training of a convolutional network and a
graphical model for human pose estimation,” In NIPS, 2014.
[15] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darell, “Decaf: A deep
convolutional activation feature for generic visual recognition,” In ICML, 2014.
[16] Yann LeCun, Yoshua Bengio and Geoffrey Hinton, “Deep Learning,” In Nature, vol. 521, 2015.5.
[17] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A.
Rabinovich, “ Going deeper with convolutions,” In arXiv:1409.4842. 2014.
[18]  S. Xie, C. Li, H. Li, and Q. Ge, “A level set method for cupping artifact correction in cone-beam
CT,” Medical Physics, 42(8):4888-4895(2015).
[19] S.Xie and L.Luo, “Scatter correction for cone beam CT using self-adaptive scatter kernel
superposition,” Chin. Phys. C 6(12), 566–572 (2012).
Yong-jie WANG, Yi-bo WANG, Dun-wei DU, Yan-ping BAI*
Application of CS-SVM Algorithm based on Principal
Component Analysis in Music Classification
Abstract: The volume of music data has been increasing dramatically; however, the
traditional classification methods are slow and less accurate in practical
large-scale music classification. In order to improve classification accuracy,
a CS-SVM music classification model based on principal component
analysis is proposed. Firstly, music characteristics are extracted by the Mel frequency
cepstrum coefficient method; secondly, principal component
analysis is used to reduce the dimension of the characteristic signal; finally, we build a
CS-SVM classification model to classify the music. The average
correct rate of the CS-SVM algorithm based on principal component analysis
reaches 95.30%. Compared with traditional models, the CS-SVM model has higher
accuracy. The results show that this method can be effectively applied to music
classification.

Keywords: Principal Component Analysis; Cuckoo search; Support vector machine; Classification

1 Introduction

Music type classification is an important part of multimedia applications. With the
rapid development of data-storage and internet technology, the volume of
music data has grown explosively. Traditional manual retrieval can no longer
meet the demand for retrieval and classification of massive music information, so computer-based
automatic classification has been proposed. By extracting features from the audio signals,
we can determine the music category.
Music classification is essentially a pattern recognition problem, which mainly
consists of characteristic extraction and classification. At present, researchers
have proposed several methods, such as rule-based audio classification
[1], pattern matching, and BP neural networks based on characteristic
extraction. However, these methods suffer from a large amount of
computation and low classification accuracy. In order to avoid the shortcomings
of traditional methods, a CS-SVM music classification method based on principal

*Corresponding author:Yan-ping BAI, College of Science, North University of China, Taiyuan, China,
e-mail: baiyp666@163.com
Yong-jie WANG, Yi-bo WANG, College of Science, North University of China, Taiyuan, China
Dun-wei DU, Electromechanical Engineering Institute, Beijing, China
component analysis is proposed. In this method, the Mel frequency cepstrum coefficient
is used to extract the characteristic signal of the music, the characteristic vectors are then
reduced by principal component analysis, and finally a CS-SVM classification
model is built. The simulation results show that this method achieves higher classification
accuracy.

2 Principles of music classification

Music classification is essentially a pattern recognition process, hence the
classification procedure follows the general pattern recognition workflow. In
this paper, we design the music classification process using pattern recognition
theory. The basic process is as follows:
1. Collect the training and testing data for music classification.
2. Extract characteristic signals from the music data and select a classification model.
3. Train the model and determine the model parameters.
4. Test the classification performance of the model.

3 PCA-CS-SVM music classification models

3.1 Music characteristic extraction

Mel frequency cepstral coefficients (MFCCs) are features widely used in automatic
speech and speaker recognition. They were introduced by Davis and Mermelstein in
the 1980s and have been state-of-the-art ever since [2].
MFCCs are commonly derived as follows:
1. Take the Fourier transform of (a windowed excerpt) a signal.
2. Map the powers of the spectrum obtained above onto the Mel scale, using triangular
overlapping windows.
3. Take the logs of the powers at each of the Mel frequencies.
$e(j) = \log\left( \sum_{k=0}^{N-1} \omega_{jk} \cdot |S_k| \right), \quad j = 1, 2, \cdots, p$    (1)

where e(j) is the log energy output of the j-th filter; ω_{jk} is the weight of the j-th triangular filter at the k-th point; |S_k| is the amplitude of the spectrum transformed to the Mel scale; p is the number of filters, generally 24.
4. Take the discrete cosine transform of the list of Mel log powers, as if it were a
signal.
5. The MFCCs are the amplitudes of the resulting spectrum.
$x_i = \frac{2}{p} \sum_{j=1}^{p} e(j) \cdot \cos\left( \frac{i\pi}{p}(j - 0.5) \right), \quad i = 1, 2, \cdots, L$    (2)

where L is the dimension of the MFCC vector; in general L ≤ p, and L is 24 in this paper.
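As an illustration of steps 3-5, the following Python sketch computes the log filter-bank energies of Eq. (1) and the cepstral coefficients of Eq. (2) for a single frame; the power spectrum and the Mel filter-bank weights ω_{jk} are assumed to be precomputed inputs and are not specified in the paper.

import numpy as np

def mfcc_from_frame(power_spectrum, mel_filterbank, L=24):
    """power_spectrum: |S_k|, shape (N,); mel_filterbank: w_jk, shape (p, N)."""
    p = mel_filterbank.shape[0]
    # Eq. (1): log energy output of each triangular Mel filter
    e = np.log(mel_filterbank @ power_spectrum + 1e-12)
    # Eq. (2): cosine transform of the log filter-bank energies
    j = np.arange(1, p + 1)
    return np.array([(2.0 / p) * np.sum(e * np.cos(i * np.pi / p * (j - 0.5)))
                     for i in range(1, L + 1)])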

3.2 Principal component analysis

Principal component analysis (PCA) is an optimization method introduced
by Pearson in 1901. It is a multivariate statistical
method that transforms several original variables into a few principal components. The
principal component variables still retain most of the information of the original variables
after dimensionality reduction. Generally, the principal components are linear
combinations of the original variables and are linearly independent of each other [3].
Defining the data matrix as X, each row of X represents a data sample and each
column of X represents an index variable. If there are p variables X1, X2, ..., Xp, the linear
combination of variables can be formulated as:
$\begin{cases} Y_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p \\ Y_2 = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p \\ \quad\cdots \\ Y_p = a_{p1}X_1 + a_{p2}X_2 + \cdots + a_{pp}X_p \end{cases}$    (3)

The formula can be abbreviated as:

$Y_i = a_{1i}X_1 + a_{2i}X_2 + \cdots + a_{pi}X_p, \quad i = 1, 2, \cdots, p$    (4)

The coefficients of the PCA model must satisfy the following conditions:
1. $Y_i$ and $Y_j$ ($i \neq j$) are mutually uncorrelated.
2. $\mathrm{Var}(Y_k) \ge \mathrm{Var}(Y_{k+1}), \quad k = 1, 2, \cdots, p-1$
3. $a_{k1}^2 + a_{k2}^2 + \cdots + a_{kp}^2 = 1, \quad k = 1, 2, \cdots, p$    (5)

The total variance of the principal components is equal to the total variance of the
original variables. The proportion of the total variance explained by Y_i is called the contribution rate of
that principal component, and the sum of the contribution rates of the first m principal
components is called the cumulative contribution rate of the first m principal
components. The first m principal components whose cumulative contribution rate
is not lower than a threshold (usually 85%) are
extracted as input variables to realize the dimension reduction of the variables.
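A minimal sketch of this selection rule, assuming the feature matrix X holds one 24-dimensional MFCC vector per row, might look as follows; the 85% threshold and the eigen-decomposition route are choices of this sketch, not prescribed by the paper.

import numpy as np

def pca_reduce(X, threshold=0.85):
    Xc = X - X.mean(axis=0)                      # center each variable
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # sort eigenvalues in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()   # cumulative contribution rate
    m = int(np.searchsorted(ratio, threshold)) + 1
    return Xc @ eigvecs[:, :m]                   # scores of the first m principal components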

3.3 Support vector machine

The support vector machine is a learning method based on statistical learning theory, first proposed
by Vapnik. The basic idea of the SVM model is as follows: the data are mapped to a high-dimensional
space F by a nonlinear mapping function φ(X), and a decision surface (such
as (6)) is found in the high-dimensional space by linear regression, by which the classification can be
achieved [4]. In order to find the optimal decision surface, condition (7) must be introduced:

$f(x) = \omega \cdot \varphi(X) + b = 0$    (6)

$y_i(\omega \cdot \varphi(x_i) + b) \ge 1$    (7)

Here ω is the weight vector and b is the threshold value.
Because a linear decision surface cannot always classify the data correctly,
a slack variable and a penalty factor are added to ensure the correctness of the
classification [5]:

$\min \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{n}\varepsilon_i$    (8)

subject to the constraint conditions:

$y_i(\omega \cdot \varphi(x_i) + b) \ge 1 - \varepsilon_i, \quad \varepsilon_i \ge 0, \quad i = 1, 2, \cdots, n$    (9)

In the formula, C is the penalty factor (C > 0) and ε_i is the slack variable.


In order to avoid the course of dimensionality, we use the kernel function k(xi,yi)to
replace the vector inner product (φ (xi),φ(yi)) in nonlinear partition. Using Lagrange
multiplier algorithm, the problem can be transformed into the next formula:
1 n n

=
∑ i ji j i
∂ ∂ y
min
2 i , j 1 =i 1
y ϕ ( x ) ϕ ( x j ) + (
∑ ∂i ) (10)

and (10) meets these conditions:


n

∑ ∂i yi 0
= (C > ∂i > 0)
i =1 (11)

In the formula, the point ( ∂ i > 0 ) is called support vector.


A mathematical model SVM for the classification of characteristic signal is
constructed.
 n 
f ( x ) =sign  ∑ ∂ i yi k ( xi , x j ) + b  (12)
 i , j =1 
In this paper, the radial basis function is selected as the kernel function of the SVM; it is
defined as follows:

$k(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\delta^2} \right)$    (13)

In the formula, δ is the width parameter of the radial basis kernel function.
The selection of the penalty factor C and the width parameter δ has a
great influence on the accuracy of SVM classification [6]. If C is too small, the
penalty is too weak and the error is large; if C is too large, the generalization
ability is poor [7]. δ determines the mapping from the low-dimensional space to the high-dimensional
space, which directly affects the complexity of the decision surface.

3.4 Cuckoo search algorithm

Cuckoo Search (CS) is a swarm intelligence optimization algorithm proposed by Yang
in 2009. It is based on the brood-parasitic reproductive strategy of cuckoo populations and the
Lévy flight behaviour of birds and fruit flies. The algorithm converges quickly, has
few parameters and is easy to implement [8]. It has been successfully used in engineering
optimization, design optimization and other practical problems. In order to
simulate the nest-searching behavior, the CS algorithm adopts the following three basic
rules [9]:
1. Each cuckoo lays one egg at a time and dumps it in a randomly chosen nest.
2. The best nests with high-quality eggs (solutions) are carried over to the next
generation.
3. The number of available host nests is fixed, and a host can discover an alien egg
with a certain probability. In this case, the host bird either throws the egg away or
abandons the nest and builds a completely new nest in a new location.
Based on these three rules, when generating a new solution $x_i^{(t+1)}$ for cuckoo i, a
Lévy flight is performed:

$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\lambda), \quad i = 1, 2, \cdots, n$    (14)

In the formula, α represents the step size control factor, the product ⊕ means entry-wise
multiplication, and L(λ) represents a Lévy random search path.
After the position is updated, a random number r is generated (0 < r < 1). If
r > P_a, $x_i^{(t+1)}$ is changed randomly; otherwise it is left unchanged. Finally, the group of nest
locations with the best test values is retained.
The basic CS algorithm uses Lévy flights to generate the step size, which is
random, lacks adaptability and cannot ensure fast convergence. In order to balance
global search ability and accuracy, the step size is adjusted adaptively in this paper:
$step_i = step_{min} + (step_{max} - step_{min}) \cdot d_i$    (15)

In this formula, step_max is the maximum step, step_min is the minimum step, and

$d_i = \frac{\left| n_i - n_{best} \right|}{d_{max}}$    (16)

In the formula, n_best is the best nest position at time t, n_i is the location of nest i, and d_max is the maximum distance between the best nest location and the other nests.

3.5 PCA-CS-SVM model music classification process

1. Collect music classification data for training and testing; the characteristic
signals of the music data are extracted by MFCCs and labelled.
2. Reduce the dimension of the characteristic signal by PCA.
3. According to experience, determine the ranges of C and δ; determine step_max, step_min and
the maximum number of epochs N_max.
4. Initialize the probability parameter P_a = 0.75 and randomly generate N nest
locations. The cross-validation error on the training set is calculated to find the
current optimal nest location and the corresponding minimum error.
5. Keep the best nest position, i.e. the one whose error was minimal in the last generation.
6. Update the bird nests by Lévy flights, with the step calculated by
formula (15). This yields a new set of nest locations, whose errors are then calculated.
7. According to the classification error, compare each new nest position with
the position of the last-generation nest. If the new nest position is better, the last generation
is replaced by the new nest position; otherwise, the last-generation nest is
retained.
8. Compare P_a with the random number r: nests discovered with low
probability are kept, while nests discovered with high probability
are changed randomly and the corresponding classification errors are calculated.
By comparing the classification error at each nest position and replacing the worse nest
positions with those of smaller error, a new set of better nests is obtained.
9. The best nest $x_b^t$ is found in the last step. Then we judge
whether the minimum error meets the accuracy requirement. If the accuracy is
satisfied, the iteration stops, and the global minimum error and the corresponding
best nest are output. Otherwise, return to step 4.
10. The test set is classified with the SVM whose C and δ are determined according to
the best nest location $x_b^t$. The music classification model is thus built and
used to predict the classification of the test set; a simplified sketch of this search loop follows.
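The following Python sketch illustrates the search loop of steps 3-10 under simplifying assumptions: the cross-validation error function cross_val_error(C, delta), the search ranges and the fixed Lévy step scale are hypothetical, and the adaptive step of Eqs. (15)-(16) is omitted for brevity.

import math
import numpy as np

def levy_step(beta=1.5, size=2):
    # Mantegna's algorithm for Levy-distributed step lengths
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2) /
             (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.normal(0, sigma, size)
    v = np.random.normal(0, 1, size)
    return u / np.abs(v) ** (1 / beta)

def cs_svm_search(cross_val_error, bounds, n_nests=20, pa=0.75, max_iter=50):
    low, high = np.array(bounds).T               # bounds = [(C_min, C_max), (delta_min, delta_max)]
    nests = np.random.uniform(low, high, (n_nests, 2))
    errors = np.array([cross_val_error(C, d) for C, d in nests])
    for _ in range(max_iter):
        best = nests[errors.argmin()]
        for i in range(n_nests):                 # Levy-flight update, Eq. (14)
            cand = np.clip(nests[i] + 0.01 * levy_step() * (nests[i] - best), low, high)
            err = cross_val_error(*cand)
            if err < errors[i]:
                nests[i], errors[i] = cand, err
        replace = np.random.rand(n_nests) > pa   # nests discovered with high probability
        nests[replace] = np.random.uniform(low, high, (int(replace.sum()), 2))
        errors[replace] = [cross_val_error(C, d) for C, d in nests[replace]]
    return nests[errors.argmin()]                # best (C, delta) for the final SVM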

4 Simulation experiment

In order to verify the classification performance of the PCA-CS-SVM model, the PCA-CS-SVM
model is used to classify four different music signals: folk song, zither, pop and
rock. For each piece of music, MFCCs are used to extract 500 groups of 24-dimensional
characteristic signals. At the same time, the characteristic signals are labelled
1, 2, 3 and 4. The extracted signals are stored in a database file; each group of data
has 25 dimensions, the first dimension being the category identifier and the remaining 24
dimensions the characteristic signal. The four sets of data are combined into one set, and
its dimension is reduced by PCA. From the result, we find that the cumulative
contribution rate of the first 17 principal components is 86.36%, so the first 17 components
are selected as the experimental input. 1500 sets of data are randomly selected as the
training data, and the remaining 500 groups are used to test the classification performance of
the model. The desired output of each group is set according to the characteristic signal
class; for example, if the identifier is 1, the desired output is set to the vector [1, 0, 0, 0].
The test data are classified by the trained PCA-CS-SVM model. The classification
error of the model is shown in Figure 1, the prediction result of the model is shown
in Figure 2, and the classification accuracy is shown in Table 1.
Furthermore, in order to verify the superiority of the model, we conduct three further sets
of experiments: the BP neural network, the additional-momentum BP algorithm and the
PCA-SVM algorithm. From the results we know that the music classification algorithm based
on the PCA-CS-SVM model has higher accuracy and efficiency than the other models in
music classification, as can be seen in the table.

Figure 1. PCA-CS-SVM model classification error



Figure 2. PCA-CS-SVM forecast (forecast vs. actual classification)

Table 1. Model classification accuracy (%)

Music signal   PCA-CS-SVM   PCA-SVM   BP      AMBP
Folk song      91.43        89.00     78.29   84.26
Zither         99.90        99.12     94.00   94.35
Pop            93.50        81.29     68.12   88.32
Rock           96.36        98.86     90.46   88.50

5 Conclusions

Automatic music classification provides a new way to handle large-scale
music classification. Based on the characteristics of the music signal extracted by MFCCs,
the PCA-CS-SVM model is used to classify music; the average classification accuracy
reaches 95.30%, and in particular the zither classification accuracy reaches 99.90%.
The experiments show that this method is reasonable and feasible.

Acknowledgment: This work was supported by the National Natural Science Foundation of China (61275120).



References
[1] S G Mallat. “A Theory for Multire solution Signal Decomposition: the Wavelet Representation,”
IEEE Trans Pattern Analysis and Machine Intelligence,vol.11,Jul.1989,pp.674-693.doi:
10.1109/34.192 463.
[2] S. Molau. “Computing Mel-frequency cepstral coefficients on the power spectrum,” IEEE
International Conference on Acoustics,vol.8,May.2001,pp.73-76.doi: 10.1109/ICASSP.2001.940
770.
[3] R Bro, AK Smilde. “Principal component analysis,” Analytical Methods,Mar. 2014, pp.433–459.
doi: 10.1039/C3AY41907J.
[4] Astorino A,Gorgone E,Gaudioso M,et al. “Data preprocessing in semi - supervised SVM classi-
fication,” Optimization, vol.60, Jan. 2011, pp. 143-151. doi: 10.1080/02331931003692557.
[5] Horng M H. “Performance evaluation of multiple classification of the ultrasonic supraspinatus
images by usingML, RBFNN and SVM classifiers,” Expert Systems with Applications,vol.37,Nov.
2010,pp. 4146-4155. doi: 10.1016/j.eswa.2009.11.008.
[6] XS Yang, S Deb. “Engineering Optimisation by Cuckoo Search,” Mathematical Modelling and
Numerical Optimisation, vol.1,May.2010,pp.330-343. doi: 1005.2908v3.
[7] Xin-She Yang, Suash Deb. “Multiobjective cuckoo search for design optimization,”
Computers &OperationsResearch,Vol.40, Jun. 2013, pp. 1616-1624,doi:10.1016/j.
cor.2011.09.026.
[8] Zhang Xiaoli, Chen Xuefeng, He Zhengjia. “An ACO-based algorithm for parameter
optimization of support vector machines,” Expert Systems with Applications, vol.37,Sep.2010,
pp.6618-6628.doi: 10.1016/j.eswa.2010.03.067.
[9] Wu C H, Ken Y, Huang T. “Patent classification system using a new hybrid genetic algorithm
support vector machine,”Applied Soft Computing,vol.10,Sep.2010,pp.1164-1177.doi: 10.1016/j.
asoc.2009. 11.033.
Si-wen GUO*, Yu ZUO, Tao YAN, Zuo-cai WANG
A New Particle Swarm Optimization Algorithm Using
Short-Time Fourier Transform Filtering
Abstract: A new particle swarm optimization algorithm based on the short-time
Fourier transform (hereafter called SFPSO) is proposed to address the
problem that the standard PSO algorithm easily falls into local optimal solutions.
Firstly, a supervision condition on population diversity is added to the standard
PSO algorithm, so that a filtering process in the frequency domain is triggered when the
population diversity drops below a given threshold. Secondly, the particles within a certain
radius are processed by the short-time Fourier transform, and a Butterworth low-pass
filter with minimum dispersion is used to weaken the currently found extremum.
Finally, the particles are subjected to the force of a radiation function and re-enter
other regions of the space to search. Compared with the SPSO, QPSO, Charged PSO and Genetic
PSO algorithms, the results indicate the superiority of the SFPSO algorithm.

Keywords: Particle Swarm Optimization (PSO); Short-Time Fourier Transform (STFT); Population Diversity; Cutoff Frequency; Multi-modal Function

1 Introduction

Particle swarm optimization (PSO) [1] has been widely used in digital image
processing, artificial intelligence and automatic control systems [2]. It is a population-based
optimization algorithm inspired by bird flocking. The PSO algorithm starts
with a random initialization of particles in the solution space. Each particle is
given a random position and a random velocity at the beginning, and then
adapts its search pattern based on its own experience and the experience of other
individuals.
However, the standard PSO algorithm still has many shortcomings that need
in-depth study. F. van den Bergh proved that PSO cannot guarantee
convergence to the global optimal solution [3]. There has been a series of improvements
focusing on different model structures of the swarm. A limited-information
particle swarm optimization has been used for topology optimization [4].

*Corresponding author: Si-wen GUO, School of Computer Science and Educational Software
Guangzhou University, Guangdong Provincial Engineering Technology Research Center for
Mathematical Education Software, Guangzhou, China, E-mail:gzgsw_100@aliyun.com
Yu ZUO, Institute of Mathematics and Computer Science, Guizhou Normal College, Guiyang,China
Tao YAN, Zuo-cai WANG, Chengdu Institute of Computer Applications, Chinese Academy of Science,
Chengdu, China

A new Inverse PageRank particle swarm optimizer was proposed by Cesare et al. [5], and
[6] describes a hybrid method based on the genetic algorithm and
PSO (GA-PSO). Inspired by these works, we propose a particle swarm optimization
with short-time Fourier transform filtering (SFPSO) in order to improve
the performance of the standard PSO.

2 SFPSO Algorithm

2.1 Short-time Fourier transform

The short-time Fourier transform (STFT) is a Fourier-related transform used to
determine the sinusoidal frequency and phase content of local sections of a signal
whose statistical characteristics vary over time [7]. In essence, the
STFT extracts successive frames of the signal to be analyzed with a window that moves
with time. If the time window is sufficiently narrow, each extracted frame can be
viewed as stationary so that the Fourier transform can be applied. As the window moves
along the time axis, the relation between frequency content and time can be
identified. Mathematically, this process can be expressed as follows:

$\mathrm{STFT}\{x(t)\}(\tau, \omega) \equiv X(\tau, \omega) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$    (1)

where w(t) is the window function and x(t) is the signal to be transformed.
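In discrete time, Eq. (1) amounts to windowing successive frames of the signal and Fourier-transforming each frame. A minimal Python sketch is given below; the frame length, hop size and window type are illustrative choices only.

import numpy as np

def stft(x, frame_len=256, hop=128):
    window = np.hamming(frame_len)                       # w(t - tau) in Eq. (1)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window      # windowed excerpt of x(t)
        frames.append(np.fft.rfft(frame))                # Fourier transform of the frame
    return np.array(frames)                              # rows: time shifts, columns: frequencies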

2.2 The basic idea of the algorithm

In the initial state, the PSO algorithm has a high degree of diversity. As the search
proceeds, the decline of diversity indicates that the PSO algorithm has great
difficulty leaving local optima; the clustering of particles leads to low diversity
and premature convergence [8].
The proposed method effectively alleviates premature convergence and improves
the weak exploitation capacity of the PSO algorithm. Firstly, a diversity measure is added
to the PSO algorithm for analysis and guidance. Through frequency-domain
filtering in a certain range based on the short-time Fourier transform, the peak around the best
particle of the current cluster is clipped. Then, all particles are subjected to the
force of the radiation function so that they gain enough speed to enter other regions of the space
for searching. The process of the SFPSO algorithm is shown in Figure 1.

Figure 1. The process of the SFPSO algorithm (step 1: STFT filtering around Pbest; step 2: radiation direction away from Pbest)

2.3 Algorithmic process of SFPSO

Program SFPSO
    Run PSO;
    if CalculateDiversity() < threshold
        ShortTimeFourierTransform();
        Filtering();
        RebuildFitness();
        RadiationFunction();
        UpdatePosition();
        UpdateVelocity();
    end

3 Related Works

3.1 Diversity guided

An accepted hypothesis is that maintaining high diversity is crucial for preventing
premature convergence [9]. Riget suggested a method which uses a diversity
measure to control the swarm, the purpose of which is to prevent premature
convergence to a high degree [10]. The related formulas are (2) and (3):

$S(t) = (x_1, x_2, \cdots, x_N)$    (2)

$D(S(t)) = \frac{1}{N \cdot diameter(S(t))} \sum_{i=1}^{N} \sqrt{\sum_{j=1}^{n_x} \left( x_{ij}(t) - \bar{x}_j(t) \right)^2}, \qquad \bar{x}_j(t) = \frac{1}{N} \sum_{i=1}^{N} x_{ij}(t)$    (3)

where S is the swarm, N is the swarm size, diameter(S(t)) is the length of the longest
diagonal in the search space, and n_x is the dimensionality of the problem.
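A direct Python sketch of Eqs. (2)-(3) follows; positions is assumed to be an N x n_x array of particle positions and diameter the length of the longest diagonal of the search space.

import numpy as np

def swarm_diversity(positions, diameter):
    N = positions.shape[0]
    centroid = positions.mean(axis=0)                            # x_bar_j(t), Eq. (3)
    dists = np.sqrt(((positions - centroid) ** 2).sum(axis=1))   # distance of each particle to the centroid
    return dists.sum() / (N * diameter)                          # D(S(t)), Eq. (2)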

3.2 The filter process

Suppose the supervision condition is met, i.e. the diversity has dropped to the threshold we set, and
the best particle is approaching an optimal solution. This solution may be a local
or a global optimum; in other words, the best particle has reached
a peak of the fitness function (assuming we are maximizing). The most dramatic
change in a function is a peak, which corresponds to the high-frequency part of the signal.
In order to clip this peak, we take the best particle as the center and apply frequency-domain
filtering in a certain range. The key point of this process is to determine the window of
the STFT and the most suitable filter.

3.2.1 Selection of window function


In order to select an appropriate window function for the STFT, we can take the width of the main
lobe and the peak level of the side lobes as criteria; these respectively determine the ability to
resolve signals of comparable strength and signals of disparate strength.

Figure 2. Hamming window (time domain and frequency domain)



As shown in Figure 2, the Hamming window is optimized to minimize the maximum
side lobe, giving it a height of about one-fifth that of the Hann window [11]. The
window is defined by (4), and the optimal values of the coefficients are α = 0.53836
and β = 0.46164 [12]:

$w(n) = \alpha - \beta \cos\left( \frac{2\pi n}{N - 1} \right)$    (4)

3.2.2 Filter design


We would like to use the ideal low-pass filter (ILPF) to remove all frequency components of the
signal beyond some cutoff frequency. However, with an ideal filter the output signal
would have to appear before the input signal has arrived; the ideal filter is not physically
realizable. We therefore choose a Butterworth low-pass filter (BLPF). The form of this filter in
one dimension is given by (5):

$H(u) = \frac{1}{1 + \left[ D(u) / D_0 \right]^{2n}}$    (5)

where D(u) is the distance from the center of the frequency rectangle and D_0 is the cutoff frequency. The BLPF differs
from the ILPF in that its transfer function does not have a sharp discontinuity giving a clear
cutoff between passed and filtered frequencies; therefore no ringing is visible in any
image processed with this particular BLPF [13].
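A one-dimensional frequency-domain sketch of Eq. (5) is shown below; it assumes the spectrum has been shifted so that the zero frequency lies at the centre, and the cutoff D_0 and order n are free parameters of the sketch.

import numpy as np

def butterworth_lowpass(spectrum, cutoff, order=2):
    """spectrum: fftshift-ed complex spectrum; cutoff: D_0; order: n in Eq. (5)."""
    n_bins = len(spectrum)
    u = np.abs(np.arange(n_bins) - n_bins // 2)       # distance from the centre frequency
    H = 1.0 / (1.0 + (u / cutoff) ** (2 * order))     # Eq. (5): smooth roll-off, no sharp cutoff
    return spectrum * H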

3.2.3 Calculate the cutoff frequency


The uncertainty principle states that a function cannot simultaneously
have restricted support in time as well as in frequency. Suppose f is a function in
$L^2(R)$. The dispersion of f about the point a is the quantity (6); it measures the deviation or
spread of the graph of f from x = a. Applying the same definition of dispersion to the
Fourier transform of f gives (7):

$\Delta_a f = \frac{1}{\|f\|_2} \left[ \int_{-\infty}^{\infty} (x - a)^2 \left| f(x) \right|^2 dx \right]^{1/2}$    (6)

$\Delta_a \hat{f} = \frac{1}{\|\hat{f}\|_2} \left[ \int_{-\infty}^{\infty} (\eta - a)^2 \left| \hat{f}(\eta) \right|^2 d\eta \right]^{1/2}$    (7)

Combining the Plancherel formula with Schwarz's inequality, we get (8):

$\Delta_a(f) \cdot \Delta_a(\hat{f}) \ge \frac{1}{2}$    (8)
One consequence of the uncertainty principle is that the dispersion of f about any a
and the dispersion of the Fourier transform of f about any a cannot simultaneously
be small. In order to extract the feature we are interested in, the key is how to
calculate the minimum dispersion of the Fourier transform.
Another description of the dispersion is related to statistics: the function
$|f(x)|^2 / \int |f|^2$ can be viewed as a probability density function. If a is the mean of this density, then
$\Delta_a(f)$ is just the variance [14]. We can therefore calculate the variance $\Delta_{Best}(f)$ which takes
the best-position particle Best as its center, and use it as the cutoff frequency of the STFT:

$\Delta_{Best}(f) = \frac{1}{N} \sum_{i=1}^{N} (x_i - x_{Best})^2$    (9)
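Under this definition, the cutoff frequency can be computed directly from the particle positions, as in the short sketch below (positions and best are hypothetical one-dimensional inputs); the value would then be passed as the cutoff of the Butterworth filter sketched in Section 3.2.2.

import numpy as np

def cutoff_frequency(positions, best):
    # Eq. (9): variance of the particle positions around the best particle
    return np.mean((np.asarray(positions) - best) ** 2)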

3.3 Radiation function

After the current area has been processed by the filter, the particles should explore other parts of the search
space. This requires a reverse driving force so that they can leave the current congested region.
Hence, we design a radiation function for this particular need (see (10) and (11)):

$f(\lambda, \bar{x}) = \frac{\lambda}{1 + e^{\bar{x} + \sin(\bar{x})}}$    (10)

$\bar{x} = \frac{\left\| P_{Best}(t) - P_i(t) \right\|}{\left\| P_{Best}(t) - P_{Far}(t) \right\|}$    (11)

where λ is an amplification coefficient, $\bar{x}$ is a range normalization factor, $P_{Best}(t)$ is
the best position in the current iteration, and $P_{Far}(t)$ is the position furthest from
$P_{Best}(t)$.
The new position is simply calculated as the sum of the previous position and the velocity:

$x = x + v\,\Delta t$    (12)

The update from the previous velocity to the new velocity is given by (13):

$v = wv + c_1 r_1 (pBest - x) + c_2 r_2 (nBest - x) + c \cdot f(\lambda, \bar{x}) \cdot r_3$    (13)
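The sketch below renders Eqs. (10)-(13) in Python for a single particle. The radiation term is applied along the direction pointing away from the best position, which is our reading of the "radiation direction" in Figure 1; the coefficient of the radiation term (written c in Eq. (13)) is treated as a separate constant c3 here, and all array shapes are illustrative.

import numpy as np

def radiation_force(x, p_best, p_far, lam=1.0):
    # Eq. (11): normalised distance of the particle from the best position
    x_bar = np.linalg.norm(p_best - x) / (np.linalg.norm(p_best - p_far) + 1e-12)
    return lam / (1.0 + np.exp(x_bar + np.sin(x_bar)))   # Eq. (10)

def update_velocity(v, x, p_best, n_best, p_far, w=0.9, c1=2.0, c2=2.0, c3=2.0, lam=1.0):
    r1, r2, r3 = np.random.rand(3)
    away = (x - p_best) / (np.linalg.norm(x - p_best) + 1e-12)   # push away from the crowded optimum
    # Eq. (13): standard PSO terms plus the radiation term
    return (w * v + c1 * r1 * (p_best - x) + c2 * r2 * (n_best - x)
            + c3 * radiation_force(x, p_best, p_far, lam) * r3 * away)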

4 Experimental Analysis

4.1 Benchmark functions

We have tested the SFPSO algorithm on five standard multi-modal objective functions,
which are widely accepted benchmarks for testing the performance of
different PSO models. The five functions are given in Table 1.

Table 1. Benchmark functions

Function      Expression                                                                                                               Coordinates   Minimum
Rosenbrock    $f(x) = \sum_{i=1}^{n-1} \left[ 100(x_i^2 - x_{i+1})^2 + (x_i - 1)^2 \right]$                                             $x_i = 1$     0
Ackley        $f(x) = -20\exp\left(-0.2\sqrt{\tfrac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\tfrac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$   $x_i = 0$     0
Griewank      $f(x) = \sum_{i=1}^{n} \frac{x_i^2}{4000} - \prod_{i=1}^{n} \cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$                    $x_i = 0$     0
Rastrigin     $f(x) = \sum_{i=1}^{n} \left[ x_i^2 - 10\cos(2\pi x_i) + 10 \right]$                                                      $x_i = 0$     0
Schaffer      $f(x) = \sum_{i=0}^{n-1} \frac{\sin^2\sqrt{x_i^2 + x_{i+1}^2} - 0.5}{\left[ 1 + 0.001(x_i^2 + x_{i+1}^2) \right]^2} - 0.5$   $x_i = 0$     -1

4.2 Experiments

The parameter λ determines the strength of the radiation force. It is evident that
increasing this parameter has a tremendous impact around the local optimum.
In order to find the relationship between λ and diversity, we ran experiments with
each test function in 10 dimensions. The diversity of the radiation model is directly
compared to a model with no attach policy. The effect of setting λ = 0.1, 0.5, 1, 2 is exhibited
in Figure 3.
Figure 3. The relationship between λ and diversity (ratio of radiation diversity to no-radiation diversity for the five benchmark functions)



On the Rosenbrock and JD Schaffer benchmark functions, λ = 1 performs
better than the other settings, whereas on the Ackley, Griewank and
Rastrigin tests the improvements are not as large. We therefore chose λ = 1 as the amplification
coefficient.
Some parameters of the standard PSO also need to be fixed. The inertia weight is decreased linearly
with w_max = 0.9 and w_min = 0.4, and the acceleration coefficients are c_1 = c_2 = 2.0.
These values are widely used in the relevant literature. The specific
parameter settings of each algorithm are shown in Table 2.

Table 2. Specific parameter settings

Algorithm   Particle size   Dimension   Parameters                            Learning sample   Attach policy
LPSO        30              10          c_1 = c_2 = 2                         Local area        None
GPSO        30              10          c_1 = c_2 = 2                         All area          None
GA-PSO      30              10          c_1 = c_2 = 2                         All area          Genetic variation
CPSO        30              10          R_c = 1, R_p = 3X_max, Q_i = 16       Local area        Charge
SFPSO       30              10          c_3 = 2, λ = 1                        Local area        STFT radiation

4.3 Algorithm performance analysis

To show the effect of filtering on the standard PSO, the average extreme value of each benchmark
function before and after filtering is given in Table 3. All the functions show a
good clipping rate; the Rastrigin function in particular significantly outperforms
the other functions.

Table 3. Filtering effect

Function      Iterations   No-filter result   Filter result   Clipping rate (filter / no filter)
Rosenbrock    1000         5.13               6.61            1.29
Ackley        1000         6.63               7.34            1.11
Griewank      1000         0.89               1.13            1.23
Rastrigin     1000         1.17               1.71            1.47
JD Schaffer   1000         0.43               0.51            1.17



We made experiments with each test function using 5×10^4 iterations as the termination
criterion. The performance of the SFPSO is directly compared to the LPSO, GPSO,
CPSO [15] and GA-PSO. From the experimental results we obtain the convergence curves
shown in Figure 4. On the Rosenbrock and Ackley functions, SFPSO at first lags behind but
then surpasses the performance of the other algorithms within 2×10^4 generations.
For these functions, the other algorithms move very quickly in the initial generations and
then flatten out for the remainder of the run. The performance on the Griewank function
is significantly different from the other functions in this experiment: the SFPSO
never equals the performance of the other algorithms during the whole run.
The results clearly show that the SFPSO is a much stronger
optimizer than the others on all the test functions except the Rastrigin function, on which the
SFPSO does not perform better than the LPSO. Because
Rastrigin is essentially unimodal during the later stage of the iteration, the LPSO
has no difficulty finding the global optimum. When SFPSO is tested on the Rastrigin
function and the diversity condition is reached, all particles are subjected to a
radiation force, whereas the LPSO easily finds an optimal solution without this stage.
The SFPSO algorithm has a higher probability of finding the global optimum on the
other functions, and its fitness value is on average much better than that of the other
algorithms.

Figure 4. Convergence curves of SFPSO, LPSO, GPSO, GA-PSO and CPSO on the benchmark functions (Rosenbrock, Ackley, Griewank, Rastrigin and JD Schaffer)






Figure 5. Every algorithm optimum and time

5 Conclusion

The SFPSO algorithm to a large extent prevents premature convergence. Once the
population diversity decreases to a certain threshold, the current optimum is
weakened by short-time Fourier transform filtering. We take the particle variance
as the cutoff frequency of the Butterworth low-pass filter, then calculate the highest
resolution of the filter and obtain the optimal clipping effect. Meanwhile, the force of
the radiation function makes it possible to find new and better solutions in other parts of the
search space. Comparing the SFPSO algorithm with other algorithms in several
experiments, the results verify the effectiveness of the SFPSO algorithm. We believe
the SFPSO algorithm is a strong optimizer for multi-modal optimization.

Acknowledgment: This paper is supported by the National High Technology
Research and Development Program of China (Grant No. 2015AA015408) and the
West Light Foundation of the Chinese Academy of Sciences (Grant No. 2011180).

References
[1] Kennedy, J, and R. Eberhart, “Particle swarm optimization.” IEEE International Conference on
Neural Networks,IEEE Press, Dec.1995, pp. 1942-1948.
[2] LIU Yanmin, NIU Ben, New particle swarm optimization theory and practice, Science Press,
2013, pp. 1-3.
[3] Frans,VanDenBergh,”An analysis of particle swarm optimizers. “Particle Swarm Optimization ,
2002.
[4] Du, Wen Bo, et al. “Adequate is better: particle swarm optimization with limited-
information.” Applied Mathematics & Computation, vol. 268, 2015, pp. 832-838.

[5] Cesare, N. Di, D. Chamoret, and M. Domaszewski, “A new hybrid PSO algorithm based on a
stochastic Markov chain model.” Advances in Engineering Software, vol. 90, 2015, pp. 127-137.
[6] Juang, Chia Feng, “A hybrid of genetic algorithm and particle swarm optimization for recurrent
network design.” IEEE Transactions on Systems Man & Cybernetics Part B Cybernetics A
Publication of the IEEE Systems Man & Cybernetics Society,2004, pp. 997-1006.
[7] E.Sejdic, I. Djurovic, J. Jiang, Time-frequency feature representation using energy concentration,
An overview of recent of advances, Digital Signal Processing, 2009, pp. 153-183.
[8] Clerc, M., and J. Kennedy, “The particle swarm - explosion, stability, and convergence in a
multidimensional complex space.” IEEE Transactions on Evolutionary Computation,2002, pp.
58-73.
[9] J.S.Vesterom, J.Riget,and T.Krink, Division of Labor in Particle Swarm Optimization, In
Proceedings of The IEEE Congress on Evolutionary Computation. Honolulu, IEEE Conference
Publication, 2002, pp. 1570-1575.
[10] Riget, Jacques, and J. S. Vesterstrøm, “A diversity-guided particle swarm optimizer-the ARPSO.”
vol. 2, 2002.
[11] Enochson, Loren D, Otnes, Robert K, Programming and Analysis for Digital Time Series Data,
Programming and Analysis for Digital Time, 1968, pp. 142-145.
[12] Nuttall, Albert H, “Some windows with very good sidelobe behavior.”IEEE Transactions on
Acoustics Speech & Signal Processing ASSP-39.vol. 1,1981,pp. 84-91.
[13] Rafael C.Gonzalez and Richard E.Woods, Digital Image Processing, Pearson Education, 2007,
pp. 297-298
[14] Donoho, David L., and P. L. Donoho, A First Course in Wavelets with Fourier Analysis. A first
course in wavelets with Fourier analysis. John Wiley & Sons, 2009, pp. 63-64.
[15] Blackwell, T. M, and P. J. Bentley, “Dynamic Search With Charged Swarms.” GECCO 2002:
Proceedings of the Genetic and Evolutionary Computation Conference, New York, Usa, 9-13
July 2002, pp. 19-26.
Zhen WANG*, Hao-peng CHEN, Fei HU
RHOBBS: An Enhanced Hybrid Storage Providing
Block Storage for Virtual Machines
Abstract: Non-volatile memory devices such as Solid State Drives (SSDs) have become
increasingly popular due to their high performance, growing capacity and decreasing
cost. However, it is not feasible to store all data on SSDs because they are
too expensive compared with Hard Disk Drives (HDDs). A hybrid storage system can
represent a balanced solution between high performance and acceptable cost. This
paper presents a hybrid storage system called RHOBBS, which provides hybrid
block device storage for virtual machines. RHOBBS is based on Ceph and uses Object
Storage Nodes (OSDs) for storing objects. It groups OSDs into two logical storage
pools, a high-performance pool (SSD pool) and a mass-volume pool (HDD pool).
RHOBBS strips VM disk images into objects and stores them in different storage pools
according to their contents, and then periodically shuffles objects across the storage
pools according to their access patterns. We evaluate RHOBBS, and the
experiments show that RHOBBS achieves significant performance improvements
over the original Ceph and our previous work.

Keywords: storage; object-based; distributed; block storage; virtual machine

1 Introduction

Virtualization has become the foundation of cloud computing because it makes
it possible to maximize the potential of hardware by running multiple operating
systems and applications on a single physical machine. In datacenters, the Virtual Block
Device (VBD) has become one of the most important ways to provide block device
storage for virtual machines, since it provides higher availability, scalability and
manageability than direct-attached disks [1,2]. Virtual Machine Disk Images (VMDIs)
are usually used to emulate VBDs.
Typical storage solutions use a Storage Area Network (SAN) or Network Attached
Storage (NAS) to store VMDIs. With the prevalence of cloud computing, however, scalability
has become more and more important, and both SAN and NAS can hardly meet the new
demands. Object-based storage systems, with high scalability and availability, provide
an alternative solution [3]. An object is a data container which has an identifier,
data, and metadata. Objects are designed to store arbitrary types of data; therefore
object-based storage is a viable solution for storing data such as VMDIs.

*Corresponding author: Zhen WANG, Shanghai Jiao Tong University, E-mail: estawang@hotmail.com
Hao-peng CHEN, Fei HU, Shanghai Jiao Tong University

The write and read speeds of SSDs are typically much higher than those of
HDDs [4]. On the other hand, SSDs are much more expensive and more limited in
capacity than HDDs. Applications running in VMs in datacenters
can benefit from SSDs if they demand high storage performance, but building a
full-SSD storage system for VMs in a datacenter would cost too much. As a result, a hybrid
configuration (SSD and HDD) becomes one of the most practical and balanced solutions
considering both cost and performance [5].
In our previous work, a hybrid object-based storage system based on Ceph [6],
called WHOBBS [7], was proposed. WHOBBS strips VMDIs into objects and stores
them in a set of storage nodes called Object Storage Devices (OSDs). It introduces
logical storage pools to manage different types of OSDs, periodically analyzes the VMs'
workload and migrates objects across the two pools (SSD pool and HDD pool).
Objects with a high access workload are more likely to be migrated to the SSD pool
to exploit the remarkable performance of SSDs. However, WHOBBS relies on a
central monitor to collect the VMs' workload from clients and perform a global analysis to
determine the appropriate object placement strategy. If the central monitor crashes,
the analysis and migration of data stop.
In this paper, we implement RHOBBS, a more advanced version of WHOBBS.
Like WHOBBS, RHOBBS uses the concept of storage pools, analyzes workloads and
migrates data. In order to address the single point of failure introduced by the central
monitor, RHOBBS replaces it with a cluster of shufflers. A client sends a request to the
shuffler cluster, and one of the running shufflers becomes responsible for analyzing the
workload of that client and starting data migration. If the responding shuffler fails,
another shuffler takes over and serves the client. To make an appropriate global
placement strategy, RHOBBS enables each shuffler to obtain global workload information
by using the Raft consensus algorithm [8]. RHOBBS also uses VMDI introspection
technology to understand the internal structure of VMDI files. This understanding can
be leveraged when stripping VMDIs into objects and storing them in different storage
pools: objects which contain metadata of VMDIs can be placed in the SSD pool because
they are usually accessed more frequently and more randomly.
Our major contributions are as follows:
–– We propose RHOBBS, a Ceph-based hybrid block storage for VMs, which
achieves optimized cost effectiveness, scalability and availability.
–– RHOBBS introduces a new technique based on Raft to collect and analyze
workloads from all clients and determine the global placement strategy.
–– RHOBBS utilizes VMDI introspection to make a better placement strategy in the
initial phase and thus achieve better performance.
–– Our evaluation shows that RHOBBS can significantly improve IO performance
under limited SSD resources.

2 Background and Motivation

In this section, we introduce the motivation of RHOBBS by discussing the central-node
problem of WHOBBS and VMDI introspection technology. Then we describe the
caching mechanism of Ceph and its drawbacks.

2.1 The central-node architecture of WHOBBS

The Monitor node in the WHOBBS cluster is responsible for resource management, log recording
and workload analysis.

Figure 1. The architecture of WHOBBS

As shown in Figure 1, the Monitor is designed to collect VM workload information from
Client nodes and health status from OSDs. Besides that, the Monitor analyzes the workloads
of the Clients, makes a new global object placement strategy accordingly and sends object
migration signals to the corresponding storage nodes. This architecture causes an
availability problem when the Monitor crashes.
Instead of adopting a central-node architecture, a cluster-based solution is preferable.
The Monitor of WHOBBS can be replaced by a cluster of shufflers. A shuffler
provides similar functions to the Monitor, but it only makes migration decisions for
a part of the Clients; other shufflers are responsible for the rest of the Clients.
The tasks of collecting information, analyzing workloads and migrating data can be
accomplished in a distributed way on multiple shufflers to avoid availability issues.

2.2 VMDI introspection

As WHOBBS is based on Ceph, a VMDI file is stripped into fixed-size objects which
are stored in the HDD pool in the initial phase. VMDI files usually have a
complex internal structure which fits well into the meta-data-vs.-data classification
[1]. In general, the meta-data of VMDIs is accessed more frequently
or more randomly. A straightforward approach to improving the IO performance of VMs
is therefore to pre-store the objects which contain the metadata of VMDIs in the SSD pool. But from the view
of WHOBBS, all objects are the same: they are simply data objects. Fortunately, Vasily
Tarasov et al. [1] provided a solution to parse the internal structure of VMDI files. With
an understanding of the internal structure of VMDIs, the meta-data objects can be
distinguished from the others and stored in the SSD pool in the initial phase, whereas in WHOBBS
they have to be warmed up in the HDD pool and then moved to the SSD pool.

3 System Design

RHOBBS is designed to achieve effective usage of storage devices and optimized IO
performance in large-scale datacenters. In this section we present a system view
of RHOBBS and discuss the details of several important design decisions in RHOBBS.

3.1 Storage System Architecture

In order to gain high scalability, RHOBBS is implemented upon object-based storage.
An OSD is a physical or logical storage unit which stores data and handles data replication,
recovery, rebalancing of the IO load, etc. RHOBBS adopts the concept of the storage pool [7]: to
manage OSDs of different types (such as SSD and HDD), OSDs of the same type are grouped
into the same storage pool. The system then only has to maintain metadata of the storage
pools instead of all OSDs. Object operations are performed on storage pools instead of
specific OSDs, so RHOBBS can dispatch IO requests to the OSDs in a pool to
achieve load balance. If we view each storage pool as a tier, RHOBBS differs
from the Ceph cache: since every piece of data has only one copy on a specific tier and no
duplicates on other tiers, there is no extra overhead caused by data consistency, and
storage devices are utilized more efficiently.
Figure 2 illustrates the major components of RHOBBS. The VMs in the Clients view
the VBDs provided by RHOBBS as direct-attached disks. As RHOBBS is based on
object-based storage, the IO Controller maps the IO requests from VMs to accesses of
objects. When creating an object-based VBD, the Mapper introspects the VMDI file
and strips it into objects stored in different pools. The Extent Map contains the location
information (i.e., which storage pool each object belongs to) of all objects. When the
VM is running, the Mapper receives IO requests from the upper IO controller and finds
the location of the requested objects through the Extent Map. After that, the Mapper conveys
the object IO requests to the Ceph Client Component, which
handles the interaction with the storage nodes and completes the IO requests.
Simultaneously, the Collector collects all the object IO information and sends it to the
cluster of shufflers periodically. The shufflers analyze the VM workloads and make
the global object placement strategy for all Clients. The Migrator in the storage nodes
receives migration signals from the shufflers and handles the actual data migration. If
an object resides in a certain storage pool, all operations on it are directed to that
pool.
All these complexities in RHOBBS are completely transparent to the hypervisors in the
Clients to ensure generality.
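To make the data path concrete, the following Python sketch shows how a Mapper-like component could translate a block IO request into per-object accesses via an Extent Map; the 4 MiB object size, the pool names and the map layout are assumptions of this sketch, not RHOBBS internals.

OBJECT_SIZE = 4 * 1024 * 1024  # assumed stripe unit (4 MiB objects)

class ExtentMap:
    def __init__(self):
        self.pool_of = {}                              # object index -> "ssd" or "hdd"

    def locate(self, offset, length):
        """Resolve a byte range of the VBD into (object index, pool) pairs."""
        first = offset // OBJECT_SIZE
        last = (offset + length - 1) // OBJECT_SIZE
        for obj in range(first, last + 1):
            yield obj, self.pool_of.get(obj, "hdd")    # objects default to the HDD pool

# Usage: each VM IO request is resolved before being sent to the Ceph Client Component.
extent_map = ExtentMap()
extent_map.pool_of[0] = "ssd"                          # e.g. a metadata-heavy object
targets = list(extent_map.locate(offset=3 * 1024 * 1024, length=2 * 1024 * 1024))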

Figure 2. The architecture of RHOBBS

3.2 VMDI stripping with introspection

The Mapper introspects VMDI files to understand their internal structure. With this
understanding we can make better object placement strategies when creating object-based
VBDs from VMDI files. There are many VMDI formats: Qemu (.qcow2),
VMware (.vmdk), VirtualBox (.vdi), etc. Their specifications are publicly available,
and there are libraries which can handle them, such as the Virtual Disk Development Kit from
VMware [9]. With the help of VMDI-handling libraries and knowledge about specific
file systems, we can parse structures such as partitions or logical volumes in VMDI files.
Linux is one of the most popular guest OSes in the IT industry. The Linux kernel currently
supports more than 10 types of file system (Btrfs, JFS, XFS, ext3, etc.), but only a
few of them are in widespread use. Therefore, the system optimizes for the
majority of users by supporting these popular file systems.

An important benefit provided by VMDI introspection is the ability to
distinguish between data and meta-data, which enables many possible optimizations
[1]. We take ext3 as an example to illustrate VMDI introspection optimization.
The ext3 file system contains many different meta-data types, such as inode tables
and journals. This meta-data is usually accessed more frequently or more randomly.
Pre-storing the objects which contain a VMDI's meta-data is an obvious way to
improve performance, especially for database-type applications,
since they often modify meta-data for data persistence. Another, more involved, idea
is to do fine-grained data classification: since we can obtain semantic information
about data blocks when parsing VMDIs, objects which contain user-interesting data
(e.g. database files) can be pre-stored in the SSD pool to achieve better performance at the
expense of longer parsing time.

3.3 Shufflers

As described in Section 2.1, the central-monitor architecture suffers from availability
and scalability problems. The simplest solution is to use a group of monitors to
replace the central monitor: each monitor is responsible for workload analysis and
object migration for a part of the clients independently (i.e. a monitor serves a subset of
the clients). If one of the monitors crashes, only the clients served by that
monitor are affected and the other monitors take over its job. But this approach
brings another problem: each monitor obtains workload information only from
the clients it serves, not from all clients. As a result, the monitors cannot make a global
object placement strategy.
RHOBBS uses a cluster of shufflers (shown in Figure 3) to solve the problems
mentioned above. The shufflers are responsible for collecting workload information,
analyzing workloads and making the object placement strategy. The shufflers use the Raft algorithm
[8] to provide highly replicated and consistent storage, which each shuffler uses
to obtain global workload information. Each shuffler thus has a copy of the global workload
information that is consistent among the shufflers. As shown in Figure 3, one of the
shufflers (shuffler 0) is elected as leader and the other shufflers become followers
(shufflers 1 to 4). In the original Raft algorithm, all clients send workload information
to the leader and the leader replicates its log to the followers. However, the leader
could become a performance bottleneck under heavy network traffic. To address this issue,
we make a modification to the original Raft. In RHOBBS, the Collector in a follower
shuffler receives clients' requests containing workload information, and the
Workload Synchronizer batches these requests and conveys them to the leader. Then
the leader replicates its log to the followers. It is necessary to point out that the system
never waits in order to batch requests: batching only occurs in the presence of congestion.
Batching only on congestion minimizes the impact on real-time workload analysis, and
batching on each follower shuffler reduces network pressure. Since all the shufflers
have the same copy of the global workload information, they can make the same global object
placement strategy under the same policy and provide the data migration service for their
corresponding clients.
In RHOBBS, each client knows about the existence of all shufflers. When a client
boots up, it randomly connects to an alive shuffler and sends real-time workload
information to it. The chosen shuffler makes global object placement strategies
and migrates data for that client periodically. The client periodically checks
whether its corresponding shuffler is alive through a heartbeat mechanism. If the
shuffler goes down but the cluster of shufflers is still functioning normally, the client
reconnects to another available shuffler. It should be noted that if the majority
of the shufflers fail, the process of workload analysis and data migration
stops.

Figure 3. The overview of shufflers

3.4 Workload analysis and data migration

To improve system performance, we need to ensure that frequently accessed data reside
in the SSD pool. Because SSDs outperform HDDs even more under random workloads, the
system tends to store objects with a more random access pattern in the SSD pool.
The Analyzer in a shuffler is responsible for analyzing the access pattern of objects.
First of all, the Analyzer analyzes the global workload information collected as
described in Section 3.3 to calculate the weight of each object. The weight of an object
determines where it is stored: objects with higher weight are more likely to be
stored in the SSD pool, while those with lower weight tend to be stored in the HDD pool.

After the analysis, the Analyzer sorts all the objects in descending order of
weight and puts the top-weighted objects into the Migration
Queue. Too much data migration would cause heavy network traffic and have a
negative influence on system performance, so the number of objects that can
be migrated to the other pool is limited. To calculate the weight of each object, we use
the algorithm proposed by Lingxuan Shen et al. in Section 3.4 of [10].
The Migrator in a shuffler periodically checks the Migration Queue and sends
migration signals to the storage nodes when the queue is not empty. Ceph maintains
three copies of an object on three different storage nodes by default, and for each object
there is always a primary storage node; every operation on that object is
redirected to the primary copy. The Migrator in the shuffler finds the primary
node of an object that needs to be migrated according to the object ID and its current storage pool,
and then sends the migration signal to the primary node. Finally, the Migrators in the OSDs
cache all the received migration signals and deal with them sequentially
by calling the Ceph OSD Component to finish the migration jobs. During
object migration, data consistency is ensured by the locking system. Before migrating
an object, the Migrator in the OSD invokes the Mapper on the Client to acquire the lock
on the object. After the migration, the Mapper changes the Object Map accordingly
and then releases the object lock so that the VM can access that object.
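A rough Python sketch of this Analyzer/Migrator interaction is given below. The actual weight formula comes from [10] and is not reproduced here; the weight dictionary, pool map and migration cap are hypothetical names and values.

from collections import deque

def build_migration_queue(weights, current_pool, max_migrations=64):
    """weights: {object_id: weight}; current_pool: {object_id: 'ssd' or 'hdd'}."""
    ranked = sorted(weights, key=weights.get, reverse=True)   # objects in descending weight order
    queue = deque()
    for obj_id in ranked:
        if len(queue) >= max_migrations:                      # cap migration traffic per period
            break
        if current_pool.get(obj_id, "hdd") == "hdd":          # hot object not yet in the SSD pool
            queue.append((obj_id, "ssd"))
    return queue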

4 Evaluation

4.1 Experimental Environment Setup

Our experimental platform consists of 10 servers: 4 Ceph OSD servers, 1 Ceph monitor
server, 1 WHOBBS Monitor server, 3 shuffler servers and 1 VM host server. The VM host
server is a Dell PowerEdge T620 tower server with an Intel Xeon E5-2620 2.0 GHz CPU and
32 GB DDR3 SDRAM. The other servers are HP Compaq Pro 6300 Microtower servers with
an Intel Core i3-3220 3.3 GHz CPU and 4 GB DDR3 SDRAM. All servers run Ubuntu 14.04
with Linux kernel version 3.19.0. Each of the 4 OSD servers is equipped with one
120GB Kingston V300 SATA3 SSD and one 1TB Seagate 520S SATA HDD as OSDs, so
we have 4 SSD-type OSDs grouped as the SSD pool and 4 HDD-type OSDs grouped as the HDD
pool. All the servers are connected by 1 Gb/s Ethernet. The VM host runs Qemu-kvm
as the hypervisor and each VM has 1 vCPU and 4GB RAM. The VBDs for the VMs are created
with a size of 100GB. We run benchmarks in the VMs to evaluate the IO performance.
To show the improvement of RHOBBS, we compare it with WHOBBS and the
original cache-enabled Ceph. WHOBBS has one central monitor and RHOBBS has 3
shufflers. The configuration is described in Table 1. We compare the performance
of RHOBBS, WHOBBS and the original Ceph by comparing the throughput of the VBDs in the
different storage systems under the same benchmark.

Table 1. The pool setting of RHOBBS, WHOBBS and Ceph

Storage system    Pool setting
RHOBBS            4 SSD OSDs as SSD pool; 4 HDD OSDs as HDD pool
WHOBBS            4 SSD OSDs as SSD pool; 4 HDD OSDs as HDD pool
CEPH              4 SSD OSDs as cache pool; 4 HDD OSDs as backing storage pool

4.2 The performance improvement of RHOBBS

To show the performance improvement brought by RHOBBS, we conducted experiments with two benchmarks, Filebench [11] and Fio. Fio is used to generate block IO workloads and accesses data on the storage device following a Zipf distribution. Filebench emulates file-system-level workloads of real applications, so it shows how RHOBBS benefits real applications. We use Filebench to emulate the file system workloads of a file server and a web server, since these are in common use. The detailed configuration of the experiment workloads is described in Table 2.

Table 2. Configuration of Filebench workloads

Type          Data Set    File Size    Read/Write
File Server   72 GB       128 KB       1:1
Web Server    52 GB       16 KB        10:1

Figure 4a shows the IOPS of the original cache-enabled Ceph and of RHOBBS under the file server and web server workloads. We run 6 testing VM instances on the tower server and report their average performance. RHOBBS outperforms the original Ceph under both kinds of workload, improving IOPS by about 100% relative to Ceph. Figure 4b demonstrates that RHOBBS also achieves higher throughput than Ceph. The results of the Fio tests are shown in Figure 5, which shows that RHOBBS performs better than Ceph under the block IO workload.

Figure 4. IOPS and throughput of RHOBBS and Ceph under the workload of file server and web server
from Filebench

Figure 5. IOPS under Zipf block IO workload from Fio



4.3 The shufflers of RHOBBS

To demonstrate the improvement brought by the shufflers in RHOBBS, we used the Filebench configuration described in Table 2 to evaluate WHOBBS and RHOBBS, and we created three independent instances of the benchmark. In stage 1 we ran instance 1 on RHOBBS and WHOBBS independently; the result is shown in Figure 6a. In stage 2 we powered off the central WHOBBS monitor in WHOBBS and one of the three shufflers in RHOBBS, then ran instance 2 on RHOBBS and WHOBBS independently; the result is shown in Figure 6b. In stage 3 we powered off one more shuffler in RHOBBS compared with stage 2, then ran instance 3 on RHOBBS and WHOBBS independently; the result is shown in Figure 6c.
Figure 6a shows that RHOBBS and WHOBBS have similar performance when the WHOBBS monitor is up. But when the WHOBBS monitor is down, WHOBBS cannot maintain high performance, as shown in Figure 6b, because its process of workload analysis and data migration stops. For RHOBBS, the cluster of shufflers is still functioning when one shuffler is down, so RHOBBS still achieves high performance in Figure 6b. Figure 6c shows that neither RHOBBS nor WHOBBS reaches the high performance of Figure 6a: when the majority of the shufflers are down, the process of workload analysis and data migration is stopped because the Raft algorithm can no longer form a quorum.

4.4 VMDI stripping with introspection

In sections 4.2 and 4.3, for every test we ran the benchmark multiple times on the same data set to obtain a stable result. To describe the benefit brought by VMDI introspection, we again run the benchmark multiple times, but here we show how the performance improvement changes over time. We use the file server workload described in Table 2. Before the benchmark is run, RHOBBS scans the VMDI file and pre-stores the recognized metadata-type objects in the SSD pool, whereas WHOBBS does not.
Figure 7 shows that under the file server workload RHOBBS outperforms WHOBBS during the first 18 minutes, and eventually both reach similar high performance. It is worth noting that scanning the disk during VMDI stripping takes only an extra 4 minutes on average. The result shows that RHOBBS indeed shortens the warm-up time on the data set and reaches better performance earlier. For datacenters running many VM instances, this saves a great deal of time in total.

Figure 6. IOPS of RHOBBS and WHOBBS under the workload of file server and web server

Figure 7. IOPS of RHOBBS and WHOBBS under the workload of file server

5 Conclusions

In this paper we propose RHOBBS, an object-based distributed hybrid storage system that provides block storage service for virtual machines. RHOBBS dynamically optimizes the global data distribution between different pools of storage devices by real-time analysis of the VMs' workload and by data migration. Through this optimization, RHOBBS achieves high performance and makes efficient use of high-performance devices (e.g. SSDs). Compared with the previous system WHOBBS, RHOBBS replaces the central Monitor node with a cluster of shufflers, avoiding the single point of failure of the central-node architecture. Moreover, RHOBBS adopts VMDI introspection to reduce the time needed to reach a relatively high performance.

References
[1] V. Tarasov, D. Jain, D. Hildebrand, R. Tewari, G. Kuenning, and E. Zadok, “Improving i/o
performance using virtual disk introspection,” in Presented as part of the 5th USENIX Workshop
on Hot Topics in Storage and File Systems, 2013.
[2] V. Tarasov, D. Hildebrand, G. Kuenning, and E. Zadok, “Virtual machine workloads: The case for
new nas benchmarks,” in Presented as part of the 11th USENIX Conference on File and Storage
Technologies (FAST 13), 2013, pp. 307–320.
[3] M. Mesnier, G. R. Ganger, and E. Riedel, “Object-based storage,” Communications Magazine,
IEEE, vol. 41, no. 8, pp. 84–90, 2003.
[4] S. S. Rizvi and T.-S. Chung, “Flash ssd vs hdd: High performance oriented modern embedded
and multimedia storage systems,” in Computer Engineering and Technology (ICCET), 2010 2nd
International Conference on, vol. 7. IEEE, 2010, pp. V7-297.

[5] L. Wan, Z. Lu, Q. Cao, F. Wang, S. Oral, and B. Settlemyer, “Ssd-optimized workload placement with adaptive learning and classification in hpc environments,” in Mass Storage Systems and Technologies (MSST), 2014 30th Symposium on. IEEE, 2014, pp. 1–6.
[6] S. Weil, S. Brandt, E. Miller, D. Long, and C. Maltzahn, “Ceph: a scalable, high-performance
distributed file system,” in OSDI06 Proceedings of the 7th symposium on operating systems
design and implementation, Berkeley, CA, 2007.
[7] S. Lingxuan, H. Chen, S. Ma, Z. Du, and F. Hu, “Whobbs: An object-based distributed hybrid
storage providing block storage for virtual machines” in High Performance Computing and
Communications (HPCC), 2015, pp. 160–165.
[8] D. Ongaro and J. Ousterhout, “In search of an understandable consensus algorithm,” in 2014
USENIX Annual Technical Conference (USENIX ATC 14), 2014, pp. 305–319.
[9] “vddk,” https://www.vmware.com/support/developer/vddk/.
[10] S. Ma, H. Chen, H. Lu, B. Wei, and P. He, “Mobbs: A multi-tiered block storage system for virtual
machines using object-based storage,” in High Performance Computing and Communications,
2014 IEEE Intl Conf on. IEEE, 2014, pp. 272–275.
[11] “Filebench,” http://Filebench.sourceforge.net/wiki/index.php/Main Page.
Part III: Sensors, Instrument and Measurement I
Shang-yue Zhang*, Yu-ming Wang, Zheng-guo Yu
AIS Characteristic Information Preprocessing &
Differential Encoding based on BeiDou Transmission
Abstract: Transmitting AIS information over the short-message service of the BeiDou Navigation Satellite System can solve the problem of blind areas in long-range ocean monitoring. However, even after self-adaption filtering, the AIS information collected in ship-dense areas is too large to transmit through the BeiDou channel, and the transmission takes too long to meet the real-time requirement of dynamic ship monitoring. To overcome this limitation, compression coding of AIS information is studied. First, the AIS information is preprocessed to remove ineffective redundancy; then the AIS ship position information is compressed by differential encoding, and the AIS parameters are reorganized according to new rules for BeiDou transmission. The results show that preprocessing and differential encoding reduce the amount of information by roughly half, with a low error rate and a ship position error within 2 meters.

Keywords: Automatic Identification System; BeiDou Short Message; Characteristic


Information; Compression coding

1 Introduction

The automatic identification system, or AIS for short, is an all-purpose, highly efficient ship dynamic monitoring system. It plays an important role in ship navigation, collision avoidance, marine communications, ship-shore communications and the informationization of shipping. AIS can identify and track target ships and receive and transmit the AIS information of nearby ships. It achieves approximately real-time tracking, has strong communication stability to land, simplifies the information exchange procedure, and has a low risk of missing the target [1,2]. However, AIS operates in the maritime mobile VHF band, so it has blind areas in long-range seas and cannot dynamically monitor voyage safety there. These problems can be solved by sending the AIS information, carried in short messages of the BeiDou satellite navigation system, to a shore-based command center. But the channel bandwidth of the BeiDou system is narrow: only 210 bytes can be sent in a single message. When ships are under way, especially in ship-dense areas, a single ship receives a large amount of AIS information. Even if this information is

*Corresponding author: Shang-yue Zhang, Dept. of Navigation, Dalian Naval Academy, Dalian of China,
116018, E-mail: dlzhangshangyue@163.com
Yu-ming Wang, Zheng-guo Yu, Dept. of Navigation, Dalian Naval Academy, Dalian of China, 116018

filtered by the self-adaption filtering technique and the ships that need to be monitored are tagged [3], the number of tagged ships is still large, so transmission via the BeiDou channel takes a long time and impairs dynamic ship monitoring.
This paper studies the compression coding technique for AIS characteristic information transmitted via BeiDou. By analyzing the features of AIS data, the AIS information is preprocessed before compression and then differentially encoded. Under the premise of a low error rate, the amount of AIS data that can be transmitted is expanded by compressing the volume of information.

2 Summarization of Compression Coding of AIS Information

2.1 Proposal of Compression Coding of AIS Information

When a ship is on an oceangoing voyage, sending the AIS information of the area centered on the own ship to the shore-based command center via the BeiDou satellite navigation system allows the commanding staff to monitor the dynamic situation of ships in long-range seas effectively and promptly at any time, ensuring navigation support capability in long-range seas.
When the ship sails in a ship-dense area, a large amount of AIS information is received. However, the BeiDou system has a narrow channel and limited communication capability: it can send at most 210 bytes in a single message. An AIS ciphertext contains 28 bytes in total, for example "1P000Oh1IT1svTP2r:43grwb0Eq4". Suppose that 200 AIS messages from 200 ships are received while a ship sails in a ship-dense area. After self-adaption filtering, 50 ships that may affect the navigation safety of the own ship are tagged. The AIS information of these 50 tagged ships amounts to 1,400 bytes in total. If the BeiDou system is used to transmit this information completely at the highest service frequency of one message every 5 s, it takes 30 s to transmit the AIS information of all 50 tagged ships. Clearly, 30 s is far more than t = 12 s [4], the maximum update interval for reporting ship AIS dynamic information, which impairs the shore command's real-time dynamic monitoring of ships. It is therefore necessary to apply appropriate compression coding to reduce the quantity of information after self-adaption filtering.

2.2 Analysis on Compression Coding Technique of AIS information

The amount of information contained in AIS data is I = −log₂(P), measured in bits, where P is the probability of occurrence of the corresponding data value [5]. The number of bits actually used is always larger than I, because redundant bits are added in the process of data input. This redundant part can be removed by data compression.
Data compression is a technique that reduces the data volume, and thereby the storage space, in order to improve the efficiency of data transmission, storage and processing without losing any information, or that reorganizes the data with a particular algorithm to reduce its redundancy and storage space [6]. Data redundancy includes spatial redundancy, structural redundancy, knowledge redundancy, visual and acoustic redundancy, and information-entropy redundancy. Data compression is divided into lossless and lossy compression. For AIS information, the redundancy must be reduced or removed without any distortion so that the original data can be recovered, so lossless compression must be applied.
By analyzing the characteristics of AIS information and reducing the redundant information, the BeiDou communication bandwidth can be used to full advantage when transmitting AIS information over the BeiDou system. The redundant information of AIS can be divided into two parts: ineffective redundancy and spatial redundancy. Ineffective redundancy refers to the part that is irrelevant to dynamic monitoring; it can be removed by preprocessing before compression. Spatial redundancy refers to the correlation among adjacent pieces of information; since the AIS information of different ships is correlated, it can be handled by differential encoding.

3 Compression coding of AIS Information

3.1 Analyzing & Preprocessing Ineffective Redundancy of AIS Information

There are various kinds of AIS telegram. To ensure the capability of dynamic monitoring in long-range seas, the BeiDou system only needs to transmit the position reports of telegram IDs 1, 2 and 3. The ciphertext contains 168 bits in total; its parameters are shown in Table 1 (MMSI: Maritime Mobile Service Identity; ROT: rate of turn; SOG: speed over ground; COG: course over ground; UTC: Coordinated Universal Time).
As shown in Figures 1 and 2, the AIS information of 200 ships was collected at different times; because too much data was received, only some ships' AIS information and its transcoding are shown. By analyzing the ciphertexts and the corresponding transcoding, it can be seen that some parameters in the AIVDM and AIVDO sentences are not effective dynamic monitoring information, such as the message ID, repeat indicator, positional accuracy, true heading, area-use, reserve and RAIM flag parameters. Dynamic monitoring in long-range seas is for monitoring and ensuring the navigation safety of the own ship in real time; the commanding staff in the shore-based center need to follow the dynamic changes of the AIS information of boats near the own ship. Therefore, the above-mentioned parameters, 24 bits in total, should be regarded as monitoring-ineffective redundancy and erased.

Table 1. Parameters of the AIS telegram

Parameter      Bits    Parameter               Bits
Telegraph ID   6       Longitude               28
Indicator      2       COG                     12
MMSI           30      Head-direction          9
State          4       Timestamp               6
ROT            8       Area-use                4
SOG            10      Reserve                 1
Precision      1       RAIM                    1
Latitude       27      Communication status    19

Figure 1. AIS data translation 6-bit binary field

Figure 2. Decoded information of AIS at different times.



The ineffective redundancy of the time stamp and the communication status parameters can be removed by consolidating the information. The time stamp gives the seconds of UTC time. The communication status includes the synchronization status, the slot time-out and the sub-message; the sub-message includes the receiving stations, the number of slots, the UTC hour and minute, and the slot offset. The same information can therefore be conveyed by retaining only the UTC hour, minute and second, which erases 7 redundant bits. Since the AIS information of the ships near the own ship is received at essentially the same UTC time, it suffices to send only the UTC time at which the own ship received the information.
After erasing this monitoring-redundant information, the volume of information for a nearby ship is 137 bits, 31 bits less than the original code.
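To make this bit accounting concrete, the sketch below assembles a reduced position report from the field widths listed in Table 1, dropping the fields identified above as monitoring-ineffective and replacing the time stamp and communication status by an 18-bit UTC field. The field selection mirrors the description in the text, while the function name and dictionary layout are purely illustrative assumptions.

```python
# Field widths (in bits) of the AIS position report, following Table 1.
AIS_FIELDS = {
    "telegraph_id": 6, "indicator": 2, "mmsi": 30, "state": 4, "rot": 8,
    "sog": 10, "precision": 1, "latitude": 27, "longitude": 28, "cog": 12,
    "heading": 9, "timestamp": 6, "area_use": 4, "reserve": 1, "raim": 1,
    "comm_status": 19,
}

# Fields treated as ineffective redundancy for long-range dynamic monitoring.
DROPPED = {"telegraph_id", "indicator", "precision", "heading",
           "area_use", "reserve", "raim", "timestamp", "comm_status"}

# UTC hour/minute/second retained in place of timestamp + communication status.
UTC_BITS = 18

def preprocessed_size():
    kept = sum(width for name, width in AIS_FIELDS.items() if name not in DROPPED)
    return kept + UTC_BITS

print(sum(AIS_FIELDS.values()))   # 168 bits in the original report
print(preprocessed_size())        # 137 bits in the reduced record
```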
Differential encoding predicts the current sample from past samples, exploiting the correlation between adjacent pieces of source information, and then encodes the differences. If the prediction is accurate, the differences are small. Suppose the transmitting end first sends an initial value x_0 and afterwards sends only the prediction differences d_n = x_n − x_{n-1}. The receiving end adds each received, quantized difference to the previously reconstructed value x'_{n-1} to obtain the reconstructed signal x'_n.
When the prediction differences are uniformly quantized and coded, a quantization error q_n is introduced, so the coded difference is d'_n = d_n + q_n. Since x'_n = x'_{n-1} + d'_n,

x'_n = x_n + \sum_{k=1}^{n} q_k        (1)

where \sum_{k=1}^{n} q_k is the accumulated quantization error; the distortion of the whole differential encoding system therefore comes from quantization.


The AIS position information of nearby ships is correlated with that of the own ship. The position of the own ship plays the role of the past sample in differential coding, while the positions of the other ships correspond to multiple current samples. Once the own ship's position is known, the positions of the other ships can be predicted by uniformly quantizing and coding the position offsets (differences of longitude/latitude) between the other ships and the own ship. The ship position information (55 bits) accounts for 37% of the preprocessed AIS information (146 bits), so differential encoding of ship positions can increase the compression ratio of the system. Because the process is a first-order Markov process, first-order prediction is applied; the accumulated differential error is q_1, whose size depends on the accuracy of the selected position information.

3.2 Differential Encoding of AIS Ship Positioning Information

AIS is a VHF communication system, so its radio coverage is influenced by objective conditions such as antenna height, radiated power and atmospheric conditions. According to the AIS broadcast and receiving distance formula, the AIS coverage distance can in general be determined by Eq. (2):

d [n mile] = f × (√h₁[m] + √h₂[m])        (2)

where h₁ and h₂ are the antenna heights of the two ships; according to the IEC recommendation, f = 3.0 is taken for ship-to-ship communication. The distance over which the own ship can receive AIS information from other ships is 20-30 n mile. It follows that when another ship is within this range of the own ship, the difference of longitude/latitude between the two ship positions is only on the order of tens of arc minutes.
For the convenience of encoding at the sending end and decoding at the receiving end, the ship-position differences are coded by uniform quantization, and the code length is determined by the required position accuracy. Supposing the distance between two ships is d, we have Eq. (3):

φ = arcsec(ΔD / latdiff)
d = √(latdiff² + (londiff × cos φ)²)        (3)

In Eq. (3) [7], ΔD is the difference of meridional parts, expressed in units of the chart length of 1′ of longitude. From the latitudes ϕ₁ and ϕ₂ of the two ships, the meridional parts D₁ and D₂ corresponding to ϕ₁ and ϕ₂ can be looked up in the meridional-parts table of the Anthology of Nautical Tables; ΔD is the difference between D₁ and D₂, ϕ is the mid-latitude, latdiff and londiff are the differences of latitude and longitude, and londiff × cos(ϕ) is the easting. It follows that the difference of latitude between another ship and the own ship is at most 30′. The resolution of a nautical chart is usually 0.001′, and 1′ of latitude or of easting on the earth's surface corresponds to approximately 1 n mile, i.e. 1852 m, so under this condition an accuracy within 2 m satisfies the positioning accuracy requirement in theory. The maximum difference of latitude can therefore be fixed at 30.000′. The binary representation of 30000 is 111010100110000, which has 15 digits, so the difference of latitude can be sent in 16 bits (the first bit is the sign bit). Compared with the original data, 11 bits of spatial redundancy are saved for each ship's AIS information.
The maximum difference of longitude occurs when the two ships are at the same latitude. The relationship between the distance and the difference of longitude of two ships at the same latitude is given by the formula of meridional parts (MP):

d = londiff × 59.935 (n mile) × cos ϕ        (4)

According to this formula, for a fixed distance d between the two ships, the higher the latitude, the larger the difference of longitude, so the upper bound of the difference of longitude is harder to limit. When a ship sails in high-latitude seas, the length of a fixed-length code for the difference of longitude approaches the original length of the longitude field (28 bits), so sending the difference of longitude directly is not effective. As seen from Eq. (3), the easting londiff × cos(ϕ) has a bounded maximum value, so the easting can be used in place of the difference of longitude. The maximum easting is 30.000′ (the same as the maximum difference of latitude), so the easting can also be sent in 16 bits (the first bit is the sign bit). Compared with the original data, 11 bits of spatial redundancy are saved for each ship's AIS information.
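A minimal sketch of this uniform quantization scheme is given below: the latitude difference and the easting are clipped to ±30.000′, quantized at the 0.001′ chart resolution, and packed into 16-bit fields (1 sign bit plus 15 magnitude bits). The helper names encode_offset and decode_offset are illustrative assumptions, not part of any AIS standard.

```python
RESOLUTION = 0.001          # arc minutes, the chart resolution assumed in the text
MAX_OFFSET = 30.000         # arc minutes, maximum latitude difference / easting

def encode_offset(offset_min):
    """Quantize a latitude-difference or easting value (in arc minutes) to 16 bits."""
    offset_min = max(-MAX_OFFSET, min(MAX_OFFSET, offset_min))   # clip to range
    quantized = int(round(abs(offset_min) / RESOLUTION))          # 0 .. 30000, 15 bits
    sign = 1 if offset_min < 0 else 0
    return (sign << 15) | quantized                               # 16-bit field

def decode_offset(code):
    """Recover the offset in arc minutes from the 16-bit field."""
    sign = -1 if (code >> 15) & 1 else 1
    return sign * (code & 0x7FFF) * RESOLUTION

# Example: an offset of -12.3456' survives the round trip within the 0.001' step.
code = encode_offset(-12.3456)
print(f"{code:016b}", decode_offset(code))   # roughly -12.346
```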

4 Analysis on Error of Differential Encoding

A ship sailing in the Bohai Strait receives 100 AIS messages from 100 ships. The position of the own ship is latitude 38°50′.796, longitude 121°55′.890. The AIS information of the 100 nearby ships is differentially encoded, and the experimental result is shown in Figure 3.

Figure 3. Analysis of differential encoding errors & BER

As shown in Figure 3, the latitude error is within 5.5e-4 arc minutes, with an average of 1.53e-4 arc minutes; since 1′ of latitude equals 1852 m, the latitude error is within 1 m, with an average of 0.28 m. The longitude error is within 5.6e-4 arc minutes, with an average of 1.33e-4 arc minutes, and the positioning error is within 1 m, with an average of 0.27 m. The bit error rate of the latitude coding is within 9e-8, with an average within 4.2e-8. Analysis of the experimental results shows that the longitude and latitude errors are tiny, at the metre scale or below, and that the error rate is very low, with almost no coding errors.

5 Recombination after Compression of AIS information

Because the compression encoding of AIS information uses uniform quantization, the information is packed into a symbol sequence in which the AIS information of the own ship is sent first, followed by the AIS information of the other ships to be transmitted. The length of each code element can thus be fixed, as shown in Table 2. The reorganized format of the AIS information of the own ship and the other ships is shown in Table 3.

Table 2. Format of the symbol sequence after reorganization

Own ship    Ship 1      …    Ship n      CRC
AIS data    AIS data    …    AIS data    Check code

Table 3. Format of the reorganized AIS information of the own ship and other ships

Parameters    MMSI  State  ROT  SOG  Easting         Latitude  COG  UTC  Total bits  Compression ratio of system
Own ship      30    4      8    10   28 (longitude)  27        12   18   137         81.5%
Other ships   30    4      8    10   16              16        12   0    96          57.1%

6 Conclusion

By analyzing the characteristics of AIS information, this paper proposes compressing the AIS information through preprocessing followed by differential encoding, reducing the volume of data so that the AIS information of more ships can be sent over the BeiDou channel. The experimental results show that when the AIS ship position information is processed by differential encoding, the compression ratio of the ship position is 58.2% and the compression ratio of the system is 57.1%. At most 9 ships' AIS information could be sent before compression, while 17 ships' AIS information can be sent after compression; the information content is roughly doubled, with small errors and a low error rate.

References
[1] Zhi-lei Liu, “Automatic Identification System Information Safety and Security,” Digital
communication World, pp. 5-9, 2014.
[2] Jun-zhong Bao, etc. “Automatic Identification System Application,” Dalian: Dalian Maritime
university press, pp.17-18, 2006.
[3] Yu-ming Wang, Shang-yue Zhang. “Research & Prospect for Ship Long-rang Dynamically
Monitoring System,” Marine electric & electronic engineering, no. 2,pp. 16-20,Apr. 2015.
[4] An-cun Yuan, Shu-fang Zhang. “International standard universal of ship automatic identi-
fication system assembler,” Dalian: Dalian Maritime university press, 2005:2-3.
[5] Yu-min Wang, etc, “Information Theory and Coding Theory,” Beijing: Higher Education Press,
2010.
[6] Yue-nan Wu, “Date Compression,” Beijing: Publishing House of Electronics Industry, Aug. 2012.
[7] Ren-yu Zhao, “Navigation Study,” Beijing: China Communication Press, Aug. 2008.
You LI*, Xing-shu WANG, Hao XIONG
Modeling of Ship Deformation Measurement based
on Single-axis Rotation INS
Abstract: In order to simplify the structure of a ship deformation measurement system, a novel ship deformation measurement method based on a single-axis rotation inertial navigation system is proposed. The system needs only one additional laser gyro unit, installed at the point to be measured. A Kalman filter formulation is applied to construct the deformation measurement model: the measurement vector is the difference between the angular rate outputs of the single-axis rotation inertial navigation system and the laser gyro unit, and the state vector includes the static and dynamic deformation of the ship, the gyro biases and random walk. Simulation results show that the deformation measurement error can be within 5 arc seconds, and the measurement accuracy is almost the same as that of a system based on two laser gyro units.

Keywords: ship deformation measurement; single axis rotation; Inertial navigation


system; Kalman filter

1 Introduction

A large ship bends or twists under the force of sea waves, helm operation or other external forces, as in [1]. Usually the bending angle is very small and has no significant impact on the ship. But for warships equipped with weapons and high-precision measuring instruments, which need a precise attitude reference, the deformation seriously reduces the precision strike capability of the weapon systems and the measurement accuracy of the instruments on the ship, because the attitude information is usually provided by the inertial navigation system (INS) installed at the ship's center, and there is usually a certain distance between the INS and the weapons and instruments. In order to provide accurate attitude information for the measurement instruments and weapon systems, we need to measure the deformation of the ship between the INS and the weapon systems and other measuring instruments.

*Corresponding author:You LI, Dept. Electromagnetic Spectrum Management, Academy of National


Defense Information, Wuhan, China, E-mail: youli_2008@126.com
Xing-shu WANG, Hao XIONG, Dept. Optoelectronic Instrument & Measurement, National University
of Defense Technology, Changsha, China

1.1 Conventional Deformation Measurement Based on Two Sets of LGUs

The conventional ship deformation measurement system based on two sets of three mutually orthogonal laser gyro units (LGUs) is shown in Figure 1, as in [2,3]. In this system, LGU1 is installed near the INS on the ship, and LGU2 is installed near the weapon systems or other instruments that need high-precision attitude information.


Figure 1. The principle of ship deformation measurement system based on two sets of LGU

Owing to the deformation of the ship and the initial installation misalignment of the two LGUs, the angular velocities of the ship measured by the two LGUs differ. Assume that the angular misalignment between LGU1 and LGU2 has a static component Φ and a dynamic component θ, as in [3] and [4]; the total misalignment angle φ = Φ + θ can be written in vector form as

\varphi = [\varphi_x, \varphi_y, \varphi_z]^T        (1)

Let the angular velocity output of LGU1 be Ω and that of LGU2 be Ω′; the measurement equation of the deformation measurement is then

\Omega' = C\,\Omega + \dot{\varphi}        (2)

where C is the direction cosine matrix from the coordinate frame o₁x_{b1}y_{b1}z_{b1} to the coordinate frame o₂x_{b2}y_{b2}z_{b2}.
As shown in Figure 1, this deformation measurement system obviously needs two additional sets of LGUs. Since the inertial navigation system on the ship can also provide the angular velocity of the ship, we can omit the LGU installed near the INS, which significantly reduces the cost of the ship deformation measurement system. The problem is that the INS on the ship is usually rotation modulated in order to improve its navigation accuracy; in this case the output of the INS includes not only the angular velocity of the ship motion but also the rotation modulation velocity of the INS, so the output of the single-axis rotation INS cannot be used directly.


Figure 2. Configuration diagram of the ship deformation measurement system based on single-axis rotation INS.

1.2 Deformation Measurement Based on Single-axis Rotation INS

The configuration diagram of the ship deformation measurement system based on the output of the rotation INS and the output of the LGU is shown in Figure 2. Compared with the system shown in Figure 1, LGU1 is omitted, and the output of the inertial navigation system, which is single-axis rotation modulated, is used in its place. The single-axis rotation INS consists of an inertial measurement unit (IMU), which includes three orthogonal laser gyros and three orthogonal accelerometers, and a high-precision single-axis turntable. The IMU is mounted on the single-axis turntable and rotates with the turntable according to a certain rule around the Z-axis of the IMU frame.
For illustration, we define the related coordinate frames as follows:
The IMU frame (b-frame) is defined by the measuring axes of the three orthogonal gyros. Its origin is at the geometric center of the gyro triad, and its three axes are aligned with the sensitive axes of the three orthogonal gyros. The b-frame rotates with the IMU when the turntable rotates around its rotation axis.
The ship frame (p-frame) is an orthogonal axis set aligned with the roll, pitch and yaw axes of the ship. At the initial moment, the p-frame is assumed to coincide with the b-frame.

2 Modeling of deformation measurement based on single-axis rotation INS

Compared with the method shown in Figure 1, the method in Figure 2 needs only one additional set of LGU, and the output of the INS is put to use. Because the physical structure is simplified, the deformation measurement model must also be modified accordingly.

2.1 Rotation Demodulation

Because of the rotation modulation, the angular rate output of the laser gyros in the IMU contains not only the angular motion of the ship but also the rotation of the turntable. Let the attitude of the b-frame in the inertial frame at time t_k be denoted b(k), and similarly let the attitude of the p-frame be p(k); we can construct a quaternion Q(t_k) describing the rotation from p(k) to b(k) at time t_k. In the same way, at time t_{k+1} the attitude of the b-frame in the inertial frame is b(k+1), the attitude of the p-frame is p(k+1), and we can construct a quaternion Q(t_{k+1}) describing the rotation from p(k+1) to b(k+1). Thus we can obtain the quaternion expression of the ship motion p(Δt) from time t_k to t_{k+1}:

p(\Delta t) = Q(t_k) \otimes q(\Delta t) \otimes Q^*(t_{k+1})        (3)

where Δt = t_{k+1} − t_k, and Q*(t_{k+1}) is the conjugate of Q(t_{k+1}).


Then, according to the relation between a quaternion and the direction cosine matrix, we can obtain

p(\Delta t) = \begin{bmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{bmatrix}
            = \begin{bmatrix} \cos(\mu/2) \\ (\mu_x/\mu)\sin(\mu/2) \\ (\mu_y/\mu)\sin(\mu/2) \\ (\mu_z/\mu)\sin(\mu/2) \end{bmatrix}        (4)

where \mu = \sqrt{\mu_x^2 + \mu_y^2 + \mu_z^2}, and \mu_x, \mu_y, \mu_z are the outputs of the three orthogonal gyros from time t_k to t_{k+1}; in vector form they can be written as

\tilde{\Omega}(t_{k+1} - t_k) = [\mu_x, \mu_y, \mu_z]^T        (5)

Thus we obtain a gyro output equivalent to that of the strapdown LGU shown in Figure 1. We can also obtain the direction cosine matrix C_{b1}^p from the b-frame to the p-frame as

C_{b1}^p = \begin{bmatrix}
\cos\mu_y\cos\mu_z - \sin\mu_x\sin\mu_y\sin\mu_z & \cos\mu_y\sin\mu_z + \sin\mu_x\sin\mu_y\cos\mu_z & -\cos\mu_x\sin\mu_y \\
-\cos\mu_x\sin\mu_z & \cos\mu_x\cos\mu_z & \sin\mu_x \\
\sin\mu_y\cos\mu_z + \sin\mu_x\cos\mu_y\sin\mu_z & \sin\mu_y\sin\mu_z - \sin\mu_x\cos\mu_y\cos\mu_z & \cos\mu_x\cos\mu_y
\end{bmatrix}        (6)
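The rotation demodulation of Eq. (3) can be prototyped with a plain quaternion product, as in the sketch below. The scalar-first quaternion convention and the function names are assumptions made for illustration; only the relation p(Δt) = Q(t_k) ⊗ q(Δt) ⊗ Q*(t_{k+1}) is taken from the text.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of two scalar-first quaternions."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def quat_conj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def ship_motion_quaternion(Q_k, q_dt, Q_k1):
    """Eq. (3): remove the turntable rotation from the gyro increment q_dt.

    Q_k, Q_k1 : quaternions describing the rotation from p-frame to b-frame
                at t_k and t_{k+1}
    q_dt      : quaternion built from the raw gyro increments over [t_k, t_{k+1}]
    """
    return quat_mul(quat_mul(Q_k, q_dt), quat_conj(Q_k1))

# Example: if the turntable did not move (Q_k = Q_k1 = identity),
# the demodulated ship motion equals the raw gyro increment.
identity = np.array([1.0, 0.0, 0.0, 0.0])
q_dt = np.array([np.cos(0.01), np.sin(0.01), 0.0, 0.0])   # small roll increment
print(ship_motion_quaternion(identity, q_dt, identity))
```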

2.2 Modeling of Gyro Output

Because of manufacturing errors and other error sources, the output of a laser gyro usually contains a bias error and a random walk error. Considering the gyro bias, when the input angular rate of the IMU from t_k to t_{k+1} is \Omega = [\Omega_x, \Omega_y, \Omega_z]^T and the bias error is \varepsilon = [\varepsilon_x, \varepsilon_y, \varepsilon_z]^T, the actual output of the three gyros from time t_k to t_{k+1} can be expressed in vector form as

\tilde{\Omega} = \begin{bmatrix}\Omega_x\\ \Omega_y\\ \Omega_z\end{bmatrix} + \begin{bmatrix}\varepsilon_x\\ \varepsilon_y\\ \varepsilon_z\end{bmatrix}        (7)

Due to the rotation modulation, if the turntable has rotated by an angle α around the OZ axis, the demodulated output Ω₁ of the IMU in the absence of gyro bias would be

\Omega_1 = C_{b1}^p\,\Omega = \begin{bmatrix}\cos\alpha & \sin\alpha & 0\\ -\sin\alpha & \cos\alpha & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}\Omega_x\\ \Omega_y\\ \Omega_z\end{bmatrix}        (8)

Considering the gyro bias, the measured gyro output \tilde{\Omega}_1 of the IMU after demodulation would be

\tilde{\Omega}_1 = C_{b1}^p\,\tilde{\Omega} = \begin{bmatrix}\Omega_x + \varepsilon_x\cos\alpha - \varepsilon_y\sin\alpha\\ \Omega_y + \varepsilon_x\sin\alpha + \varepsilon_y\cos\alpha\\ \Omega_z + \varepsilon_z\end{bmatrix} = \Omega + C_{b1}^p\,\varepsilon        (9)

This means that the gyro bias is modulated when the turntable rotates, so the gyro bias cannot be treated as a constant in the ship deformation measurement equation.

2.3 Kalman Filter Formulation

In order to measure the deformation of the ship, a Kalman filter is used, as in [5]. Assume that at some moment the angular velocity of the ship measured by the rotation INS is \tilde{\Omega}_1 and the angular velocity measured by LGU2 is \tilde{\Omega}_2. Substituting them into Eq. (2), we get

\tilde{\Omega}_2 - \varepsilon_2 = C_{b1}^p(\tilde{\Omega}_1 - \varepsilon_1) + \dot{\varphi}        (10)

Applying the small-angle approximation of the misalignment to Eq. (10), we obtain

\tilde{\Omega}_1 - \tilde{\Omega}_2 = (\varphi\times)(\tilde{\Omega}_1 - C_{b1}^p\varepsilon_1) - \dot{\varphi} + C_{b1}^p\varepsilon_1 - \varepsilon_2
                                    = -(\tilde{\Omega}_1\times)\varphi - \dot{\varphi} + C_{b1}^p\varepsilon_1 - \varepsilon_2        (11)

Equation (11) is the measurement equation of the Kalman filter, where C_{b1}^p is the direction cosine matrix from the b-frame to the p-frame and can be obtained from the rotation angle of the turntable.

Comparing with the standard form of the Kalman filter measurement equation, z = Hx + v, the state vector is

x = [\varphi_x, \varphi_y, \varphi_z, \dot{\varphi}_x, \dot{\varphi}_y, \dot{\varphi}_z, \varepsilon_{1x}, \varepsilon_{1y}, \varepsilon_{1z}, \varepsilon_{2x}, \varepsilon_{2y}, \varepsilon_{2z}]^T        (12)

In this case, the measurement matrix is

H = [\, -(\tilde{\Omega}_1\times) \;\; -I_3 \;\; C_{b1}^p \;\; -I_3 \,]        (13)

where (\tilde{\Omega}_1\times) denotes the skew-symmetric (cross-product) matrix of \tilde{\Omega}_1 and I_3 is the 3×3 identity matrix.
In addition, as in [6] and [9], the ship dynamic deformation is usually modeled as a second-order Markov process:

\ddot{\theta}_i + 2\mu_i\dot{\theta}_i + b_i^2\theta_i = 2 b_i \sqrt{D_i \mu_i}\, w(t), \qquad i = x, y, z        (14)

where \theta = [\theta_x, \theta_y, \theta_z]^T is the ship dynamic deformation, D_i is the variance of \theta_i, \mu_i is the irregularity coefficient of the dynamic deformation, and w(t) is white noise.
Combining the above equations, we can obtain the state equation of the Kalman filter, as in [5]:

\dot{x}(t) = A x(t) + B w(t)        (15)
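As an illustration of how the measurement model of Eqs. (11)-(13) is assembled numerically, the sketch below builds H from the demodulated INS angular rate and the turntable attitude; the 12-state ordering follows Eq. (12). This is a rough sketch under those assumptions, not the authors' implementation, and it omits the state-propagation matrices A and B of Eq. (15).

```python
import numpy as np

def skew(v):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z,   y],
                     [z,   0.0, -x],
                     [-y,  x,   0.0]])

def measurement_matrix(omega1, C_b1_p):
    """H of Eq. (13) for the state [phi, phi_dot, eps1, eps2] of Eq. (12).

    omega1 : demodulated angular rate from the single-axis rotation INS (3-vector)
    C_b1_p : direction cosine matrix from b-frame to p-frame (3x3), obtained from
             the turntable angle as described in section 2.1
    """
    I3 = np.eye(3)
    return np.hstack([-skew(omega1), -I3, C_b1_p, -I3])   # 3 x 12

def residual(omega1, omega2, x, C_b1_p):
    """Measurement residual z - Hx with z = omega1 - omega2 (Eq. (11))."""
    z = np.asarray(omega1) - np.asarray(omega2)
    return z - measurement_matrix(omega1, C_b1_p) @ x

# Example with a zero state estimate: the residual is just the rate difference.
omega1 = np.array([0.01, -0.02, 0.005])
omega2 = np.array([0.011, -0.019, 0.005])
print(residual(omega1, omega2, np.zeros(12), np.eye(3)))
```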

3 Simulation of deformation measurement based on single-axis rotation INS

In order to confirm the feasibility of this deformation measurement method, an experiment is designed as follows: a large platform A is fixed on the ground and is programmed to swing to imitate the ship attitude at sea [7]; a small platform B is fixed on the large platform A and is programmed to swing to imitate the deformation of the ship. A single-axis rotation INS is mounted on platform A [8], and a set of LGU is mounted on platform B. The installation of the experiment is shown in Figure 3.
Before conducting the experiment, we first sample real ship attitude data and deformation data. Figure 4 shows the ship attitude data and Figure 5 the ship deformation data sampled at sea. The attitude is expressed independently in Euler angle form about the ox, oy and oz directions, i.e. as roll, pitch and yaw angles. The ship deformation data shown in Figure 5 were sampled on a real ship by an optical deformation measurement instrument, which is regarded as the most accurate apparatus for deformation measurement. In this simulation experiment, the previously sampled data are used to program platform A and platform B respectively.


Figure 3. Equipment installation of experiment.


Figure 4. Ship attitude information in 100 seconds.


Figure 5. Ship deformation information in 100 seconds.



The simulation experiments were conducted in two steps. In step 1, the single-axis rotation INS was not rotation modulated; platform A and platform B were programmed to imitate the ship attitude and ship deformation. In step 2, the single-axis rotation INS was rotation modulated normally, and platform A and platform B were again programmed to imitate the ship attitude and ship deformation. Using the output of the single-axis rotation INS and the LGU, we apply the algorithm discussed in section 2 to estimate the deformation of the ship, and then compare the measurement results of step 1 and step 2.
The static deformation estimation in step 1 is shown in Figure 6, and the static
deformation estimation in step 2 is shown in Figure 7.

Figure 6. Deformation estimation in step 1.


Figure 7. Deformation estimation in step 2.



The experiments of step 1 and step 2 were each repeated three times; the measurement results are shown in Table 1. Comparing the results of step 1 and step 2, the measurement error is almost always within 10 arc seconds, as shown in Table 1. This means that whether or not the single-axis rotation INS is rotation modulated, the deformation measurement result is almost the same.

Table 1. Deformation estimation of step 1 and step 2

     Measurement result of step 1 (″)   Measurement result of step 2 (″)   Error between step 1 and step 2 (″)
     X       Y      Z                   X       Y      Z                   X      Y      Z
1    -681.9  210.8  -621.9              -679.4  206.4  -625.2              2.5    -4.4   -3.3
2    -466.5  217.3  -864.5              -460.8  218.9  -871.7              5.7    1.6    -7.2
3    -451.3  201.9  -914.4              -453.3  190.2  -919.1              -2.0   -11.7  -4.7

4 Conclusion

In this article we discussed a model of ship deformation measurement based on a single-axis rotation INS and one set of strapdown laser gyro units. A simulation experiment was conducted to confirm the feasibility of the measurement method, and real ship attitude and deformation data were used to improve the authenticity of the experiment. In the measurement process, the gyro output of the single-axis rotation INS is demodulated, and a Kalman filter is then constructed to estimate the deformation of the ship.
According to the simulation results, the measurement accuracy of the method based on a single-axis rotation INS is approximately the same as that of the method based on two sets of LGUs. This provides a new and feasible scheme for the design of ship deformation measurement systems.

References
[1] Wei Gao, Wei Shan, Bo Xu, et al. Nonlinear deformation measurement method based on IMU
for ship[C]. Mechatronics and Automation (ICMA), 2015 IEEE International Conference on
Mechatronics and Automation , 2015: 1397-1401.
[2] S. Majeed, Jiancheng Fang. Comparison of INS based angular rate matching methods for
measuring dynamic deformation[C]. Electronic Measurement & Instruments, 2009. ICEMI ‘09.
9th International Conference on, 2009: 1-332-1-336.
[3] Mochalov A V, Kazantsev A V. Use of ring laser units for measurement of moving object
deformations[C]//Second International Conference on Lasers for Measurement and Information
Transfer. International Society for Optics and Photonics, 2002: 85-92.

[4] Mochalov A V. A system for measuring deformations of large-sized objects[J]. AGARDOGRAPH


AGARD AG, 1999: 15-15.
[5] Schneider A M. Kalman filter formulations for transfer alignment of strapdown inertial units[J].
Navigation, 1983, 30(1): 72-89.
[6] Zheng J X, Qin S Q, Wang X S, et al. Attitude matching method for ship deformation
measurement[J]. J. Chinese Inertial Technology, 2010, 18(2): 175-180.
[7] Liu Aili, Dai Hongde. Deformation Estimation of Warship Based on the Match of Inertial Sensors’
Output[J]. Chinese Journal of Sensors and Actuators, 2011, 24(001): 145-148.
[8] Titterton D. Strapdown inertial navigation technology[M]. IET, 2004.
[9] Wu W, Chen S, Qin S. Online estimation of ship dynamic flexure model parameters for transfer
alignment[J]. Control Systems Technology, IEEE Transactions on, 2013, vol.21, no.5, 1666-1678.
Mei-ling WANG*, Hua SONG, Chun-ling WEI
Research on Fault Diagnosis of Satellite Attitude
Control System based on the Dedicated Observers
Abstract: In this paper, the disturbance-decoupling design principle for nonlinear unknown input observers is introduced. A bank of unknown input observers is then designed for the satellite attitude control system, each observer being decoupled from the flywheel fault of a particular axis while remaining sensitive to the flywheel faults of the other axes. The observers generate a structured residual set, and flywheel fault isolation is realized by fault separation logic. Finally, the feasibility of the method is verified by simulation analysis.

Keywords: Satellite attitude control system; unknown input observer; fault diagnosis

1 Introduction

The satellite attitude control system is one of the core systems for the normal operation of a satellite, and its reliability and stability play a very important role in that operation. Because of the increasingly high requirements on satellite performance, the complex flight environment, and the limited resources on board, the complex satellite attitude control system inevitably exhibits different types of faults, which lead to performance degradation and system failure and result in serious losses. The study of fault diagnosis for the satellite attitude control system is therefore very important [1,2].
In recent years, fault diagnosis methods based on state estimation have been widely used and have achieved remarkable results [3]. However, in current research the fault diagnosis of linear systems has been studied in depth, while that of nonlinear systems is not yet mature owing to their complexity; for modern systems, which are often accompanied by disturbances and uncertainties, the fault diagnosis of nonlinear systems becomes even more difficult [4]. Reference [5] used a sliding mode observer, but this method suffers from a certain time lag; reference [6] used a robust fault diagnosis method for satellite actuators based on a discrete proportional-integral observer. This paper studies the fault detection and isolation of nonlinear systems based on state estimation; the dedicated-observer idea

*Corresponding author: Mei-ling WANG, school of automation science and electrical engineering,
Beihang University, Beijing, China, E-mail: 781854605@qq.com
Hua SONG, school of automation science and electrical engineering, Beihang University, Collabora-
tive Innovation Center of Advanced Aero-Engine, Beijing, China
Chun-ling WEI, National Key Laboratory of Space Intelligent Control Technology, Beijing, China

is used to design a set of unknown input observers, such that each observer is insensitive to one specific fault, generating a set of structured residuals. In this way, flywheel fault detection and isolation for the satellite attitude control system is realized.

2 Fault diagnosis method based on nonlinear unknown input observers
Consider the following nonlinear system:

\dot{x} = A(x,u) + K(x,u) f + E(x) d
y = c(x)        (1)

where x(t) ∈ Rⁿ is the state vector of the nonlinear system; A(x,u) and K(x,u) are nonlinear functions of the system; u(t) ∈ Rᵐ is the input vector; f is the fault vector; d ∈ Rᵖ is the unknown disturbance vector; y(t) is the output of the system; and E(x) is the disturbance distribution matrix.
Rewrite the model (1) using the following state transformation:

\Gamma = T(x)        (2)

Then the transformed model is described by

\dot{\Gamma} = \dot{T}(x) = \frac{\partial T(x)}{\partial x}\left( A(x,u) + K(x,u) f + E(x) d \right)        (3)
The transformation must be selected so that it remains unaffected by the unknown inputs but still reflects actuator faults. It is easy to see from Eq. (3) that the transformation needs to satisfy

\frac{\partial T(x)}{\partial x} E(x) = 0        (4)

In this way, the transformation can be obtained from Eq. (4), and the transformed system is

\dot{\Gamma} = \dot{T}(x) = \frac{\partial T(x)}{\partial x}\left( A(x,u) + K(x,u) f \right)        (5)
This model is called disturbance decoupled because it is unaffected by the disturbance d.
Assume rank(E(x)) = p. Then, according to the Frobenius theorem, if the disturbance distribution matrix E(x) satisfies the following necessary and sufficient condition:

\mathrm{rank}\left( \left[ E(x),\ [e_i(x), e_j(x)] \right] \right) = p, \qquad i, j = 1, 2, \cdots, n-p        (6)

where e_i(x) is the i-th column of E(x), and [e_i(x), e_j(x)] satisfies

[e_i(x), e_j(x)] = \frac{\partial e_j(x)}{\partial x} e_i(x) - \frac{\partial e_i(x)}{\partial x} e_j(x)        (7)

then Eq. (4) has n − p independent solutions [7]:

\Gamma_i = T_i(x), \qquad i = 1, 2, \cdots, n-p        (8)

and the complete transformation is

\Gamma = T(x) = \begin{bmatrix} T_1(x) \\ \vdots \\ T_{n-p}(x) \end{bmatrix}        (9)

It can be seen from Eq. (9) that the state order of the transformed system is n − p < n, and the output information of the original system is needed when using the reduced-order state Γ to restore the original state of the system:

y^* = c^*(y)        (10)

where c^*(y) is a specific transformation of y = c(x) and satisfies

\mathrm{rank}\begin{bmatrix} \dfrac{\partial T(x)}{\partial x} \\[4pt] \dfrac{\partial c^*(y)}{\partial x}\Big|_{y=c(x)} \end{bmatrix} = n        (11)

According to Γ and y^*, the inverse function Ψ₀ satisfies

x = \Psi_0(\Gamma, y^*)        (12)

Even when there is no fault, the system inevitably has a state estimation error; in order to ensure that the designed observer converges asymptotically, a feedback quantity is introduced:

R(T(x), c(x)) = 0        (13)

If such an Eq. (13) exists, the following condition must be satisfied:

\mathrm{rank}\begin{bmatrix} \dfrac{\partial T(x)}{\partial x} \\[4pt] \dfrac{\partial c(x)}{\partial x} \end{bmatrix} < n - p + m        (14)

where m is the number of independent outputs of the system. From Eq. (11):

\mathrm{rank}\left[ \frac{\partial T(x)}{\partial x} \right] + \mathrm{rank}\left[ \frac{\partial c^*(y)}{\partial x}\Big|_{y=c(x)} \right] = n        (15)

And from Eq. (15):

\mathrm{rank}\begin{bmatrix} \dfrac{\partial T(x)}{\partial x} \\[4pt] \dfrac{\partial c(x)}{\partial x} \end{bmatrix} = n        (16)

From Eq. (14) and Eq. (16) it follows that p < m, that is, the number of independent measurement signals required is greater than the number of independent unknown disturbance inputs.
Besides ensuring that the fault detection observer is stable, Eq. (13) can also be used as a residual to determine whether the system is faulty, because once a fault occurs Eq. (13) no longer holds.
So the reduced-order observer can be designed as [9]

\dot{\hat{\Gamma}} = \frac{\partial T(\hat{x})}{\partial x} A(\hat{x},u) + H(\hat{x},u) R(\hat{\Gamma}, y), \qquad \hat{x} = \Psi_0(\hat{\Gamma}, y^*)        (17)

The state estimation error e and the residual r of the observer are defined as

e = \hat{\Gamma} - \Gamma        (18)

r = R(\hat{\Gamma}, y)        (19)

The residual vector r can be used for fault detection. The Taylor series expansion of the differential equation governing the dynamics of the estimation error e is

\dot{e} = \rho(e,t) - \frac{\partial T(x)}{\partial x} K(x,u) f = F(t) e + o(e^2) - \frac{\partial T(x)}{\partial x} K(x,u) f        (20)

where

F(t) = \frac{\partial}{\partial \Psi_0}\left( \frac{\partial T(x)}{\partial x} A(x,u) \right) \frac{\partial \Psi_0(\Gamma, y^*)}{\partial \Gamma} + H(\Psi_0(\Gamma, y^*), u) \frac{\partial R(\Gamma, y)}{\partial \Gamma}

and o(e²) denotes the second- and higher-order terms in e.
In order to make the state estimation error uniformly bounded, an appropriate feedback matrix H(x̂,u) should be selected so that \dot{e} = F(t)e is asymptotically stable. Generally, H(x̂,u) can be obtained by pole placement or by simulation analysis [8]. In addition, in order to ensure that all faults affecting the original system can be reflected

in the estimation error of the observer, according to Eq. (20) the transformation matrix T(x) should also satisfy

\mathrm{rank}\left( \frac{\partial T(x)}{\partial x} K(x,u) \right) = \mathrm{rank}\left( K(x,u) \right)        (21)

Based on the above analysis, the structure of the robust fault detection observer is shown in Figure 1.


Figure 1. Sketch map of robust fault detection

In the figure, u is the input vector of the system, d is the unknown disturbance vector, f is the fault vector, y is the output of the system, c*(y) is a specific transformation of y = c(x) with y* = c*(y), R(Γ̂(x), y) is the feedback quantity, γ is the residual, and H(x̂,u) is the feedback matrix.

3 Dedicated Observers Design For Flywheel Fault Isolation

In this section, the actuator fault diagnosis of the satellite attitude control system is analyzed on the basis of unknown input observers; a bank of unknown input observers is then designed, each decoupled from the flywheel fault of one of the three axes, so as to achieve fault isolation.
Firstly, the dynamic equation of the satellite attitude control system is [10]:

\begin{bmatrix} \dot{w}_x \\ \dot{w}_y \\ \dot{w}_z \end{bmatrix} =
\begin{bmatrix} \dfrac{I_y - I_z}{I_x} w_y w_z \\ \dfrac{I_z - I_x}{I_y} w_x w_z \\ \dfrac{I_x - I_y}{I_z} w_x w_y \end{bmatrix} +
\begin{bmatrix} \dfrac{1}{I_x} & 0 & 0 \\ 0 & \dfrac{1}{I_y} & 0 \\ 0 & 0 & \dfrac{1}{I_z} \end{bmatrix}
\begin{bmatrix} M_x + T_{dx} \\ M_y + T_{dy} \\ M_z + T_{dz} \end{bmatrix}        (22)

where [w_x, w_y, w_z]^T is the three-axis angular velocity of the satellite, [M_x, M_y, M_z]^T is the actuator input control torque, [T_{dx}, T_{dy}, T_{dz}]^T is the space environmental disturbance torque, and [I_x, I_y, I_z]^T collects the satellite moments of inertia.

When an actuator fails, the state-space form of the satellite dynamics can be written as Eq. (23):

\dot{x}(t) = \Phi(x,t) + B u_f(t) + B d
y = C x(t)        (23)

where u_f(t) = u(t) + f(t) is the flywheel output when a fault occurs, f(t) = [f_1(t), f_2(t), f_3(t)]^T is the fault vector, and f_i(t), i = 1, 2, 3, corresponds to the x, y, z flywheel fault respectively. Substituting u(t) + f(t) for u_f(t) in Eq. (23), Eq. (23) can be rewritten as

\dot{x}(t) = \Phi(x,t) + B u(t) + B f(t) + B d
y = C x(t)        (24)

where

B = \begin{bmatrix} 1/I_x & 0 & 0 \\ 0 & 1/I_y & 0 \\ 0 & 0 & 1/I_z \end{bmatrix}

is the flywheel fault distribution matrix, which can also be written as B = [B_1, B_2, B_3].
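For reference, a small numerical sketch of the faulty plant of Eqs. (22)-(24) is given below, using simple Euler integration. The inertia and disturbance values are the ones quoted in section 4, while the step size, fault time and function names are illustrative assumptions, not the authors' simulation code.

```python
import numpy as np

# Satellite principal moments of inertia (from section 4).
Ix, Iy, Iz = 1849.3765, 1435.234, 2278.8824

def attitude_dynamics(w, M, Td):
    """Eq. (22): three-axis angular acceleration of the satellite."""
    wx, wy, wz = w
    return np.array([
        (Iy - Iz) / Ix * wy * wz + (M[0] + Td[0]) / Ix,
        (Iz - Ix) / Iy * wx * wz + (M[1] + Td[1]) / Iy,
        (Ix - Iy) / Iz * wx * wy + (M[2] + Td[2]) / Iz,
    ])

def simulate(u, fault, t_end=1200.0, dt=0.1, fault_time=600.0):
    """Euler integration of Eqs. (23)-(24) with a constant-bias flywheel fault."""
    w = np.zeros(3)
    omega0, A = 0.02, np.array([1.4e-5, 1.5e-5, 1.6e-5])   # disturbance parameters
    history = []
    for k in range(int(t_end / dt)):
        t = k * dt
        Td = A * np.sin(omega0 * t)                 # space environmental disturbance
        M = u + (fault if t > fault_time else 0.0)  # u_f(t) = u(t) + f(t)
        w = w + attitude_dynamics(w, M, Td) * dt
        history.append(w.copy())
    return np.array(history)

# Example: the x-axis flywheel develops a -6e-5 N*m bias at t = 600 s.
traj = simulate(u=np.zeros(3), fault=np.array([-6e-5, 0.0, 0.0]))
print(traj[-1])   # angular velocity at the end of the run
```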
The design principle of the flywheel unknown input observers is as follows. For the i axis flywheel, establish unknown input observer i, treat the i axis flywheel fault as the unknown input, and make the observer robust to the i axis flywheel fault while remaining sensitive to the flywheel faults of the other axes; the observer is thus decoupled from the i axis flywheel fault. To achieve flywheel fault isolation, three such observers are therefore needed, generating three residuals. The fault separation judgment logic is:

\gamma_i \le \varepsilon_i \ \text{and}\ \gamma_j \ge \varepsilon_j,\ \forall j \ne i \ \Rightarrow \ \text{the } i \text{ axis flywheel fails}        (25)

where ε_i is the fault detection threshold for the i axis flywheel.
As an example, assume that the z axis flywheel is faulty, and design an unknown input observer decoupled from the z axis flywheel fault. When the z axis flywheel is faulty, the attitude dynamics model is

\dot{x}(t) = \Phi(x,t) + B u(t) + B_3 f(t) + B d
y = C x(t)        (26)

In order to make the transformation decoupled from the fault, the following equation must be satisfied:

\frac{\partial T(x)}{\partial x} B_3 = 0        (27)

Because rank(B_3) = 1, the number of independent transformations is n − 1 = 3 − 1 = 2, and it is clear that any T(x) consisting of independent functions Γ_i = T_i(x_1, x_2), i = 1, 2, satisfies Eq. (27). Choosing Γ_1 = T_1(x) = x_1 and Γ_2 = T_2(x) = x_2, then

x 
( x)  1 
Γ T=
=
 x2  (28)
In addition, it is also required to use a measurement signal to restore the state x ,

x =Ψ 0 (Γ1 , Γ 2 , y* ) (29)

Choose= y* c= *
( y ) [ 0 0 1] y , then the observer decoupled with z axis flywheel
fault can be designed as:

$$\dot{\hat{\Gamma}} = \frac{\partial T(\Psi_0(\hat{\Gamma}, y^*))}{\partial x}\left(\Phi(\Psi_0(\hat{\Gamma}, y^*)) + B u\right) + H(\Psi_0(\hat{\Gamma}, y^*))\,\gamma \qquad (30)$$

where Ψ0(Γ̂, y*) = (Γ1  Γ2  x3)^T, and the residual vector is

$$\gamma = y - \eta(\hat{\Gamma}, y^*) \qquad (31)$$

where η(Γ̂, y*) = (x̂1  x̂2).

In simulation, an appropriate feedback matrix H must be selected to guarantee the convergence of the state estimation error; in this paper H = [1 1]^T is chosen.

Similarly, for the x- and y-axis flywheel faults, two unknown input observers are designed that are decoupled from the x- and y-axis flywheel faults respectively. Therefore, the residual set composed of the three unknown input observers can be obtained, and according to Eq. (25) the faulty flywheel can be determined.

4 Simulation and analysis

In this paper, a constant bias in the flywheel output is assumed, and unknown input observers are designed to isolate the flywheel fault. In the simulation, the satellite inertia moments are set as Ix = 1849.3765, Iy = 1435.234, Iz = 2278.8824; the space disturbance torques are set as Tdx = Ax·sin(ω0·t), Tdy = Ay·sin(ω0·t), with Ax = 1.4×10^(-5), Ay = 1.5×10^(-5), Az = 1.6×10^(-5), and ω0 = 0.02 rad/s.

When there is no failure in the three axis flywheels, the residuals of the three observers are as follows:

Figure 2. Residual of x axis unknown input observer

Figure 3. Residual of y axis unknown input observer



Figure 4. Residual of z axis unknown input observer

Figures 2-4 show that when there is no fault in the flywheels, the three observers are able to track the system state very well; the residual errors caused by the unknown input disturbance are less than 0.2×10^(-16), so the fault detection thresholds can be set as ε1 = ε2 = ε3 = 0.3×10^(-16), and the judgment logic for flywheel fault isolation is:

$$\left.\begin{aligned} \gamma_i &\le 0.3\times10^{-16} \\ \gamma_j &\ge 0.3\times10^{-16},\ \forall j \ne i \end{aligned}\right\} \;\Rightarrow\; i\text{-axis flywheel fails} \qquad (32)$$

When the x-axis flywheel fails at 600 s, its output is

$$u_{out} = \begin{cases} u_{in}, & t \le 600\ \mathrm{s} \\ u_{in} + \Delta, & t > 600\ \mathrm{s} \end{cases}$$

with Δ = -6×10^(-5) N·m; the three observer residuals are as follows:

Figure 5. Residual of x axis unknown input observer

Figure 6. Residual of y axis unknown input observer



Figure 7. Residual of z axis unknown input observer

Figures 5-7 show that when the x-axis flywheel fails, the residual of the x-axis unknown input observer remains unchanged, while the other two observer residuals rise rapidly and exceed the threshold after the failure (the bold lines in the figures); according to the fault isolation logic it is easy to judge that the x-axis flywheel has failed.

5 Conclusion

This paper has introduced the principle of the unknown input observer and applied it to the satellite attitude control system, establishing a bank of unknown input observers and ensuring that each observer is decoupled from a particular axis flywheel failure while remaining sensitive to the failures of the other axes. The structured residual set was then used to isolate the flywheel fault. Finally, simulation analysis was conducted for the fault-free case and for a constant deviation of the flywheel, and the feasibility of the method was verified. The method can be applied to fault detection and isolation of flywheels in satellite attitude control systems.

Acknowledgment: This study was supported by the National Natural Science Foundation of China (No. 61573059).

References
[1] Yin S, Xiao B, Ding S X, et al. A review on recent development of spacecraft attitude fault
tolerant control system[J]. IEEE Transactions on Industrial Electronics, 2016, 63(5): 3311-3320.
[2] Wang R, Cheng Y, Xu M. Analytical redundancy based fault diagnosis scheme for satellite
attitude control systems[J]. Journal of the Franklin Institute, 2015, 352(5): 1906-1931.
[3] Cheng Y, Wang R, Xu M. A Combined Model-Based and Intelligent Method for Small Fault
Detection and Isolation of Actuators[J]. IEEE Transactions on Industrial Electronics, 2016, 63(4):
2403-2413.
[4] Cheng Yao, Wang Rixin, Xu Minqiang. Spacecraft fault diagnosis based on nonlinear unknown
input observer[J]. Journal of Deep Space Exploration, 2015, 03: 278-282.
[5] Chen Zhenpeng. Research on Fault Diagnosis and Fault-Tolerant Control of Aircraft Actuator
Using Observer[D]. Harbin Institute of Technology, 2015.
[6] Shen Y, Wang Z H, Zhang X L. Fault diagnosis and fault-tolerant control for sampled-data
attitude control systems: an indirect approach [J]. Proceedings of the Institution of Mechanical
Engineers, Part G: Journal of Aerospace Engineering, 2014, 228( 7): 1047 - 1057.
[7] Isidori A. Nonlinear control systems[M]. Springer Science & Business Media, 2013.
[8] Seliger R, Frank P M. Fault-diagnosis by disturbance decoupled nonlinear observers [C] //
Decision and Control, 1991. Proceedings of the 30th IEEE Conference on. IEEE, 1991: 2248-2253.
[9] Seliger R, Frank P M. Robust nonlinear observer-based fault detection for an overhead crane [C]
// Automatic Control World Congress. 1993, 5: 429-432.
[10] Jia Qingxian, Zhang Yingchun, Li Huayi, Li Baohua. NUIO/LMI based robust fault diagnosis for
satellite attitude control system[J]. Journal of Harbin Institute of Technology, 2011,43(3): 19-22.
Ming-hui YAN*, Yao-he LIU, Ning GUO, Hua-cheng TANG
Data Advance Based on Industrial 4.0 Manufacturing
System
Abstract: This paper takes industrial 4.0 manufacturing system as the research
object. Based on the optimal matching of the front and back ends of the Internet of
things and cloud computing, it focused on studying the data advance of back end
Web Service, so as to solve the problem of data advance in chip manufacturing and
data extraction of mobile terminals.

Keywords: industrial 4.0, data advance, Web Service, Json

1 Data extraction in the process of chip manufacturing

As is known to all, the intelligent transformation and upgrading of the manufacturing industry towards "Internet plus industrialization" is the influence of industrial 4.0 on world industry. The chip manufacturing industry not only involves automatic robots, cloud computing, the Internet of things and big data analysis technology, but also provides underlying technologies such as microelectronic products, including processors, sensors and micro-controllers. Different types of microelectronic products will be driven by new market demands at the technical level, so that semiconductor enterprises are further driven to make adjustments at the industrial level. Hence, industrial 4.0 is reshaping the data integration of the chip manufacturing industry while providing market opportunities for semiconductor manufacturing enterprises [1].
The semiconductor former procedure, namely the front-end production in the semiconductor manufacturing process, mainly involves lithography, etching, cleaning, ion implantation, chemical mechanical planarization, etc. The semiconductor latter procedure, namely the back-end production, mainly refers to the process engineering of device separation on a wafer, SMT assembly and packaging, etc.
In the whole process of chip manufacturing, the globalization of production leads to the former and latter procedures being completed in different places. But the latter procedure depends on the data of the former procedure. If the former procedure's data is not provided in time, it will bring great difficulties to manufacturers of the latter

*Corresponding author: Ming-hui YAN, School of Mechanical Engineering, Hubei University of Technology, Wuhan, China, E-mail: 493041071@qq.com
Yao-he LIU, Ning GUO, Hua-cheng TANG, School of Mechanical Engineering, Hubei University of Technology, Wuhan, China

procedure. Thus, the concept of cloud manufacturing is gradually introduced into the
semiconductor industry. Cloud manufacturing is a new concept to reduce the waste
of manufacturing resources and realize the high-degree sharing of manufacturing
resources using information technology. In cloud manufacturing, a public service
platform which shares manufacturing resources can be established in order to connect
the giant social manufacturing resources pool, provide all kinds of manufacturing
services and realize open collaboration of manufacturing resources and services and
high-degree sharing of social resources. Enterprise users no longer need to make large investments to purchase processing equipment and other resources; they can instead consult public platforms for purchasing and renting manufacturing capacity.
Under ideal conditions, chip manufacturing will realize the integration of the related
resources of the whole life cycle of products development, manufacturing, selling
and using and provide a standard, normalized, sharable manufacturing service
mode. This manufacturing mode can enable the manufacturing industry users to use
a variety of manufacturing services conveniently like using water, electricity and gas.
So, the former procedure manufacturer can input the data of the front-end production
into the cloud platform and the latter procedure manufacturer can download the
data directly from the cloud platform according to their needs without needing the
useless data of other procedures. This can not only effectively reduce information
investment’s occupation of enterprises’ fund, but also can help enterprises save the
cost dramatically.
However, the development of China's chip manufacturing industry is slow. In particular, key foreign technologies and equipment for chip manufacturing are still under blockade. As a result, the development of China's semiconductor industry cannot meet domestic industries' needs for chips, and the data saved in them can only be pre-recorded in the working environment of foreign manufacturers. At the same time, the equipment for writing data into chips is also provided by foreign countries; imported equipment is expensive, and introducing a full set of equipment faces difficulties. Yet without advanced foreign equipment, and in the absence of semiconductor front-end data, the size and electrical characteristics of the chips produced will be affected. Hence, it is difficult for the domestic semiconductor discrete device industry to surpass the current technical level, which greatly influences the development of the semiconductor industry in China. Thus, it is very necessary to establish a cloud manufacturing platform to solve the problem of data transmission between the former and latter procedures of semiconductor manufacturing, as shown in Figure 1.

Figure 1. The cloud platform of semiconductor chip manufacturing

1.1 Data Extraction Size Adaptation in Chip Manufacturing

With the rapid development of mobile Internet and smart mobile terminals (such as
smart phones), traditional network infrastructure migrates to the direction of cloud.
For cloud computing infrastructure to realize automatic distribution according to
needs, it needs to depend on data center, server, storage, virtualization, software
lightweight design and operating system to perform rapid configuration of the
terminal equipment. With configuration, there comes the choice. With the choice,
there comes the optimization. There exists a problem of data size adaptation between
mobile terminals and the back-end. Data size adaptation refers to the data adaptation
between the front end and the back end of the Internet of things. As shown in Figure 2,
fire fighters' squirt guns should be connected to fire hydrants, whereas general household water taps should not be connected to a fire hydrant, otherwise they will not fit. In the same way, in the design of the Internet of things, if the cloud cache capacity is too large (data flow or bandwidth is abundant), resources may be wasted; conversely, if the cloud cache capacity is too small, it will lead to poor data flow or data blocking [2].

Figure 2. Adaptation of water supply for civil pipeline



In the design process of mobile terminals of the Internet of things, hardware adaptation must rely on the back-end software interaction model. However, the mobile terminal design system is large and complex, and the design of hardware and software is not easy to balance. At the same time, there are big differences between mobile search and traditional Internet search, and it is very difficult to present the data of the traditional Internet directly on intelligent mobile terminals well. Thus, the transformation of data capacity becomes a problem. In addition, traditional Internet data is generally placed in relational databases. Since the huge volume of data in a relational database leads to time-consuming searching, lightweight services such as Web Service and related technologies must sometimes be introduced in the design of the Internet of things, which means that software designers need to face challenging technological updates. In the face of these problems, we must consider the interrelation between Web Service and data advance.

2 Web service and semiconductor data advance

With the rapid development of the Internet, people want application programs to communicate well with each other, and this demand is constantly expanding. Developing uniform standards and protocols to support communication between programs therefore appears extremely urgent. The W3C's uniform formulation of the various Web Services standards, and their introduction into the terminal design environment, provides a basis of broad support for Web Services.
At present, there are three available approaches to interactive data processing, namely the HTTP protocol, the SOAP protocol and Web Services of REST style. These three cases are shown in Figure 3.

Figure 3. Three ways of data interaction



HTTP Service is based on the HTTP protocol, one of the main protocols of network transmission. It obtains what is needed mainly through "post" and "get". Its advantage is higher data processing efficiency, but it also has the disadvantages of slower speed and lack of security, and the HTTP Service method cannot handle cross-domain problems.
Web Service uses a fixed XML format to encapsulate SOAP messages. It can adopt HTTP as the underlying data transmission, but it is not limited to the HTTP protocol, and there are standards for how messages are returned. Since it is based on the XML format, its biggest advantages are being cross-platform and cross-language. Considering that the HTTP Service pattern cannot handle cross-domain problems, Web Service is the choice when we need to call the service of another application. Besides, Web Service is also able to handle more complicated data types. However, Web Service also has some disadvantages, such as slow processing speed and being applicable only on clients.
Compared with HTTP and Web Service, REST has a more comprehensive application scope. REST is a lightweight Web service architecture style based on the HTTP protocol, so it does not involve introducing the complicated SOAP protocol; it can be realized through HTTP alone and is simpler than SOAP. In terms of processing speed, REST generally prevails. In addition, REST can be applied to mobile clients as well as ordinary clients, and it can store and extract data from the cloud. At this point, REST is simpler and more effective [3].
It can be seen from the above that the relationship between Web Service and semiconductor data advance, and its advantages, are mainly expressed in the following:
a) The content served by Web Service is more dynamic.
b) Bandwidth and storage are cheaper, and universal computing becomes more important.
c) The information data in data advance is more lightweight and the transferred flow is larger, which enables data to expand to clouds outside of the local area.

3 Design of data advance process

Nowadays, more and more websites have released APIs of their own, including Web services of two different styles--SOAP and REST. Although SOAP has higher authority and maturity than REST, REST's idea of abstracting things on the Internet into resources to operate on is very popular with major websites. In data advance, the main advantages of REST are as follows:
a) REST can not only use HTTP directly to realize the operation purpose, but can also use caching; hence, it has a faster response speed in operation.
b) Thanks to its quick responses, REST has higher performance, efficiency and usability than the SOAP protocol.
c) In terms of resource operation, the methods that REST adopts for resource acquisition, creation, modification, and deletion correspond exactly to the GET, POST, PUT and DELETE methods provided by the HTTP protocol.

This is a development pattern that is especially applicable to practical Internet applications. Such a design can improve the flexibility of the system while reducing the complexity of development [4].

From the above we can see that the REST architecture is especially applicable to completely stateless CRUD (Create, Read, Update, Delete) operations. JSON is a lightweight data exchange format that serves the REST style. It adopts a text format completely independent of any language while following the data conventions of the C language family, such as C, C# and Java. These characteristics make JSON an ideal data exchange language: it is convenient both for people's usual reading and writing habits and for machine parsing and generation, which improves the network transmission rate. Although JSON and XML are almost neck and neck in readability, scalability and coding difficulty, JSON has a simpler, clearer structure and is easy to operate, so a large number of front-end developers choose JSON [5]. In a word, the REST architecture is the most suitable for browser-based data interaction, and we can boldly replace XML with JSON texts.
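As a small illustration of the JSON exchange format discussed above (a Python sketch; the field names and values below are hypothetical and not taken from the paper), a front-end production record could be parsed and re-serialized like this:

import json

# Hypothetical wafer front-end record exchanged as lightweight JSON (illustrative only)
record_text = """
{
  "lotId": "LOT-0001",
  "waferId": 17,
  "process": "lithography",
  "parameters": {"exposure_mJ_per_cm2": 25.0, "cd_target_nm": 90}
}
"""

record = json.loads(record_text)                  # machine parsing of the payload
print(record["lotId"], record["parameters"]["cd_target_nm"])
print(json.dumps(record, indent=2))               # symmetric serialization for a REST PUT/POST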
In the .NET development platform Visual Studio, examples of establishing and applying a Web Service for database operation are as follows:
1. The establishment of a Web Service project
The method to establish a Web Service project is similar to that of establishing an ASP.NET website. Using C#, run the start page of Visual Studio 2010, choose 【File】-【New】-【Web Site】 and select the "ASP.NET Web Service" template from the dialog box, as shown in Figure 4.

Figure 4. The Establishment of a Web Service Project



2. The establishment of a Rest Web Service Project
A Rest Web Service is a lightweight Web service architecture style; its realization and operation are simpler than SOAP and XML [6]. REST simply describes an architecture pattern that merely takes advantage of existing mature technologies. With the advent of the Internet of things, mobile devices and people's lives are inextricably linked; massive numbers of users generate a huge amount of data, and traditional Web services become more complex. Thus, a lightweight service is needed to replace current Web services, and the Web Service of REST style becomes the optimal way to solve the current problems.
First, we need to establish the Web Service page of Rest style, as shown in Figure 5.

Figure 5. The Establishment of a Rest Web Service Project

The realization code of its main functions is as follows:

a) IRestservice.cs code
using System.ServiceModel;
using System.ServiceModel.Web;

namespace EricSunRestService
{
    [ServiceContract(Name = "RestServices")]
    public interface IRestServices
    {
        [OperationContract]
        [WebGet(UriTemplate = Routing.GetClientRoute, BodyStyle = WebMessageBodyStyle.Bare)]
        string GetClientNameById(string Id);
    }

    public static class Routing
    {
        public const string GetClientRoute = "/Client/{id}";
    }
}
b) RestService.cs code
namespace EricSunRestService
{
    // Implementation of the service contract declared in IRestservice.cs; the method
    // body is only an illustrative sketch (the original listing repeated the interface).
    public class RestServices : IRestServices
    {
        public string GetClientNameById(string Id)
        {
            return "Client-" + Id;   // placeholder lookup result
        }
    }
}
c) Program.cs code
using System;
using System.ServiceModel.Web;
using EricSunRestService;

namespace HostService
{
    class Program
    {
        static void Main(string[] args)
        {
            // Host the REST service at a local base address
            RestServices demoServices = new RestServices();
            WebServiceHost _serviceHost = new WebServiceHost(demoServices,
                new Uri("http://localhost:8000/DemoService"));
            _serviceHost.Open();
            Console.ReadKey();
            _serviceHost.Close();
        }
    }
}

It can be seen from the above code that the URI points directly to a local address, which reflects that REST abstracts things on the Internet into resources and operates on those resources through the HTTP protocol, so that designers are not constrained when designing this lightweight interface and can benefit a lot. Thus, compared with SOAP, REST appears lightweight and simple [7].

3.1 Interface Operation

Input the corresponding address into the browser: Uri("http://localhost:8000/DemoService"), and you can obtain the string entered in the above code, as shown in Figure 6.

Figure 6. Interface Operation

4 Conclusion

Under the promotion of the industrial 4.0 tide, the reform and renewal of the chip manufacturing industry is imperative [8]. Hence, one can see that semiconductor manufacturers are increasingly concerned with the storage and extraction of the former and latter procedure data of semiconductor chips. In this case, it is very necessary to study the data advance of the back-end Web Service based on cloud manufacturing, so as to solve a series of problems such as data advance in chip manufacturing and data extraction of mobile terminals. This also brings wholly new opportunities and challenges for the chip manufacturing industry.

References
[1] Navraj Chohan,  Chris Bunch,  Chandra Krintz,  Navyasri Canumalla.Cloud Platform Datastore
Support.Journal of Grid Computing, 2013, Vol.11 (1), pp.63-81
[2] Yaohe Liu. The optimum design of the Internet of things and the technology of data
adaptation[M].Beijing:Science press.2014,14-15
[3] Ge Zhou,  James Nightingale.Cloud Platform Based on Mobile Internet Service Opportunistic
Drive and Application Aware Data Mining.Journal of Electrical and Computer Engineering, 2015,
Vol.2015
[4] Guo Quan Huang,  Sui Xun Guo,  Wei Xiong.Model Design of Self-Service Intelligent Resource
Management System Based on Cloud Platform.Advanced Materials Research, 2014, Vol.2863
(846), pp.1491-1495
[5] Yong Qi Han,  Yun Zhang,  Wei Dong Guan.Research on Building the Cloud Platform Based on
Hadoop.Applied Mechanics and Materials, 2014, Vol.2987 (513), pp.2468-247
[6] H. Guo,F. Tao,L. Zhang,Y.J. Laili,D.K. Liu.  Research on measurement method of resource service
composition flexibility in service-oriented manufacturing system[J]. International Journal of
Computer Integrated Manufacturing. 2012 (2)
[7] Fei Tao,Lin Zhang,A.Y.C. Nee.  A review of the application of grid technology in manufacturing[J].
International Journal of Production Research. 2011 (13)
[8] Xiaoying Yang,Guohong Shi,Zhiwen Zhang.  Collaboration of large equipment complete service
under cloud manufacturing mode[J]. International Journal of Production Research. 2014 (2)
Hai-tao ZHAI, Wen-shen MAO, Wen-song LIU*, Ya-Di LU, Lu TANG
High Performance PLL based on Nonlinear Phase
Frequency Detector and Optimized Charge Pump
Abstract: To improve the performance of the PLL, the circuits of the PFD and CP are studied. A kind of nonlinear PFD is proposed which improves the speed of phase detection. A novel CP circuit featuring high speed and high precision is designed; the circuit consists of a current complement circuit, a fast-lock circuit, and a feed-through and offset circuit to improve the performance. The proposed CP can match the charge over a great range. Finally, with the proposed PFD and CP, the whole performance of the PLL is tested, and the results are shown.

Keywords: frequency synthesizer; phase frequency detector; charge pump; charge rudder

1 Introduction

Phase-Locked Loop (PLL) is a kind of closed-loop automatic control system that can track the phase of an input signal and output a signal of stable frequency. At the beginning, PLLs were made of discrete components; considering the cost and technological difficulty, they were used in military and precision measurement applications. Since the 1970s, the PLL has gradually been accepted as a low-cost component [1].
PLL frequency synthesis is used in transceiver circuits to provide a stable local signal. As a typical digital-analog circuit, the charge pump phase-locked loop (CPPLL) features low jitter, low power and high speed. A CPPLL can be designed flexibly by balancing the bandwidth, the damping factor, the lock range, and so on [2].
To improve the performance of the PLL, the key circuits, the PFD and the CP, are studied. A kind of nonlinear PFD is proposed which improves the speed of phase detection. A novel CP circuit featuring high speed and high precision is proposed; the circuit consists of a current complement circuit, a fast-lock circuit, and a feed-through and offset circuit to improve the performance. The proposed CP can match the charge over a great range. Finally, with the proposed PFD and CP, the whole performance of the PLL is tested, and the test results are shown.

*Corresponding author: Wen-song LIU, Equipment Research and Development Center, 28th of China
Electronics Technology Group Corporation, Nanjing, China, E-mail: xss4@163.com
Hai-tao ZHAI, Wen-shen MAO, Equipment Research and Development Center, 28th of China Electro-
nics Technology Group Corporation, Nanjing, China
Ya-Di LU, Lu TANG, School of Information Engineering, Southeast University, Nanjing, China

2 Nonlinear PFD

2.1 Theory Analysis

A new kind of nonlinear PFD is proposed. When the phase difference of the input signals is beyond [-π, π], no reset signal is produced, so the dead zone of the PD can be cleared totally.
The principle of the nonlinear PFD is that the reset signal goes through an AND gate controlled by fref when the phase of fref is ahead of that of fvco (|Δφ|≥π). The logic of UPnew therefore stays at 1 until Δφ<π, and the reset signal can then be eliminated by the reset of the AND gates [3].
With the nonlinear PFD, the locking process of the PLL is accelerated: when the phase difference of the input signals is beyond [-π, π], the PFD works in the stable output zone, and when the PLL is locked, the PFD works in the linear gain zone, which means low jitter and a suitable loop bandwidth [4].
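For illustration, a minimal Python behavioral sketch of the conventional tri-state PFD that the nonlinear PFD extends is given below (the reset gating for |Δφ| ≥ π described above is deliberately not modeled; the function and variable names are placeholders):

def tristate_pfd(ref_edges, vco_edges):
    """Event-driven model of the conventional PFD: UP is set on a rising edge of fref,
    DN on a rising edge of fvco, and both are cleared as soon as both are high
    (the AND-gate reset). Inputs are lists of rising-edge times."""
    up = dn = False
    events = sorted([(t, "ref") for t in ref_edges] + [(t, "vco") for t in vco_edges])
    trace = []
    for t, src in events:
        if src == "ref":
            up = True
        else:
            dn = True
        if up and dn:          # reset path of the traditional PFD
            up = dn = False
        trace.append((t, up, dn))
    return trace

# Example: with fref leading fvco by a quarter period, UP pulses last about T/4
# tristate_pfd(ref_edges=[0, 40, 80], vco_edges=[10, 50, 90])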
The circuit of the nonlinear PFD is shown in Figure 1. The working process of the circuit is detailed as follows: the initial state is UPn=DNn=1, UP=DN=0, Reset=0, and MN1 and MN3 are turned off. When fref=fvco=0, Xref=Xvco=1, MN5 and MN7 are turned off, UPn and DNn are not affected, and Reset keeps 0. When fref and fvco rise to 1, MN5 and MN7 are turned on, so UPn and DNn discharge to 0 and then UP=DN=1.

Figure 1. Circuit of nonlinear PFD.



If the phase of fref is ahead of that of fvco while the phase difference Δφ<π, the condition fref=fvco=1 makes Reset 1, and then Xref=Xvco=0. Furthermore, UP and DN are reset to 0 at the same time. In this condition, the nonlinear PFD is the same as the traditional PFD, and the linear gain is ICP [5]. But if the phase of fref is ahead of that of fvco while the phase difference Δφ≥π, the condition fref=0 and fvco=1 makes Reset=1. In this condition only Xvco changes into 0 (DN=0 and UP=1), and the nonlinear PFD provides a stable ICP.

2.2 Simulation of Nonlinear PFD

The circuit function test verifies the frequency-detection and phase-detection behavior of the whole PFD by transient simulation. The input signal Vref is a 25 MHz square wave, and Div is also a square wave varying in phase and frequency; UP and DW are the outputs of the PFD. When Div varies, the output of the PFD is as shown in Figure 2.

Figure 2. Same frequency but different phase (the phase of fref is ahead of that of fdiv by 100°).

From Figure 2, UP goes from low to high on the rising edge of Vref; similarly, DW goes from low to high on the rising edge of Div. UP and DW stay high only briefly and are then reset to zero, so the logic function of the proposed PFD is correct.

The transfer characteristic curve can be obtained by transient simulation and DC sweep simulation, as Figure 3 shows. It is necessary to point out that the delay of Vdiv is set as b·T/360, where b is the phase variable. Figure 3 shows that there is no dead zone for the PFD and the phase-detection range is [-316°, 315°].

Figure 3. Transfer characteristic curve.

3 Optimized CP with Charge Rudder

3.1 Circuit of CP

The CP is optimized with charge rudder. Generally speaking, the proposed CP consists of the core circuit, the clamper amplifier, the rail-to-rail circuit, and the current source [6,7]. The core circuit is shown in Figure 4.
To satisfy the fast switching response, some modules are added to the traditional charge-rudder CP, such as the current complement circuit (M2, M3), the fast-lock circuit (M4, M9), and the feed-through and offset circuit (M12, M14).
To reduce the mismatch between static current and dynamic current caused by the modulation effect of the current mirror, the clamper amplifier is added, which keeps Vout and Vref the same. The detailed analysis is as follows: when the charge switch M11 is turned off, switch M16 is on, the current goes through M16, and the follower makes the drain voltage of M16 track accordingly. The result of such a design is that the current-mirror transistors M9 and M10 stay saturated, so the transistors remain conducting and the current always exists. The responding speed of the CP is restricted only by the switches M11 and M16 and is obviously increased. As Figure 4 shows, the current also goes through M17, so the responding speed of the CP is affected only by M13 and M17. This means that the proposed CP can achieve high speed while the conflict between charge matching and responding speed is resolved.

Figure 4. Circuit of charge pump

3.2 Simulation of Proposed CP

The P-type switch (MP3) and N-type switch (MN2) are both switched on, UPP is grounded, and DWW is at the source voltage. The output is connected to an adjustable ideal source. A DC sweep is performed and the simulation results are obtained. When the output voltage is between 0 and 1.1 V, the charge match is not good; but for the whole PLL circuit this is beneficial for the quick adjustment of the VCO, which promotes the overall performance of the circuit [8,9].
Stability is tested using the STB analysis of Cadence, as Figure 5 shows. The gain can reach 69.53 dB and the phase margin can be 65° when Vdc=600 mV.
 High Performance PLL base on Nonlinear Phase Frequency Detector   497

Figure 5. Gain and phase of clamper amplifier (Vdc=0.6v)

4 PLL Simulation

With the circuits of the PFD and CP designed, the whole PLL circuit is simulated with a third-order low-pass filter as the load. The simulation conditions are as follows: fref = 25 MHz (40 ns), fdiv = 24.39 MHz (41 ns), and the phase of fref is 150° ahead of that of fdiv.
When the reference frequency is 25 MHz, the divider ratio is 400 and VC = 323 mV, the output frequency of the PLL is 10 GHz, as Figure 7 shows.
Using 90 nm CMOS, the high-speed PFD and the high-speed CP based on the optimization of the charge rudder are designed. The simulation shows that when the supply voltage is 1.2 V and the reference frequency is 25 MHz, the PFD works well and the function is correct; when the output voltage is between 0 V and 0.9 V and the output current is 200 μA, the current mismatch rate is below 1%.

Figure 6. Co-simulation of PFD and CP (fref ahead of fdiv).

Figure 7. Simulation of PLL.



5 Conclusion

Using 90 nm CMOS, the high-speed PFD and the high-speed CP based on the optimization of the charge rudder are designed. The simulation shows that when the voltage is 1.2 V and the reference frequency is 25 MHz, the PFD works well and the function is correct; when the output voltage is between 0 V and 0.9 V and the output current is 200 μA, the current mismatch rate is below 1%.

References
[1] Simon Haykin. Communication System[M]. John Wiley & Sons, 5th International student edition.
[2] Xue Hong. Design of PFD and CP in CMOS PLL of DVB-T Receiver[D]. Southeast University, 2008.
[3] Sang-O Jeon, Tae-Sik Cheung, Woo-Young Choi; “Phase/frequency detector for high-speed PLL
applications”[J]; Electronics Letters, 1998, 34(22): 2120-2121
[4] Tobias Tired, Henrik Sjöland, Per Sandrup, et al. A 28 GHz SiGe PLL for an 81-86 GHz E-band
beam steering transmitter and an I/Q phase imbalance detection and compensation circuit [J].
Analog Integrated Circuits and Signal Processing, 2015, 84(3):383-398.
[5] Mohamed Elsayed, MohammedM. Abdul-Latif, Edgar Sánchez-Sinencio. A Spur-Frequency-
Boosting PLL with a -74 dBc Reference-Spur Suppression in 90 nm Digital CMOS [J]. IEEE Journal
of Solid-State Circuits, 2013,48(9):2104-2117.
[6] Fong N H,Plouchart J,Zamdmer N, et al. A 1-V 3.8-5.7 GHz wide-band VCO with differentially
tuned accumulation MOS varactors for common-mode noise rejection in CMOS SOI technology
[J]. IEEE Transactions on Microwave theory and tech,2009,51(8): 1952-1959.
[7] Mark Ferriss, Jean-Olivier Plouchart, Arun Natarajan, et al. An Integral Path Self-Calibration
Scheme for a Dual-Loop PLL [J]. IEEE Journal of Solid-State Circuits, 2013,48(4):996-1008.
[8] Robert J. A. Baker, Bosco Leung, Christopher Nielsen. Phase Noise Modeling for Integrated
PLLs in FMCW Radar [J]. IEEE Transactions on Circuits and Systems—II: Express Briefs, 2013,
60(3):137-141.
[9] Joonhong Park, Hyuk Ryu, Keum-Won Ha, et al. 76-81-GHz CMOS Transmitter with a
Phase-Locked-Loop-Based Multichirp Modulator for Automotive Radars [J]. IEEE Transactions on
Microwave Theory and Techniques, 2015, 63(4): 1399-1407.
Fang-yan LUO
Accelerated ICP based on linear extrapolation
Abstract: The efficiency of the ICP (iterative closest point) algorithm is an important problem in point cloud registration. As is known, a new registration vector is calculated in each iteration of ICP. A new accelerated ICP algorithm based on linear extrapolation is proposed, in which a linear extrapolation optimization step is used to extrapolate a new registration vector if the last several vectors point in almost the same direction. The proposed method is verified on two different point clouds and compared with other methods. The results show that the proposed method is effective in accelerating the ICP algorithm.

Keywords: point cloud registration, iterative closest point, ICP, linear extrapolation.

1 Introduction

With the development of 3D scanning technology, it is easy to obtain high-resolution and high-accuracy point clouds. Due to the view limitation of the scanner, point clouds are captured from different orientations. To present the whole object shape, these point clouds must be registered into the same coordinate system. Point cloud registration is to find the most suitable transform parameters between two or more point clouds. Point cloud registration includes rigid registration and non-rigid registration [1]. Over the past two decades, many registration algorithms have been proposed, which can be divided into coarse registration and fine registration methods [2].
The ICP (Iterative Closest Point) algorithm is one of the most popular rigid fine registration algorithms [3], of which many variants have been proposed [1, 2, 4-8]. Go-ICP provides global optimality for Euclidean registration under the L2 error based on a branch-and-bound scheme, which can search the 3D motion space SE(3) efficiently [9]. Liu proposed PCPS-ICP for the registration of two overlapping models captured from two adjacent views [10]. F-ICP was proposed for minimizing the fractional root mean squared distance (FRMSD) [11]. Xie proposed a dual interpolating point-to-surface method, which establishes the correspondences by adopting a dual surface fitting approach [12]. Maier-Hein proposed A-ICP, a modification of ICP that accommodates anisotropic and inhomogeneous localization error [13]. LieTrICP uses a Lie group parameterization to obtain the optimal transformation and combines the advantages of the Trimmed Iterative Closest Point (TrICP) algorithm [14].

*Corresponding author: Fang-yan LUO, Department of Information Engineering, Guangdong Polytechnic, Foshan 528041, China, E-mail: 532443807@qq.com

Simon [15] presented an acceleration of ICP with decoupled translation and rotation, for the following reason: in the original Acc-ICP [3], the same parameter is assigned to the predicted transformation (translation and rotation), but it can happen that the prediction on translation or rotation alone performs well even if the whole incremental transformation is not suitable.
For the above problem, a new accelerated ICP based on linear extrapolation is proposed that avoids overshoot and unnecessary calculation. Firstly, linear extrapolation is used for forecasting, without the parabolic interpolation. Then the vector sum of the last two incremental transformations is selected as the direction of extrapolation (the combinational direction), so the prediction direction is not restricted to a single line. Lastly, a coefficient (e.g. 0.5) is applied to the extrapolated part to reduce its extent.
The paper is organized as follows: in the next section the ICP algorithm and linear extrapolation are briefly reviewed. The proposed algorithm is detailed in Section 3, and experiments and discussion are in Section 4. The conclusion is given in Section 5.

2 Related work

2.1 ICP algorithm

Given two point clouds P and X, with Np and Nx points respectively, an appropriate registration vector q is calculated to match P to X.
The iteration is initialized by setting k=0, vector q0 = [1 0 0 0 0 0 0]^t, P0 = P, d0 = 0:
1) Compute the closest point x in X to every point in Pk; all such points are denoted Xk.
2) Compute the registration vector qk matching P to Xk.
3) Apply qk to P, getting Pk+1.
4) Get the MSE (mean square error) dk between Pk+1 and Xk.
5) Stop the iteration if dk - dk-1 < ɛ.

In essence, each iteration has two main parts: finding the closest points and calculating the registration vector; a compact sketch of these two steps is given below.
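The sketch below (Python/NumPy) is only an illustration of those two steps under simplifying assumptions: brute-force nearest-neighbour search is used instead of an accelerated structure, and the rigid transform is estimated with the SVD (Kabsch) method rather than the quaternion formulation of the original ICP.

import numpy as np

def closest_points(P, X):
    """For every point in P, find its nearest neighbour in X (brute force)."""
    d2 = ((P[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return X[np.argmin(d2, axis=1)]

def best_rigid_transform(P, Y):
    """Least-squares rotation R and translation t mapping P onto Y (Kabsch/SVD)."""
    cP, cY = P.mean(0), Y.mean(0)
    H = (P - cP).T @ (Y - cY)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cY - R @ cP
    return R, t

def icp(P, X, max_iter=50, eps=1e-8):
    """Basic ICP loop following steps 1)-5) above."""
    Pk, prev_err = P.copy(), np.inf
    for _ in range(max_iter):
        Y = closest_points(Pk, X)               # step 1: correspondences
        R, t = best_rigid_transform(Pk, Y)      # step 2: registration
        Pk = Pk @ R.T + t                       # step 3: apply transform
        err = ((Pk - Y) ** 2).sum(1).mean()     # step 4: MSE
        if abs(prev_err - err) < eps:           # step 5: convergence test
            break
        prev_err = err
    return Pk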

2.2 Linear extrapolation

In mathematics, extrapolation is the process of estimating, beyond the original observation range, the value of a variable on the basis of its relationship with other existing variables. The registration vectors Q = {q1, q2, …, qk} are calculated successively to minimize dk between Pk+1 and Xk, which is a typical optimization process. Because of the convergence of ICP, the registration vector gets closer and closer to the expected value as the iteration goes on, and its variation becomes smaller and smaller. Assuming Figure 1 shows the successively calculated registration vectors, q3' can be extrapolated on the basis of q2 and q3.

Figure 1. Linear extrapolation for q

3 Proposed Method

Given two point clouds P and X, with Np and Nx points respectively, initialize with k=1, vector q0 = [1 0 0 0 0 0 0]^t, P0 = P, d0 = 0, θ0 = 0.
1) Compute the closest point x in X to every point in Pk-1; all such points are denoted Xk.
2) Compute the registration vector qk matching P to Xk.
3) Apply qk to P, getting Pk+1.
4) Get the MSE (mean square error) dk between Pk+1 and Xk.
5) Stop the iteration if dk - dk-1 < ɛ.
6) Calculate the angle θk between qk and qk-1 and the difference vector ∆qk = qk - qk-1.
7) If θk < θk-1 < θk-2 < 30°, extrapolate a new qk' (a minimal sketch of this step is given after the list):
7.1) Set x = [-(||∆qk|| + ||∆qk-1||), -||∆qk||, 0] and y = [dk-2, dk-1, dk]; fit y = ax + b and get v = -b/a, i.e. the value of x at which y = 0.
7.2) Calculate the unit vector dirext = (∆qk + ∆qk-1) / ||∆qk + ∆qk-1||.
7.3) If v > 0, qk' = qk + 0.5 × v × dirext; apply qk' to P, getting Pk+1'.
7.4) Get the new dk' between Pk+1' and Xk.
7.5) Update qk = qk'.
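A minimal Python sketch of the extrapolation step 6)-7) is given below (assuming the registration vector is handled as a plain NumPy array; for brevity, the angle test only checks that the last two incremental directions are nearly aligned rather than the full θk < θk-1 < θk-2 < 30° condition):

import numpy as np

def extrapolate(q_hist, d_hist, max_angle_deg=30.0, damping=0.5):
    """Linear-extrapolation step of LE-ICP.
    q_hist: the last three registration vectors [q_{k-2}, q_{k-1}, q_k];
    d_hist: the corresponding MSE values [d_{k-2}, d_{k-1}, d_k]."""
    dq_k  = q_hist[2] - q_hist[1]              # incremental vector dq_k
    dq_k1 = q_hist[1] - q_hist[0]              # incremental vector dq_{k-1}

    cos_a = np.dot(dq_k, dq_k1) / (np.linalg.norm(dq_k) * np.linalg.norm(dq_k1) + 1e-12)
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    if angle >= max_angle_deg:                 # directions not consistent: no prediction
        return q_hist[2]

    # Step 7.1: fit d = a*x + b over the arc-length positions and find the zero crossing v
    x = np.array([-(np.linalg.norm(dq_k) + np.linalg.norm(dq_k1)), -np.linalg.norm(dq_k), 0.0])
    a, b = np.polyfit(x, np.asarray(d_hist), 1)
    v = -b / a if a != 0 else 0.0

    # Step 7.2: combinational direction (normalized sum of the last two increments)
    direction = (dq_k + dq_k1) / np.linalg.norm(dq_k + dq_k1)

    # Step 7.3: damped extrapolation along the combinational direction
    if v > 0:
        return q_hist[2] + damping * v * direction
    return q_hist[2]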

4 Experiments and analysis

The proposed LE-ICP is verified on two point clouds, shown in Figure 2 and Figure 3. The Stanford bunny is from the Stanford Computer Graphics Laboratory, and the leather point cloud is a raw data set captured by our scanner. All experiments are implemented in Matlab and run on a laptop with a dual-core 2.1 GHz Intel Core CPU and 3 GB of RAM. The K-nearest neighbor method is adopted for searching the nearest point.

Figure 2. Bunny from Stanford

Figure 3. Leather point cloud

All experimental results focus on the registration accuracy (MSE) and efficiency (number of iterations). The angle between two adjacent registration vectors is also computed. All these results are compared with standard ICP and Acc-ICP.
The Stanford bunny has 8171 points and is regarded as the reference model, shown in Figure 2. The data model is obtained by transforming the reference model with rotation R = [0.8682, -0.3160, 0.3827; 0.4536, 0.8181, -0.3536; -0.2013, 0.4805, 0.8536] and translation T = [0.0005, 0.0003, 0.0001]; the MSE between the reference model and the data model is then 0.0743. The data model is registered to the reference model by standard ICP, Acc-ICP and LE-ICP respectively. As shown in Figure 4, the original ICP and Acc-ICP iterate 24 times, while the proposed method runs with 18 iterations.
Figure 3 shows a raw data set captured from leather with a tiny alligator texture, which has 10815 points, including noise. The raw data is regarded as the reference model, and the data model is set by transforming the reference model with rotation R = [0.9474, -0.1884, 0.2588; 0.2640, 0.9171, -0.2985; -0.1811, 0.3511, 0.9187] and translation T = [0.05, 0.03, 0.05]; the MSE between the reference model and the data model is 5.0815 before registration. Figure 5 shows the result: Acc-ICP plays a part in accelerating, and the proposed LE-ICP needs only 40 iterations.
The efficiency of registration is very important in applications, and variants of efficient ICP algorithms have been proposed [4]. Besl also proposed Acc-ICP to speed up the method, but in our experiments it did not provide acceleration. For this problem, LE-ICP is proposed based on linear extrapolation.
LE-ICP is tested on two different point clouds and compared with the original ICP and Acc-ICP, as shown in Figure 4 and Figure 5. Firstly, the registration accuracy of LE-ICP is not worse than that of standard ICP and Acc-ICP. Secondly, its efficiency is better than that of standard ICP and Acc-ICP regardless of the size of the data set: the number of iterations of LE-ICP is less than 20 while that of standard ICP and Acc-ICP is larger, as shown in Figure 4, and the number of iterations of LE-ICP is almost half that of standard ICP, as shown in Figure 5.

Figure 4. Result on Stanford bunny



Figure 5. Result on leather data set

5 Discussion and Conclusions

The processing speed is a critical factor for the wide adoption of the ICP registration algorithm. In this paper, a variant of ICP based on linear extrapolation is proposed, which focuses on the efficiency of the standard ICP algorithm. It forecasts the direction of the registration parameters from the last several parameters, if the angle between them is less than the threshold. As the experimental results show, the proposed LE-ICP is still effective for registration and decreases the number of iterations, so it is an available method to speed up ICP.
This work also offers some suggestions for future work on rigid registration based on ICP, which has been regarded as a standard algorithm: for example, how to improve ICP into an online registration algorithm, which is very important for various applications of 3D scanners.

References
[1] G. Tam, Z.-Q. Cheng, Y.-K. Lai, F. Langbein, Y. Liu, A. Marshall, R. Martin, X. Sun and P. Rosin.
Registration of 3D Point Clouds and Meshes: A Survey From Rigid to Non-Rigid. Visualization
and Computer Graphics, IEEE Transactions on, 2013. 19(7): p. 1199-1217.
[2] J. Salvi, C. Matabosch, D. Fofi and J. Forest. A review of recent range image registration methods
with accuracy evaluation. Image and Vision Computing, 2007. 25(5): p. 578-596.
[3] P.J. Besl and N.D. McKay. A method for registration of 3-D shapes. Pattern Analysis and Machine
Intelligence, IEEE Transactions on, 1992: p. 586-606.
[4] S. Rusinkiewicz and M. Levoy. Efficient variants of the ICP algorithm, in 3-D Digital Imaging and
Modeling, 2001. Proceedings. Third International Conference on2001. p. 145-152.
[5] J. Xiong, Q. Wang, H. Wu, B. Ye and J. Zhang. Multi-Signal Intelligence Data Fusion Algorithm
Based on Pca, in Informatics in Control, Automation and Robotics2011, Springer. p. 267-273.
[6] J. Santamaría, O. Cordón and S. Damas. A comparative study of state-of-the-art evolutionary
image registration methods for 3D modeling. Computer Vision and Image Understanding, 2011.
115(9): p. 1340-1354.
[7] H. Pottmann, Q.-X. Huang, Y.-L. Yang and S.-M. Hu. Geometry and convergence analysis of
algorithms for registration of 3D shapes. International Journal of Computer Vision, 2006. 67(3):
p. 277-296.
[8] M. Rodrigues, R. Fisher and Y. Liu. Special issue on registration and fusion of range images.
Computer Vision and Image Understanding, 2002. 87(1): p. 1-7.
[9] J. Yang, H. Li and Y. Jia. Go-ICP: Solving 3D Registration Efficiently and Globally Optimally, in
International Conference on Computer Vision2013.
[10] Y. Liu. Penalizing Closest Point Sharing for Automatic Free Form Shape Registration. IEEE
Transactions on Pattern Analysis and Machine Interlligence, 2011. 33(5): p. 1058-1064.
[11] J.M. Phillips, R. Liu and C. Tomasi. Outlier robust ICP for minimizing fractional RMSD, in 3-D
Digital Imaging and Modeling, 2007. 3DIM’07. Sixth International Conference on2007, IEEE. p.
427-434.
[12] Z. Xie, S. Xu and X. Li. A high-accuracy method for fine registration of overlapping point clouds.
Image and Vision Computing, 2010. 28(4): p. 563-570.
[13] L. Maier-Hein, A. M.Franz, T.R.d. Santos, M. Schmidt, M. Fangerau, H.-P. Meinzer and J.M.
fitzpatrick. Convergent Iterative Closest-Point Algorithm to Accomodate Anisotropic and
Inhomogenous Localization Error, in IEEE Transactions on Pattern Analysis and Machine Interl-
ligence2012. p. 1520-1532.
[14] J. Dong, Y. Peng, S. Ying and Z. Hu. LieTrICP: An improvement of trimmed iterative closest point
algorithm. Neurocomputing, 2014.
[15] D.A. Simon, M. Hebert and T. Kanade. Techniques for fast and accurate intrasurgical
registration. Computer Aided Surgery, 1995. 1(1): p. 17-29
Li-ran PEI, Ping-ping JIANG*, Guo-zheng YAN
Studies of falls detection algorithm based on
support vector machine
Abstract: Some fall detection systems using inertial sensors based on threshold algorithms have been proposed so far, but they are not accurate enough to satisfy patients. In order to improve the performance of the falls detection system, a support vector machine (SVM) algorithm is proposed in this paper. Firstly, motion data were collected with a portable inertial sensing device worn at the patients' waist. Then, five eigenvalues were extracted to capture the inherent characteristics. Finally, the SVM classifier was used to mark the suspected fall behaviors, and its parameters were optimized by the particle swarm optimization (PSO) algorithm. The experimental results showed that when distinguishing falls and fall-like activities, the accuracy, false positive rate and false negative rate of the SVM-based falls detection algorithm were 97.67%, 4.0% and 0.67% respectively, while they were only 90.33%, 22.67% and 7.33% with the threshold method under the same conditions. The performance improvement of the SVM-based falls detection system in this paper is promising for applications in the elderly group.

Keywords: falls detection; inertial sensors; machine learning; SVM; PSO

1 Introduction

Falls are the leading cause of injuries among the elderly [1], and the risk of falls increases with aging. Real-time falls detection can not only effectively reduce the physical and psychological harm among the elderly, but also improve their self-care ability. In recent years, the development of micro-sensors, wireless communication protocols and machine learning technology has spawned a variety of wearable falls detection devices [2,3]. The most common ones combine inertial sensors with threshold algorithms [4]. For example, Kangas et al. [5] evaluated three different falls detection algorithms using accelerometers and concluded that the waist was the best place for falls detection; but even tested at the waist, the highest accuracy was only 90%. Chin-Feng Lai et al. [6] used a series of triaxial accelerometers combined with a threshold algorithm to detect the damaged parts of the elderly after falls, but the accuracy of the system was only 70%. It can be concluded that the threshold-based falls detection system is simple and easy to implement, but its accuracy is unacceptable.

*Corresponding author: Ping-ping JIANG, School of Electrical Information and Electrical Engineering,
Shanghai Jiao Tong University, Shanghai China, 200240, E-mail: jpp99@sjtu.edu.cn
Li-ran PEI, Guo-zheng YAN, School of Electrical Information and Electrical Engineering, Shanghai Jiao
Tong University, Shanghai China, 200240

Moreover, it has limitations in distinguishing falls from fall-like activities, which results in high false negative and false positive rates. Thus, this paper proposes a support vector machine (SVM) algorithm to enhance the accuracy and reduce the false positive rate and the false negative rate.
This paper used a triaxial accelerometer and a gyroscope to collect data from healthy young volunteers simulating falls and the daily activities of the elderly. The system distinguishes falls from daily activities using a radial basis function (RBF) based SVM classifier, and completes the parameter optimization (including the penalty factor C and the RBF parameter g) with the particle swarm optimization (PSO) algorithm. Finally, we compared the SVM classification algorithm with the traditional threshold detection algorithm in two different scenarios: one is distinguishing falls from simple daily activities; the other is distinguishing falls from fall-like activities. The accuracy, false positive rate and false negative rate are defined as the evaluation criteria. Eventually, the feasibility and accuracy of the SVM-based falls detection system were proved by experimental comparison and analysis.

2 Methods

2.1 Portable Falls Detection System

This paper developed a waist-mounted falls detection system, which consists of four parts: a sensor (MPU6050), a microcontroller (STM32F103), a memory unit (TF card) and a power supply unit (3 Ah Li-Pol rechargeable battery). The system block diagram is shown in Figure 1, where the MPU6050 is a MEMS digital sensor integrated with a three-axis accelerometer, a gyroscope and a digital motion processor (DMP). The triaxial accelerometer and the gyroscope measure the acceleration and angular velocity along three orthogonal axes with a programmable range (where g is the acceleration of gravity). The system weighs 92.5 g, and the power unit guarantees that the system can work for more than 24 hours.

Power unit
+5V lithium battery

Z
Regulating circuit
X
Y

MEMS sensor I2C Microcontroller SDIO Storage unit


MPU6050 STM32F103 DMA TF card

Figure 1. Diagram of falls detection system



The system was worn at the volunteers' waist during the experiments; its three axes X, Y and Z correspond to the left-right, back-forth and up-down directions of the body respectively, and it measures the triaxial acceleration and angular velocity data with a sampling rate of 100 Hz. The DMP transforms the angular velocity data into quaternions. Finally, the original data of the system consist of the three-axis acceleration and two attitude angles named pitch and roll, whereby pitch and roll rotate around the Y axis and the X axis respectively. The core of the system is the microcontroller, which receives data from the MPU6050 and stores them on the TF card for further analysis.

2.2 Data Collection

We recruited 15 healthy young volunteers (10 males and 5 females) for the experiments, and all of them carefully learned about the purpose and content of the experiment. The experimental equipment contained a falls detection system, a computer and a 15 cm thick wrestling mat. The volunteers simulated falls in the lab and completed the daily activities outdoors. During the experiment, the volunteers wore the falls detection system as required to complete six kinds of fall simulations and six daily activities in sequence, as shown in Table 1. We classified falls into recoverable falls and unrecoverable falls depending on the state of the volunteers within one minute after the fall, and we classified daily activities into simple daily activities and fall-like activities according to the similarity between the daily activities and falls. Each motion was recorded for 90 seconds and repeated three times. At last, MATLAB was used for further analysis and processing.

Table 1. The list of daily activities and falls simulation

No.  Motion class      Explanation
1    Lateral fall      Lying on the ground within 60 s after the fall
2    Lateral fall      Standing up within 60 s after the fall
3    Backward fall     Lying on the ground within 60 s after the fall
4    Backward fall     Standing up within 60 s after the fall
5    Forward fall      Lying on the ground within 60 s after the fall
6    Forward fall      Standing up within 60 s after the fall
7    Simple daily      Walking
8    Simple daily      Going upstairs
9    Simple daily      Going downstairs
10   Fall-like daily   Bending over to pick something up
11   Fall-like daily   Sitting down and standing up
12   Fall-like daily   Squatting and standing

2.3 Data Preprocessing and Eigenvalues Extraction

In order to obtain more realistic characteristic information, this paper processed the raw data using a median filter (n1=5) and a mean filter (n2=7) to reduce pulse interference and random noise. The original data vary when an unexpected fall occurs. Taking a backward fall as an example, the change curves of all original data are shown in Figure 2. We can see that the triaxial accelerations and the attitude angles change apparently when the fall occurs. In order to distinguish the fall data from the daily data, we extracted
the following five eigenvalues:
(1) The amplitude of the resultant acceleration, denoted SV, reflects the trend of the
acceleration signal during the whole motion:

SV = sqrt(ax² + ay² + az²)        (1)

Where ax, ay and az are the outputs of the triaxial accelerometer. The SV normally
includes two parts (dynamic acceleration and static acceleration), and it can estimate
the movement intensity. Besides, the static acceleration is 1 g in general.

(2) The dynamic acceleration SVD is the high-frequency part of the SV, which reflects
the acceleration changes and can be used to determine the intensity of the impact:

SVD = sqrt(adx² + ady² + adz²)        (2)

Where adx, ady and adz can be obtained from the original accelerations by using a
second-order Butterworth filter with a cut-off frequency of 0.15 Hz.

(3) The acceleration perpendicular to the body, denoted BVA (Equation 3), increases
gradually and reaches a peak during the fall process.

(4) In order to characterize the intensity of acceleration, K (Equation 4) was defined
to capture the change amount of BVA through the application of a sliding window (m=5).

Where N is the data length and m is the width of the window. The larger K is, the
faster BVA changes.


(5) The process of a fall and of fall recovery is always accompanied by significant
changes of pitch and roll. M (Equation 5) was used to provide information on the
postural orientation of the volunteers, and it was defined as the sum of the absolute
values of the variations of pitch and roll within a sliding window.

Where N is the data length and w is the width of the window. The larger M is, the
faster the attitude angles change.
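To make the preprocessing and the first two eigenvalues concrete, the following Python sketch filters raw triaxial accelerometer samples and computes SV and SVD. The median-filter and mean-filter sizes (n1 = 5, n2 = 7) and the 0.15 Hz second-order Butterworth cut-off follow the text above; the sampling rate FS is an assumed value, and treating the Butterworth stage as a high-pass filter is an interpretation consistent with SVD being the high-frequency part of SV.

import numpy as np
from scipy.signal import medfilt, butter, filtfilt

FS = 100.0  # sampling rate in Hz -- assumed value, not stated in the text

def preprocess(axis_samples):
    # Median filter (n1 = 5) followed by a mean filter (n2 = 7), as in the text,
    # to suppress pulse interference and random noise.
    x = medfilt(np.asarray(axis_samples, dtype=float), kernel_size=5)
    return np.convolve(x, np.ones(7) / 7.0, mode="same")

def sv_and_svd(ax, ay, az):
    # SV: amplitude of the resultant acceleration (Equation 1).
    ax, ay, az = (preprocess(a) for a in (ax, ay, az))
    sv = np.sqrt(ax ** 2 + ay ** 2 + az ** 2)
    # SVD: high-frequency (dynamic) part of the acceleration (Equation 2),
    # obtained here with a 2nd-order Butterworth high-pass filter at 0.15 Hz.
    b, a = butter(2, 0.15 / (FS / 2.0), btype="highpass")
    adx, ady, adz = (filtfilt(b, a, s) for s in (ax, ay, az))
    svd = np.sqrt(adx ** 2 + ady ** 2 + adz ** 2)
    return sv, svd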

Figure 2. Each original data curve during a backward fall (acceleration and attitude angles).

Figure 3 shows the obvious changes of each eigenvalue during a backward fall,
indicating that the selected eigenvalues can be used to differentiate falls from daily
activities. If the fall is recoverable, the system only needs to record the event and
does not need to request remote help. We also find that the eigenvalue M changed
obviously during the fall recovery process, while the eigenvalues SV, SVD, BVA and K
did not. This can serve as an important basis for determining whether a fall is
recoverable.

Figure 3. The curves of the eigenvalues SV, SVD, BVA, K and M during a backward fall.

3 SVM based falls detection algorithm

SVM is a machine learning method based on the VC dimension theory and the structural
risk minimization principle of statistical learning theory [7,8]. It has unique
advantages in solving small-sample, nonlinear and high-dimensional pattern recognition
problems [12].
We selected 50 samples from each motion collected before, so there were 600 samples
in total. The data were then divided into a training set and a test set. Each set
contained 300 samples and two categories, and the five eigenvalues SV, SVD, BVA, K and
M were extracted for each sample. Since the data are not linearly separable, the
inputs are mapped into a high-dimensional feature space with a kernel function, while
the kernel allows the computation to be completed in the low-dimensional input space.
Finally, the optimal separating hyperplane is constructed in the high-dimensional
space to achieve non-linear classification of the data. The diagrammatic description
of the SVM classification algorithm is shown in Figure 4.
The parameters that need to be optimized in the SVM are the penalty factor C (C >= 0)
and the RBF kernel parameter g (g >= 0). If the parameters are selected only by
experience, the results are unsatisfactory. For example, if we choose C = 2 and
g = 0.1 based only on experience, the classification accuracy is 84.33% (253/300).

Figure 4. The implementation process of SVM algorithm



Therefore, the PSO algorithm was used to optimize the parameters in order to improve
the classification performance of the SVM. PSO is a parallel algorithm which finds the
optimal solution iteratively [9,10], and a fitness function is used to assess the
quality of each solution. In the experiment, the cognitive learning factor C1, the
social learning factor C2 and the population size of the PSO algorithm were set to
1.5, 1.7 and 20 respectively. The experimental results are discussed under two
different scenarios below.
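As an illustration of the PSO-SVM parameter search described above, the following sketch wraps an RBF-kernel SVM (scikit-learn's SVC) in a minimal particle swarm optimizer. The learning factors C1 = 1.5, C2 = 1.7 and the population size of 20 follow the text; the inertia weight, iteration count and search bounds are assumed values, and the function names are illustrative rather than the authors' code.

import numpy as np
from sklearn.svm import SVC

def fitness(params, X_train, y_train, X_test, y_test):
    # Fitness of a particle = classification accuracy of an RBF-SVM with (C, g).
    C, g = params
    clf = SVC(C=C, gamma=g, kernel="rbf").fit(X_train, y_train)
    return clf.score(X_test, y_test)

def pso_svm(X_train, y_train, X_test, y_test,
            n_particles=20, n_iter=50, c1=1.5, c2=1.7, inertia=0.7,
            bounds=((0.01, 100.0), (0.01, 100.0))):
    # Minimal particle swarm over (C, g); c1, c2 and the population size follow
    # the text, while inertia, n_iter and bounds are assumed values.
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    pos = lo + np.random.rand(n_particles, 2) * (hi - lo)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X_train, y_train, X_test, y_test) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iter):
        r1 = np.random.rand(n_particles, 2)
        r2 = np.random.rand(n_particles, 2)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p, X_train, y_train, X_test, y_test) for p in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest, pbest_fit.max()   # optimal (C, g) and the achieved accuracy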

3.1 Distinguish Falls from Simple Daily Activities

The results showed that the RBF-based SVM classification algorithm can completely
distinguish falls from simple daily activities. The accuracy reached 100%, with
optimal parameters C = 0.62 and g = 37.4.

3.2 Distinguish Falls from Falls-like activities

The results of distinguishing falls from falls-like activities with the same algorithm
are shown in Figure 5a, where a category label of 1 on the vertical axis represents a
fall and -1 represents a daily activity. The fitness curve of the PSO during parameter
optimization is shown in Figure 5b. We can see that one fall sample and six
daily-activity samples were misclassified. The overall accuracy of the PSO-SVM
classification algorithm was 97.67% (293/300), with optimal parameters C = 0.57 and
g = 88.9.

Figure 5. (a) PSO-SVM classification results based on RBF; (b) the fitness curve of PSO parameter optimization.

4 The performance comparison between threshold algorithm and SVM algorithm

The training set and the test set collected before were used to evaluate the
performance differences between the SVM and the threshold algorithm in MATLAB. The
optimal threshold value of each eigenvalue in the threshold algorithm was obtained by
combining the experience of Bourke et al. [11] with the curves obtained in our
experiment, as shown in Table 2. Similarly, the best classification model of the SVM
algorithm was obtained from the same training set using the PSO algorithm. Besides,
the experimental results were also used to verify the performance of the two
algorithms, where the accuracy, false positive rate and false negative rate were
defined as the performance indexes, as shown in Equation 6.

accuracy = (TP + TN)/(TP + TN + FP + FN),   false positive rate = FP/(FP + TN),   false negative rate = FN/(FN + TP)        (6)

Among them, the true positive events are the falls correctly classified; the true
negative events are the daily events correctly classified; the false positive events
are the daily events incorrectly classified as falls; and the false negative events
are the falls incorrectly classified as daily events.
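The performance indexes of Equation 6 can be computed directly from these event counts; the helper below is a minimal sketch with illustrative names.

def performance_indexes(tp, tn, fp, fn):
    # Accuracy, false positive rate and false negative rate as in Equation 6.
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    false_positive_rate = fp / (fp + tn)   # share of daily events classified as falls
    false_negative_rate = fn / (fn + tp)   # share of falls classified as daily events
    return accuracy, false_positive_rate, false_negative_rate

# Scenario B, SVM: 1 fall and 6 daily-activity samples misclassified out of 150 each,
# which reproduces the 97.67% / 4.0% / 0.67% figures reported in Table 3.
print(performance_indexes(tp=149, tn=144, fp=6, fn=1))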

Table 2. The optimal threshold value of each characteristic

Eigenvalue SV SVD BVA K M

Best threshold value 2.2g 0.9g 1.5g 0.6g 15°
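For illustration, a threshold detector using the values in Table 2 could look like the sketch below; since the text does not state how the individual eigenvalue thresholds are combined, the rule of requiring every threshold to be exceeded is only one plausible assumption.

# Optimal thresholds from Table 2 (g for the acceleration-based features, degrees for M).
THRESHOLDS = {"SV": 2.2, "SVD": 0.9, "BVA": 1.5, "K": 0.6, "M": 15.0}

def threshold_detect(features):
    # `features` maps each eigenvalue name to its peak value for one recorded motion.
    # Requiring every eigenvalue to exceed its threshold is only one possible rule.
    return all(features[name] > limit for name, limit in THRESHOLDS.items())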

Table 3 shows the overall performance of the two algorithms under the two scenarios.
Comparing the results, we can find that the overall accuracies of both the threshold
algorithm and the SVM algorithm were high in scenario A. There were 18 false positive
events and 20 false negative events in the test of the threshold algorithm, while all
instances were correctly classified in the test of the SVM. Since there was a small
overlap of eigenvalues between the falls and the daily activities in scenario B, the
performance was worse than in scenario A. However, the advantage of the SVM algorithm
over the threshold algorithm was still obvious. Compared with the threshold algorithm,
whose accuracy was 90.33%, the accuracy of the proposed algorithm reached 97.67%, and
the false positive rate of the SVM algorithm was 18.67% lower than that of the former.
Besides, the false negative rate of the SVM algorithm was only 0.67%. Therefore, we
can conclude that the performance of the RBF-based SVM algorithm is better than that
of the traditional threshold detection algorithm.

Table 3. The performance comparison between threshold and SVM

Motion type Threshold SVM

Falls and simple daily accuracy 96.67% 100%

false positive rate 6.0% 0%

false negative rate 6.67% 0%

Falls and daily like falls accuracy 90.33% 97.67%

false positive rate 22.67% 4.0%

false negative rate 7.33% 0.67%

5 Conclusion

An SVM-based falls detection system was proposed in this paper. The system can
distinguish falls from daily activities while collecting data from a waist-mounted
measurement kit. The PSO algorithm was also applied to optimize the parameters of the
SVM, which influence the classification accuracy. The results prove the feasibility
and superiority of the SVM-based falls detection algorithm proposed in this paper, and
the use of SVM-based fall detectors among the elderly is promising. Our next steps
include porting the RBF-based SVM algorithm onto the falls detection system for
real-time fall detection, and integrating a GSM module into the system for remote
help.

References
[1] L Sang-I, C Ku-Chou, L Hsuei-Chen, et al. Problems and falls risk determinants of quality of life
in older adults with increased risk of falling[J]. Geriatrics & Gerontology International, 2015, 15:
579–587.
[2] L. Schwickert, C. Becker, U. Lindemann, et al. Falls detection with body Worn sensors[J].
Zeitschrift für Gerontologie und Geriatrie, 2013, 46(8):706-719.
[3] I N Figueiredo, C Leal, L Pinto,et al. Exploring smartphone sensors for falls detection[J]. Mobile
User Experience, 2016, 5 (1):1-17.
[4] W Yung-Gi, T Sheng-Lun. Falls event detection by gyroscopic and accelerometer sensors in
smart phone [J], Computers and Applications. 2015,37(2):60-66.
[5] M Kangas, A Konttila, P Lindgren, et al. Comparison of low-complexity falls detection algorithms
for body attached accelerometers[J], Gait & Posture, 2008,285-291.
[6] L Chin-Feng, C Sung-Yen, C Han-Chieh, et al. Detection of Cognitive Injured Body Region
Using Multiple Triaxial Accelerometers for Elderly Falling[J]. IEEE Sensors Journal, 2011,
11(3):763-770.
[7] Y Xiaowei. Algorithm design and analysis of support vector machine[M]. Beijing: Science
Press,2013.

[8] O Aziz, M Musngi, J Edward, et al. A comparison of accuracy of falls detection
algorithms (threshold-based vs. machine-learning) using waist-mounted tri-axial
accelerometer signals from a comprehensive set of falls and non-falls trials[J].
Medical & Biological Engineering & Computing, 2016, 1-11.
[9] S Xiaowen, S Ziwen, Q Fang.Research on human falls detection based on PSO-SVM & threshold
[J]. Computer Engineering,2016,42(5):317-321.
[10] A Subasi. Classification of EMG signals using PSO optimized SVM for diagnosis of neuro
muscular disorders[J]. Computer in Biology and medicine.2013,43(5):576-586.
[11] A.K Bourke, P Van de Ven, M Gamble, et al. Evaluation of waist-mounted tri-axial accelerometer
based falls detection algorithms during scripted and continuous unscripted activities[J]. Journal
of biomechanics, 2010,43(15): 3051-3057.
[12] F Bianchi, S J. Redmond, M R. Narayanan. Barometric Pressure and Triaxial Accelerometry-
based Fall Event Detection[J]. IEEE Transactions on Neural Systems and Rehabilitation
Engineering, 2010, 18(6):619-626.
Ting-ting GUO*, Feng QIAO, Ming-zhe LIU, Ai-dong XU, Jun-nan SUN
Research and Development of Indoor Positioning
Geographic Information System based on Web
Abstract: For the problem of information retrieval about the positioning and
navigation in some large indoor venues, a design scheme was proposed. Indoor
Positioning Geographic Information System can conveniently, readily and accurately
solve the problem of positioning and navigation information retrieval in buildings,
large gymnasiums and other limited spaces. To develop an Indoor Positioning
Geographic Information System with rich functions, this paper introduces a distributed
architecture of geographic information system and achieves a good interactive
indoor positioning and navigation system based on web by design of vector layers,
construction of geodatabase, rendering of maps and processing of dynamic layers. In
the hierarchical processing method, the design method of dynamic layer and static
layer is adopted. An indoor positioning was tested by using the developed Indoor
Positioning Geographic Information. The test results show that system can accurately
process positioning and can process operations of moving, zoom and query with good
interactivity.

Keywords: indoor positioning; geographic information system; web services; map; layer

1 Introduction

In recent years, with the rapid development of mobile communication, intelligent
terminals and networking technology, maps used in daily life are no longer presented
only in the form of paper. As a product of high and new technology, positioning map
systems can be seen everywhere on personal computers and mobile devices. With a
positioning map system, users can easily find a route to the destination. A lot of
time and unnecessary expense are saved, which makes positioning map systems widely
popular.
Currently, most positioning services in applications are for outdoor use. For
example, smartphone users obtain positioning data through GPS and search for the
*Corresponding author: Ting-ting GUO, Faculty of Information and Control Engineering, Shenyang
Jianzhu University (SJZU), Shenyang, China, E-mail: guotingting@sia.cn
Feng QIAO, Faculty of Information and Control Engineering, Shenyang Jianzhu University (SJZU),
Shenyang, China
Ming-zhe LIU, Ai-dong XU, Jun-nan SUN, Shenyang Institute of Automation Chinese Academy of
Sciences, Shenyang, China

destination through Google Maps and other tools. With the development of wireless
positioning technologies and the increasing demand for positioning-based services,
positioning and navigation technology has gradually been applied to large indoor
spaces, and indoor positioning services have seen considerable development [1]. In
Google Maps, floor plans of a number of commercial properties (such as airports,
department stores and shopping malls) can be viewed and navigated; currently, these
indoor maps can only be used in some parts of the UK, Switzerland, Canada, the United
States and Japan. Some map applications provide indoor maps on mobile terminals,
allowing users to view the floor layout, business brands and product information of
department stores in Beijing, Shanghai and other places. Both at home and abroad,
indoor maps are still in their infancy. The complex indoor environment has gradually
become an important place for people's life and work; an indoor positioning geographic
information system (GIS) can change people's way of life and push positioning services
to a new height. For this purpose, this paper develops a Web-based indoor positioning
geographic information system combined with indoor positioning and navigation
technology. Users can grasp the real-time positioning information of the indoor space,
and geographic data can be managed and released more easily.

2 Indoor Real-time Positioning System Architecture

According to the requirements of the specific positioning environment, the indoor
wireless positioning system consists of positioning-based services and engine, the
geographic information system and the moving-target positioning data. It uses
Microsoft Windows as its platform and is built as a distributed wireless indoor
positioning system on a B/S (browser/server) architecture. The working process for
collecting moving-target positioning data is as follows: the raw data collected by the
base stations are sent to the server, the coordinates of the moving targets are
calculated from these data by the positioning algorithm, and the moving objects are
displayed on the client positioning map application [2,3]. The data-processing
workflow of the geographic information service is: raw data are acquired and processed
by the geographic data editor and vector cartography, and then the geodatabase stores
the vector maps and geographic data. The data display workflow is: the map is released
according to the needs of users on the client, while the rendering tools render the
map.
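To illustrate the positioning step of this data flow, the sketch below estimates a target position from base-station coordinates and range measurements by linear least squares; this is only one common realisation of the "positioning algorithm" block, and the station coordinates and ranges in the example are assumed values, not measurements from the paper.

import numpy as np

def locate(stations, ranges):
    # Least-squares position estimate from base-station coordinates and measured
    # ranges -- one common way to realise the "positioning algorithm" step above.
    (x1, y1), d1 = stations[0], ranges[0]
    A, b = [], []
    for (xi, yi), di in zip(stations[1:], ranges[1:]):
        A.append([2.0 * (xi - x1), 2.0 * (yi - y1)])
        b.append(d1 ** 2 - di ** 2 + xi ** 2 - x1 ** 2 + yi ** 2 - y1 ** 2)
    position, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return position   # (x, y) of the moving target, ready to be sent to the map client

# Four base stations at assumed corner positions of a hall, with assumed range readings:
print(locate([(0, 0), (500, 0), (500, 300), (0, 300)], [292, 291, 292, 291]))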
The traditional way of storing geographic data is to keep the data in folders, making
it difficult to add to or improve the data. To solve this problem, the geographic data
are stored in a geographic database, and the map is changed by changing the stored
values that are called. This allows users to easily modify geographic data, which is
the key point of this article. The structure of the indoor wireless positioning system
is shown in Figure 1.

Figure 1. Structure of indoor wireless positioning system.

3 Indoor Map Building

3.1 Geospatial Data and Its Relationship

In GIS, geospatial data can be divided into four basic types according to their
geometric characteristics: points, lines, surfaces and volumes [4,5]. The true image
of a geographical entity includes not only the position, shape, size and properties of
the entity, but also reflects the relationships between entities, i.e. adjacency,
association and inclusion relationships. These relationships are maintained under
continuous graphical deformation: the graphics may deform, but the topological
relationships remain unchanged.

3.2 Indoor Map File Format

Geographic data are represented in two ways: vector and raster expression. Data
structures are generally divided into structures based on the vector model and
structures based on the raster model. Vector data is an object-oriented structure, in
which each object directly carries its position and attribute information as well as
the topological relationships between objects [6]. Raster data is a position-oriented
structure: any point in the plane is directly related to a certain object or class of
objects, and it cannot completely represent the topological relationships between
objects [7,8].
The most common map file format is vector graphics, the most commonly used type of
graphics, which can be zoomed in or out without distortion. A complete SHP dataset
includes a vector map file (.SHP) that records the coordinate information of spatial
objects, a file (.DBF) for storing non-spatial attribute data and an index file (.SHX)
used to connect the spatial and non-spatial data.
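As a small illustration of how such a dataset can be consumed, the sketch below loads a shapefile and inspects its geometry and attributes; the geopandas library and the file name sia_room.shp are assumptions used only for this example.

import geopandas as gpd   # assumed to be available; any shapefile reader would do

# A complete dataset consists of the .SHP (geometry), .DBF (attributes) and .SHX (index)
# files; reading the .SHP automatically pulls in its companion files.
rooms = gpd.read_file("sia_room.shp")        # illustrative file name
print(rooms.geometry.geom_type.unique())     # e.g. ['Polygon'] for a surface layer
print(list(rooms.columns))                   # non-spatial attributes stored in the .DBF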

3.3 The Production Process of Indoor Map

The key to the smooth development of this system is the drawing of the indoor map. It
makes the indoor map meet the requirements of the application platform and facilitates
later testing, maintenance and optimization. Therefore, before making the map, it
should be clear what is to be shown on the map and through which geometry type, such
as points, lines or polygons. After the above problems are defined, the raw data of
the indoor place are converted into vector map format to make the indoor map. There
are several types of map layers: layers that cite a group of features are feature
layers; layers that cite a grid map or image as the data source are raster layers;
basemap layers can provide a high-performance display of basemap content. The basic
flow chart of indoor positioning map production is shown in Figure 2.

Figure 2. Basic process of indoor positioning map.

4 Design of Indoor Positioning Geographic Information System

Indoor geographic information of real time location system (RTLS) [9] is mainly
divided into three parts: the collection of geographic data, the editing and storage of
geographic data and the display of geographic data.

4.1 Geodatabase

The geodatabase is a hardware and software system that organizes and manages
geographic data by means of computer database technology [10]. In order to facilitate
the management and maintenance of the map, the individual layers of the indoor maps
are stored in the geodatabase, while the spatial data and the corresponding feature
attribute data are stored in the database management system (DBMS). The geographic
database adopts an open-source structure that stores each feature class in a table of
the DBMS. Each record in the table corresponds to one feature, and features with the
same spatial reference can be put together, facilitating the organization and
management of data. Because the storage structure of the geodatabase is the feature
object, which corresponds directly to the conversion between objects, this is the most
direct way for geodatabase data conversion. A collection of features forms a layer,
which represents one class of map information, such as corridors, bathrooms, etc.
Although these layers are stacked for display, they do not cover each other.
Organizing the map in hierarchical layers allows a layer to be called when needed, so
a single layer can serve a variety of occasions. The data tables stored in the
geodatabase of this system are shown in Table 1.

Table 1. Data Table of Geodatabase

Data table    Owner    Notes

sia_border postgres Linear layer


sia_hallway postgres Surface layer
sia_lift postgres Surface layer
sia_room postgres Surface layer
sia_skylight postgres Surface layer
sia_stairs postgres Surface layer
sia_toilet postgres Surface layer
sia_water_room postgres Surface layer
spatial_ref_sys postgres Spatial coordinate system
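As an illustration of how a client or rendering service might pull features from such a geodatabase, the sketch below queries one of the tables in Table 1 through PostGIS; the connection details, the geometry column name geom and the use of psycopg2 are assumptions for this example.

import json
import psycopg2   # assumed PostgreSQL driver; Table 1 lists postgres-owned tables

conn = psycopg2.connect(dbname="indoor_gis", user="postgres")   # illustrative connection
with conn, conn.cursor() as cur:
    # Fetch every room polygon as GeoJSON so the web client can draw it as one layer;
    # the geometry column name "geom" is an assumption.
    cur.execute("SELECT ST_AsGeoJSON(geom) FROM sia_room;")
    room_geometries = [json.loads(row[0]) for row in cur.fetchall()]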

4.2 Rendering and Release

1. Rendering of the indoor positioning map: Since the data exported from the drawing
software is a black-and-white map, the map must be rendered before it is called and
released, in order to produce a good localization display effect. Rendering can make
the layers clearer and more realistic, so that the whole system has the best display
effect. In this system, MapServer is used to extract the map information from the
database and render it. Its core is the MapFile, which organizes the various map
elements into the object hierarchy of the system. Data formats, user interaction and
the support of the OGC protocol are defined in the MapFile.
2. Release of the indoor positioning map: The system is required to provide map
browsing functions in a Web browser, including zooming in or out, panning and other
common operations. To this end, the map is released by OpenLayers. OpenLayers is a
JavaScript package for the development of WebGIS clients. It can not only achieve the
above functions, but can also carry out line selection, surface selection, element
selection, layer overlay and other operations, which provide more extended functions.
Taking a building as an example, with a length of about 500 m and a width of about
300 m, the drawn vector map is shown in Figure 3. Taking a mobile device as the test
platform and loading the indoor positioning map, the basic effect of browsing the map
is achieved by OpenLayers, as shown in Figure 3. The effect of zooming in on the map
is shown in Figure 4. In addition, operations such as dragging and dropping the
positioning map are also supported.

Figure 3. Indoor positioning map (original).

Figure 4. Indoor positioning map (enlarged).



4.3 Hierarchical design of map layers

Due to the complexity of the client functions, involving a positioning base layer,
camera layer, map layer, alarm layer and target layer, the client that displays the
map requires these layers to exist at the same time without interfering with each
other. It is therefore necessary to design the map hierarchically. Positioning is
performed in real time, and whether a layer needs to be repainted is judged at every
moment. In order to increase efficiency, the entire map is not updated every time;
only the individual layer that changed needs to be updated. The structure of the
positioning map system is shown in Figure 5.

Figure 5. The structure of positioning map system.

Generally, the map layer does not need to change; it is stored in the geographic
database by the ArcGIS drawing tool and released to the terminal through rendering.
Since the positioning target is updated continuously, the target layer needs to update
its data in time. The GIS tool could not meet this requirement, which can be completed
by the SharpMap API.
Due to the constant update of the target layer, re-rendering this layer affects how
the map is displayed on the Web page. To solve this problem, the Canvas technology is
used to draw these real-time updated layers. The <canvas> tag is used to define
graphics, such as charts and other images. In order to display vector graphics on the
client, HTML5 introduced the Canvas element, which exposes a drawing API to the
client, so a script can draw whatever it wants on a canvas.
Most of the Canvas drawing API is not defined on the <canvas> element itself, but on a
"drawing environment" object that is obtained by using the getContext() method of the
canvas. The Canvas API also uses a path representation, but the path is defined by a
series of method calls such as beginPath() and arc(), rather than by a string of
letters and numbers.

4.4 The client of indoor positioning map system

In this paper, the positioning geographic information system mainly includes the
following three functional modules: basic functions, extended functions and
positioning functions [11]. The basic functions include zooming and moving the map,
and their realization mainly relies on OpenLayers. The extended functions, which
include mouse-wheel operation and the regional alarm function, are realized through
JavaScript. The alarm function is used in some special areas of the map: when the
target moves into these areas, the color of its label is changed to raise a regional
alarm. The positioning functions include map display, real-time monitoring and map
switching. The client is made up of several layers, and the map display function
displays different numbers of layers according to the needs. The real-time monitoring
function can monitor the real-time situation of the target. The map switching function
can be set according to the needs of multiple layers, which makes it convenient for
users to choose different real-time positioning spaces. The client structure of the
positioning map system is shown in Figure 6.

Figure 6. Client structure of positioning map system.

The indoor positioning client, as a key part of the whole geographic information
system, determines the accuracy of the positioning display. According to the
application requirements, the client program mainly includes user management,
interactive map operation, real-time positioning monitoring, personnel information
management, positioning equipment management and other functional modules. Different
modules correspond to different sub-pages. Among them, the whole system is controlled
by the user login management module to ensure the security of the system. The
interactive map operation and real-time positioning monitoring modules mainly display
the results and support zooming in or out, moving and other operations. The personnel
information management and positioning equipment management modules mainly maintain
the current information on the positioning of personnel or assets, as well as the add,
delete and modify operations on personnel and asset positioning information. The
structure of each module of the client is shown in Figure 7.

Figure 7. Each module structure of the client.

5 Experimental Verification

The research shows that this scheme can be applied to the design of the whole indoor
positioning geographic information system, which proves that the design can be applied
to the specific environment. This system, with a web browser as the running
environment, eventually displays the data from the geographic database through the
rendering program. The laboratory used as the experimental scene is shown in Figure 8.
In the hall, four CSS base stations are used for the precise positioning test. The
refresh cycle used in the test is 1 s, and the positioning system redraws the position
of each point on the map every 1 s based on the collected information. The experiment
proves that the indoor positioning geographic information system has been put into
practical application and has good interactivity.

Figure 8. The client application of positioning system.



6 Conclusion

According to the needs of large venues for an indoor positioning geographic
information system, this paper studies the concept model and design method of a
Web-based indoor positioning geographic information system. The development of the
system makes full use of GIS software in the Web environment for map layer processing,
combined with the geographic database, networked data collection and other
technologies. At the same time, combined with indoor positioning technology, users can
grasp the real-time positioning information of the indoor space, so as to enhance the
interaction between users and maps. After repeated experiments and tests, the system
can now be released and used. In the future, more functionality will be added to the
indoor positioning geographic information system, so that it can be applied in more
places to meet the different needs of users.

Acknowledgment: The project is supported by the Ministry of Industry and Information
Technology (Research on key application criteria of digital workshops in the field of
sensor manufacturing).

References
[1] L. Zhu, A. Yang, D. Wu, and L. Liu, “Survey of indoor positioning technologies and systems,”
International Conference on Life System Modeling and Simulation and International
Conference on Intelligent Computing for Sustainable Energy and Environment, Springer Berlin
Heidelberg, pp.400-409, September 2014.
[2] L. M. Ni, D. Zhang, and M. R. Souryal, “RFID-based localization and tracking technologies,”
IEEE Wireless Communications, Vol.18(2), pp. 45-51, 2011.
[3] Z. Deng, Y. Yu, X. Yuan, N. Wan, and L. Yang, “Situation and development tendency of indoor
positioning,” China Communications, vol. 10(3), pp. 42-55, 2013.
[4] Q. Q. Xu, “A kind of geographic information system application based on flex API of ArcGIS
Server,” 2010 3rd International Conference on Advanced Computer Theory and Engineering
(ICACTE). IEEE, vol.3, pp. V3-246-V3-249, August 2010.
[5] K Chang, “Introduction to geographic information systems,” Boston: McGraw-Hill Higher
Education, 2006.
[6] M. Bertolotto, M. J. Egenhofer, “Progressive transmission of vector map data over the world
wide web,” Geoinformatica, vol. 5(4), pp. 345-373, 2001.
[7] D. Zhang, J. Ma, Q. Chen, and L. M. Ni, “An RF-based system for tracking transceiver-free
objects,” Fifth Annual IEEE International Conference on Pervasive Computing and Communi-
cations (PerCom’07). IEEE, pp.135-144, March 2007.
[8] M. J. Liang, H. Q. Min, and R. H. Luo, ”Graph-based SLAM: a survey,” Robot, vol. 35(4), pp.
500-512, 2013.
[9] D. Ghosh, R. Guha, “What are we ‘tweeting’about obesity? Mapping tweets with topic
modeling and Geographic Information System,” Cartography and geographic information
science, vol. 40(2), pp. 90-102, 2013.

[10] R. O. Obe, L. S. Hsu, “PostGIS in action,” Manning Publications Co., 2015.


[11] S. Mazuelas, F. A. Lago, J. Blas, and A. Bahillo, “Prior NLOS Measurement Correction for
Positioning in Cellular Wireless Networks,” IEEE Transactions on Vehicular Technology, vol.
58(5), pp. 2585-2591, 2009.
Chun FANG*, Man-feng DOU, Bo TAN, Quan-wu LI
Harmonic Distribution Optimization of Surface Rotor
Parameters for High-Speed Brushless DC Motor
Abstract: Surface rotor parameters of a high-speed brushless DC motor, such as the
pole-arc, salient pole ratio, permanent magnet thickness and air-gap length, affect
the harmonic distribution of the air-gap flux, back-electromotive force and winding
current, and eventually the signal-to-noise ratio of sensorless control and the rotor
eddy current loss. It is therefore necessary to optimize these rotor parameters. The
exact subdomain method and the finite element method are adopted in this paper to
investigate how the harmonic distributions of the air-gap flux and back-electromotive
force are affected by the pole-arc and salient pole ratio. The rotor eddy current loss
and the winding current harmonics with various magnet thicknesses and air-gap lengths
are also studied. Analytical and FEM results indicate that the 3rd harmonic back-EMF
increases with the pole-arc, and that its growth is inversely related to the salient
pole ratio. High-order winding current harmonics rise significantly when the air-gap
length is equal to the magnet thickness. Based on these conclusions, a 2.5 kW,
45000 rpm high-speed BLDCM is designed and tested. Experimental results are consistent
with the analytical and FEM results.

Keywords: High-speed brushless DC motor; pole-arc; salient pole ratio; harmonic


distribution; eddy current loss

1 Introduction

High-speed brushless DC motors (BLDCM) are widely used in aerospace and precision
manufacturing for their high power density and efficiency [1,2]. Due to the influence
of high speed and high loss density on rotor position sensors [3], sensorless control
methods are suitable for high-speed BLDCM control. The most popular sensorless control
methods are based on the phase back-electromotive force (EMF) [4,5] and the 3rd
harmonic back-EMF [6]. In 3rd harmonic back-EMF sensorless control, rotor positions
can be acquired by detecting zero crossings of the 3rd harmonic back-EMF waveform, or
by integrating it and then comparing the result with the actual 3rd harmonic flux
linkage. The advantage of the 3rd harmonic back-EMF is its insensitivity to winding
parameters such as the resistance and inductance [7,8]. However, the phase back-EMF is
determined by

*Corresponding author: Chun FANG, School of Automation, Northwestern Polytechnical University,
Shaanxi Xi'an China, E-mail: chunfang@mail.nwpu.edu.cn
Man-feng DOU, Bo TAN, Quan-wu LI, School of Automation, Northwestern Polytechnical University,
Shaanxi Xi'an China

rotor pole arc [9-11] and PM magnetization [12], and thereby the back-EMF harmonics
can be influenced. A surface permanent magnet with radial magnetization generates a
trapezoidal phase back-EMF that contains a 3rd harmonic component. Surface-mounted
(SPM) and surface-inset permanent magnet (SIPM) rotors are the most popular among
surface PM structures. Compared with SPM, SIPM is more mechanically robust because its
PMs are embedded into the surface of the rotor core, which also makes the salient pole
ratio different: the q-axis inductance Lq of SIPM is larger than Ld, so the salient
effect of SIPM is stronger than that of SPM, whose Lq is approximately equal to Ld.
Analytical magnetic field methods [13-18] are flexible for comparing different motor
topologies. In particular, the subdomain model [15-18] is able to analyze the magnetic
flux distribution of different rotor configurations by applying different boundary
conditions.
The air-gap length of a high-speed motor is normally greater than 1 mm, while the
air-gap in low- and medium-speed motors is usually less than 1 mm. Unlike in low- and
medium-speed motors, in high-speed motors the magnet thickness is not much larger than
the air-gap length, so adjusting their ratio has a significant effect on the
distribution of the air-gap flux density and the stator winding current. Rotor eddy
current losses in PM motors are caused by the air-gap permeance variation generated by
the stator slot-openings and by the spatial and time harmonics in the stator winding
current [19,20]. Since the air-gap length in a high-speed motor is large, the
influence of the slot-opening permeance can be neglected.
In practical motor design, the outer dimensions of the motor are predetermined, so the
air-gap length is limited. However, the ratio of magnet thickness to air-gap length is
quite flexible, and different configurations have different impacts on the rotor eddy
current loss, which needs to be investigated.
The subdomain model and the finite element method (FEM) are adopted to analyze the
harmonic distribution of the air-gap flux density and back-EMF as influenced by the
pole arc ratio and the salient pole ratio. The effect of various combinations of
magnet thickness and air-gap length on the stator winding current and rotor eddy
current loss is investigated as well. Based on the analysis results, a prototype
high-speed BLDCM is designed.

2 Analytical Model

The back-EMF of a BLDCM and its harmonics are proportional to the motor speed, the
air-gap flux density and the winding coefficients, so the 3rd harmonic back-EMF can be
expressed as Equation (1). A higher 3rd harmonic can improve the signal-to-noise ratio
(SNR) of sensorless control.
530   Harmonic Distribution Optimization of Surface Rotor Parameters

Figure 1. Geometry of High-Speed BLDCM

E3 ∝ ωr·Bδ3·k3        (1)

Where E3 is the 3rd harmonic back-EMF amplitude, ωr is the electric angular speed,
Bδ3 is the amplitude of the 3rd harmonic air-gap flux density, k3 is the 3rd harmonic
winding coefficients.
Since the skewed slot is not considered and the winding is full-pitch and single
layer, (1) can be simplified as:

E3 ∝ ωr·Bδ3        (2)
The geometry of the high-speed BLDCM is shown in Figure 1. The motor is divided into
4 subdomains: the PM subdomain, the air-gap subdomain, the slot opening subdomain
and the slot isthmus subdomain. Geometrical parameters are: the inside radius of the
permanent magnet R1,the outside radius of the permanent magnet R2, the inner stator
radius R3, the slot-opening radius R4 and the stator slot bottom radius R5. Pole arc ratio
is αp. p is the number of poles. Current density is Jj in each slot. δ is slot opening angle
and β is the slot angle. The ratio of magnet thickness and air-gap length is defined as SMA.

SMA = (R2 − R1)/(R3 − R2)        (3)

The following assumptions are made in order to simplify the problem:


–– Winding end effects are neglected.
–– Magnets radially magnetized with a relative recoil permeability.
–– Stator and rotor cores are infinitely permeable, magnetic saturation is ignored.
–– Winding current is distributed evenly in stator slots.

The analysis is performed in 2-D polar coordinates. The magnetic vector potential
depends on the r and θ coordinates. The magnetic vector potentials in the subdomains
are defined as below:
Aj = Aj(r,θ)ez, for the jth stator slot isthmus subdomain
Ai = Ai(r,θ)ez, for the ith stator slot-opening subdomain
Aag = Aag(r,θ)ez, for the air-gap subdomain
Apm = Apm(r,θ)ez, for the PM subdomain
ez is the unit vector along the z axis.
 Harmonic Distribution Optimization of Surface Rotor Parameters   531

2.1 Subdomain Model

1. Governing Partial Differential Equations


The polar coordinate system (r, θ) is fixed on the stator. The slot located at (r, 0°)
is defined as the 1st slot, and the angular positions of its slot-opening and slot
isthmus are defined as:
θi = 2iπ/Qs − β/2,   θj = θi − (δ − β)/2        (4)

The position of the kth PM is defined as:

θk = −αp/2 + kπ/p + Δ        (5)

Where k = 1, 2, …, 2p; Δ is the rotor angular position, and Δ = 0° in Figure 1.


With these assumptions, the Poisson equations in polar coordinates for the PM and slot
isthmus subdomains are expressed as (6) and (7), and the Laplace equations for the
air-gap and slot-opening subdomains are (8) and (9).
∂²Apm/∂r² + (1/r)·∂Apm/∂r + (1/r²)·∂²Apm/∂θ² = (μ0/r)·∂Mr/∂θ,   R1 ≤ r ≤ R2,   θk ≤ θ ≤ θk + αp        (6)

∂²Aj/∂r² + (1/r)·∂Aj/∂r + (1/r²)·∂²Aj/∂θ² = −μ0·Jj,   R4 ≤ r ≤ R5,   θj ≤ θ ≤ θj + δ        (7)

∂²Aag/∂r² + (1/r)·∂Aag/∂r + (1/r²)·∂²Aag/∂θ² = 0,   R2 ≤ r ≤ R3,   0 ≤ θ ≤ 2π        (8)

∂²Ai/∂r² + (1/r)·∂Ai/∂r + (1/r²)·∂²Ai/∂θ² = 0,   R3 ≤ r ≤ R4,   θi ≤ θ ≤ θi + β        (9)

Where Mr is the radial component of the magnetization and subscripts pm, j, ag and
i are used for the quantities in the permanent magnet region, slot isthmus region,
air-gap region and slot opening region respectively.

2. Boundary Conditions
Boundary conditions in the air-gap, slot-opening and slot isthmus subdomains of SIPM
and SPM are the same. However, the boundary conditions for the PM subdomain are
different: the region between PM poles is rotor core in SIPM and air-gap in SPM, which
causes a difference in the permeance of the magnetic circuits. With the assumption
that the stator and rotor iron cores are infinitely permeable, the boundary and
continuity conditions of SPM and SIPM in the PM subdomain can be expressed as:
∂Apm/∂r |r=R1 = 0,   Apm(R2, θ) = Aag(R2, θ)        (10)

∂Apm/∂θ |θ=θk = r·(−1)^k·Br,   ∂Apm/∂θ |θ=θk+αp = r·(−1)^k·Br,
∂Apm/∂r |r=R1 = 0,   Apm(R2, θ) = Aag(R2, θ)        (11)

Boundary conditions and continuity in slot isthmus subdomain can be expressed as


equations (12).
∂Aj/∂θ |θ=θj = 0,   ∂Aj/∂θ |θ=θj+δ = 0
∂Aj/∂r |r=R4 = ∂Ai/∂r |r=R4  when θi ≤ θ ≤ θi + β,  and 0 otherwise
∂Aj/∂r |r=R5 = 0,   Aj(R4, θ) = Ai(R4, θ)        (12)

Boundary conditions and continuity in slot-opening subdomain can be expressed as


equations (13).
∂Ai/∂θ |θ=θi = 0,   ∂Ai/∂θ |θ=θi+β = 0
Ai(R4, θ) = Aj(R4, θ),   Ai(R3, θ) = Aag(R3, θ)        (13)

Boundary conditions and continuity in air-gap subdomain can be expressed as
equations (14).
∂Aag/∂r |r=R3 = f(θ),   ∂Aag/∂r |r=R2 = g(θ)        (14)

Where

f(θ) = ∂Ai/∂r |r=R3  for θi ≤ θ ≤ θi + β,  and 0 elsewhere        (15)

g(θ) = (1/μr)·∂Apm/∂r |r=R2  for θk ≤ θ ≤ θk + αp,  and 0 elsewhere        (16)

3. General Solution
The magnetic vector potential in slot isthmus subdomain can be written as:
Aj(r, θ) = A0j + (1/2)·μ0·Jj·(R5²·ln r − r²/2)
        + Σ(m=1) [Amj · (δ·R2/(mπ)) · Pmπ/δ(r, R5)/Emπ/δ(R4, R5)] · cos{(mπ/δ)·[θ − θi − (β − δ)/2]}        (17)

Where m is a positive integer. The coefficient Amj is determined by the Fourier series
expansion of ∂Ai/∂r, and Pc(a, b) = (a/b)^c + (b/a)^c, Ec(a, b) = (a/b)^c − (b/a)^c.
Due to the different boundary conditions of the PM subdomains, the solutions also
differ [17,18], so the coefficient expressions and calculations for SIPM are different
from those for SPM.
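For reference, the auxiliary functions Pc(a, b) and Ec(a, b) that appear in the general solution (17) can be transcribed directly into code; the small sketch below does nothing more than restate their definitions.

def P(c, a, b):
    # P_c(a, b) = (a/b)^c + (b/a)^c, as defined after Equation (17).
    return (a / b) ** c + (b / a) ** c

def E(c, a, b):
    # E_c(a, b) = (a/b)^c - (b/a)^c, as defined after Equation (17).
    return (a / b) ** c - (b / a) ** c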

The radial and tangential components of the magnetic flux density can be deduced from
the magnetic vector potential:

Br(r, θ) = (1/r)·∂A/∂θ,   Bθ(r, θ) = −∂A/∂r        (18)

B(r, θ) = Br(r, θ)·er + Bθ(r, θ)·eθ        (19)
=

2.2 Back-EMF

For the 2-pole motor, the 3rd harmonic magnetic flux linked by each coil can be
expressed as:

φ3(t) = Rav·La·∫ from −δ/2 to δ/2 of [Brj3(r, θ, t) + Bθj3(r, θ, t)] dθ        (20)

Where La is the axial length of the motor, Rav is the average slot isthmus radius, and
θp is the coil pitch.
According to Faraday's law, the 3rd harmonic back-EMF induced in the stator winding is
expressed as:

e3(t) = Ns·ωr·p·φ3(t)        (21)

Where Ns is the number of turns of each phase.

2.3 Eddy current loss

In the PM subdomain, the axial component of the eddy current density satisfies the
Helmholtz equation:

∂²Jzpm/∂r² + (1/r)·∂Jzpm/∂r + (1/r²)·∂²Jzpm/∂θ² − j·n·ω·σm·μ0·μm·Jzpm = 0        (22)

Rotor eddy current loss can be obtained according to the Poynting theorem:

pre = (1/2)·∫S Re(E × H) dS        (23)

Where S is the surface of the computed region.

3 Analysis and Simulation Results

The finite element method (FEM) is adopted to calculate the air-gap flux density
distribution and the phase back-EMF of SIPM and SPM.
 Harmonic Distribution Optimization of Surface Rotor Parameters   535

3.1 Air-gap flux density and EMF

Radial and tangential components of the air-gap flux density for no load condition of
SPM and SIPM (αp =0.85) are shown in Figure 2.
Figure 2. αp = 0.85, radial and tangential components of the air-gap flux density for the no-load condition of SPM and SIPM: (a) flux density radial component of SPM; (b) flux density radial component of SIPM; (c) flux density tangential component of SPM; (d) flux density tangential component of SIPM. (FEM and analytical curves, flux density in T versus rotor position in °.)

As can be seen from Figure 2, the air-gap flux radial and tangential components of
both SPM and SIPM show good consistency between the analytical and FEM results. The
average values of the radial component are 0.558 T and 0.529 T respectively, with SPM
being higher. On the other hand, the q-axis permeance of SPM is small because its
salient pole ratio is 0, whereas that of SIPM is larger. As Figures 2c and 2d show,
the peak of the tangential component is 0.18 T for SPM and 0.21 T for SIPM, the
average values of the tangential component are 0.058 T and 0.060 T respectively, and
the tangential peak of SIPM is about 15% higher than that of SPM.
The air-gap flux radial and tangential components of SPM and SIPM for the load
condition (αp = 0.85) are shown in Figure 3.

Figure 3. αp = 0.85, radial and tangential components of the air-gap flux density for the load condition of SPM and SIPM: (a) flux density radial component of SPM; (b) flux density radial component of SIPM; (c) flux density tangential component of SPM; (d) flux density tangential component of SIPM. (FEM and analytical curves, flux density in T versus rotor position in °.)

The air-gap flux radial components of the two structures for the load condition are
shown in Figures 3a and 3b, with average values of 0.549 T for SPM and 0.545 T for
SIPM. The tangential components, however, display a significant difference at the
q-axis: the peak of SPM is 0.172 T and that of SIPM is 0.212 T, a difference of 18.9%.
Fourier analysis is used to acquire the 3rd harmonic of the air-gap flux density with
various αp for the load condition. The analysis result is shown in Figure 4.
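As an illustration of the Fourier analysis step used to obtain the harmonic amplitudes in Figures 4 to 8, the sketch below extracts selected harmonic amplitudes from a waveform sampled over one electrical period with an FFT; it is a generic post-processing example, not the authors' tooling, and the test waveform amplitudes are chosen only to mimic the magnitudes quoted in the text.

import numpy as np

def harmonic_amplitudes(waveform, orders=(1, 3)):
    # Peak amplitudes of selected harmonics of a signal sampled over one period.
    n = len(waveform)
    spectrum = np.fft.rfft(waveform)
    return {k: 2.0 * abs(spectrum[k]) / n for k in orders}

# Test signal mimicking an air-gap flux density with a 0.55 T fundamental
# and a 0.15 T 3rd harmonic (values chosen to match the magnitudes quoted above).
theta = np.linspace(0.0, 2.0 * np.pi, 1024, endpoint=False)
b_gap = 0.55 * np.sin(theta) + 0.15 * np.sin(3.0 * theta)
print(harmonic_amplitudes(b_gap))   # approximately {1: 0.55, 3: 0.15}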


Figure 4. 3rd harmonic of air-gap flux density of SIPM and SPM with different αp for load condition

As Figure 4 shows, for αp = 0.7, 0.75, 0.8, 0.85, 0.9 and 1, the 3rd harmonic air-gap
flux density of both SIPM and SPM shows a rising trend as αp increases from 0.7 to 1.
The 3rd harmonic air-gap flux density amplitude of SPM is higher than that of SIPM,
and the 3rd harmonic reaches its peak of 0.15 T when the pole arc ratio is 1. When the
pole arc ratio is less than 0.8, the air-gap flux density waveform is close to a
sinusoidal wave and the 3rd harmonic decreases obviously.
The Fourier analysis of the 3rd harmonic back-EMF of SIPM and SPM is presented in
Figure 5. The 3rd harmonic back-EMF also increases with the pole arc, similarly to the
3rd harmonic of the air-gap flux density. With the same αp, the 3rd harmonic back-EMF
amplitude of SPM is higher than that of SIPM, and it reaches its peak of 27.2 V when
the pole arc ratio is 1.

Figure 5. 3rd harmonic back-EMFs of SIPM and SPM with various αp

3.2 Eddy current loss

Fourier analysis results of air-gap flux density, back EMF and load winding current
with various SMA values are shown in Figure 6, Figure 7 and Figure 8 respectively.

Figure 6. Air-gap flux density harmonics with variety of SMA (αp=0.85)

Figure 7. Back-EMF harmonics with various SMA (n=45000rpm)



Figure 8. Load winding current harmonics with various SMA

As Figure 8 shows, the fundamental component of the load current increases as the SMA
value decreases, and the high-order harmonics such as the 5th, 7th and 11th also show
a growing trend.
As Figure 9 shows, both the rotor eddy current loss and the hysteresis loss increase
as the SMA decreases, and the eddy current loss is more sensitive than the hysteresis
loss since it increases faster.

Figure 9. Eddy current and hysteresis loss of rotor with various SMA (n=45000rpm)

4 Experimental Investigation

A 2.5 kW, 45000 rpm high-speed BLDCM was designed based on the harmonic distributions
of the back-EMF and the eddy current loss. The main parameters of the motor are listed
in Table 1, and the prototype is shown in Figure 10.
The harmonic distribution of the winding current under load is shown in Figure 11; the
distribution matches the analytical result.
The sensorless control is based on 3rd harmonic flux linkage estimation. The 3rd
harmonic back-EMF, the estimated 3rd harmonic flux, the estimated rotor flux and the
synthetic flux signal are displayed sequentially in Figure 12. The test result for the
no-load loss is shown in Figure 13.

Figure 10. Prototype of high-speed motor

Table 1. Specification of High-speed BLDCM

Quantity Unit Value

Power kW 2.5
Supply Voltage VDC 270
Max. Current Arms 45
Torque Nm 0.7
Rated Speed rpm 45,000
Max. Speed rpm 60,000
Inductance mH 0.37
PM Structure - Surface Mounted
Pole-arc Ratio - 0.85
SMA - 2.2
Pole-slot Number - 4 poles, 12 slots
Core Material - 270WW20
Magnet Material - Sm-Co 2:17
Enclosure Material - Carbon Fiber

Figure 11. Load winding current harmonic distribution



Figure 12. 3rd harmonic back-EMF and flux

Figure 13. No load loss of prototype



5 Conclusion

The air-gap length of a high-speed BLDCM is normally large, so the motor performance
is more sensitive to the rotor parameters. The subdomain model and FEM are adopted in
this paper to study the influences of the rotor parameters on the flux density,
back-EMF and eddy current loss. The pole arc, salient pole ratio, magnet thickness and
air-gap length are considered. The conclusions are as follows:
1. Once the air-gap length and magnet thickness are determined, the harmonics of the
flux density and EMF are proportional to αp. Due to the salient pole, the air-gap flux
density tangential component of SIPM is more vulnerable to distortion under heavy load
than that of SPM; when the motor is designed to improve the short-term performance by
raising the electrical load, the distortion can become even more serious. The 3rd
harmonic content under a high salient pole ratio is lower, and therefore a lower SNR
is caused.
2. It is assumed that the distance between the rotor core outer radius and the stator
core inner radius is constant; then changing the ratio of magnet thickness to air-gap
length leads to changes in the harmonic distributions of the air-gap flux density and
winding current. The harmonics of the flux density and back-EMF grow with SMA, while
the harmonic amplitudes of the winding current decrease. When SMA = 1 and the air-gap
length is approximately equal to the magnet thickness, the high-order harmonics of the
winding current, for instance the 5th, 7th and 11th, increase significantly, which
also indicates a rise of the rotor eddy current loss.

An optimal design of the 2.5kW 45000rpm high-speed BLDCM is presented with the
improvement of rotor parameters such as the ratio of magnet thickness and air-gap
length and the pole arc ratio. The proposed design is verified by experimental results.

References
[1] D. Gerada, A. Mebarki, N. L. Brown, C. Gerada, A. Cavagnino, and A. Boglietti, “High-Speed
Electrical Machines: Technologies, Trends, and Developments,” IEEE Transactions on Industrial
Electronics, vol. 61, no. 6, pp. 2946-2959, Jun, 2014.
[2] S. G. Burrow, P. H. Mellor, P. Churn, T. Sawata, and M. Holme, “Sensorless operation of a
permanent-magnet generator for aircraft,” IEEE Transactions on Industry Applications, vol. 44,
no. 1, pp. 101-107, Jan-Feb, 2008.
[3] M. Ganchev, C. Kral, and T. M. Wolbank, “Compensation of Speed Dependence in Sensorless
Rotor Temperature Estimation for Permanent-Magnet Synchronous Motor,” Industry
Applications, IEEE Transactions on, vol. 49, no. 6, pp. 2487-2495, 2013.
[4] F. R. Salmasi, T. A. Najafabadi, and P. J. Maralani, “An Adaptive Flux Observer With Online
Estimation of DC-Link Voltage and Rotor Resistance For VSI-Based Induction Motors,” IEEE
Transactions on Power Electronics, vol. 25, no. 5, pp. 1310-1319, May, 2010.

[5] G. Pellegrino, E. Armando, and P. Guglielmi, “Direct-Flux Vector Control of IPM Motor Drives in
the Maximum Torque Per Voltage Speed Range,” IEEE Transactions on Industrial Electronics,
vol. 59, no. 10, pp. 3780-3788, Oct, 2012.
[6] J. X. Shen, Z. Q. Zhu, and D. Howe, “Sensorless flux-weakening control of permanent-
magnet brushless machines using third harmonic back EMF,” IEEE Transactions on Industry
Applications, vol. 40, no. 6, pp. 1629-1636, Nov-Dec, 2004.
[7] J. X. Shen, and S. Iwasaki, “Sensorless control of ultrahigh-speed PM brushless motor using
PLL and third harmonic back EMF,” IEEE Transactions on Industrial Electronics, vol. 53, no. 2, pp.
421-428, Apr, 2006.
[8] J. M. Liu, and Z. Q. Zhu, “Improved Sensorless Control of Permanent-Magnet Synchronous
Machine Based on third-Harmonic Back EMF,” IEEE Transactions on Industry Applications, vol.
50, no. 3, pp. 1861-1870, May-Jun, 2014.
[9] J. T. Shi, X. Liu, D. Wu, and Z. Q. Zhu, “Influence of Stator and Rotor Pole Arcs on Electro-
magnetic Torque of Variable Flux Reluctance Machines,” IEEE Transactions on Magnetics, vol.
50, no. 11, Nov, 2014.
[10] Z. F. Chen, C. L. Xia, Q. Geng, and Y. Yan, “Modeling and Analyzing of Surface-Mounted
Permanent-Magnet Synchronous Machines With Optimized Magnetic Pole Shape,” IEEE
Transactions on Magnetics, vol. 50, no. 11, Nov, 2014.
[11] N. Bianchi, S. Bolognani, A. Faggion, and E. Fornasiero, “Analysis and Experimental Tests of
the Sensorless Capability of a Fractional-Slot Inset PM Motor,” IEEE Transactions on Industry
Applications, vol. 51, no. 1, pp. 224-231, Jan-Feb, 2015.
[12] K. Wang, M. J. Jin, J. X. Shen, and H. Hao, “Study on rotor structure with different magnet
assembly in high-speed sensorless brushless DC motors,” Iet Electric Power Applications, vol.
4, no. 4, pp. 241-248, Apr, 2010.
[13] L. J. Wu, Z. Q. Zhu, D. A. Staton, M. Popescu, and D. Hawkins, “Comparison of Analytical
Models of Cogging Torque in Surface-Mounted PM Machines,” IEEE Transactions on Industrial
Electronics, vol. 59, no. 6, pp. 2414-2425, Jun, 2012.
[14] W. Fei, and P. C. K. Luk, “A New Technique of Cogging Torque Suppression in Direct-Drive
Permanent-Magnet Brushless Machines,” IEEE Transactions on Industry Applications, vol. 46,
no. 4, pp. 1332-1340, Jul-Aug, 2010.
[15] D. Zarko, D. Ban, and T. A. Lipo, “Analytical calculation of magnetic field distribution in
the slotted air gap of a surface permanent-magnet motor using complex relative air-gap
permeance,” IEEE Transactions on Magnetics, vol. 42, no. 7, pp. 1828-1837, Jul, 2006.
[16] Z. Q. Zhu, L. J. Wu, and Z. P. Xia, “An Accurate Subdomain Model for Magnetic Field Computation
in Slotted Surface-Mounted Permanent-Magnet Machines,” IEEE Transactions on Magnetics,
vol. 46, no. 4, pp. 1100-1115, Apr, 2010.
[17] T. Lubin, S. Mezani, and A. Rezzoug, “2-D Exact Analytical Model for Surface-Mounted
Permanent-Magnet Motors With Semi-Closed Slots,” IEEE Transactions on Magnetics, vol. 47,
no. 2, pp. 479-492, Feb, 2011.
[18] T. Lubin, S. Mezani, and A. Rezzoug, “Two-Dimensional Analytical Calculation of Magnetic Field
and Electromagnetic Torque for Surface-Inset Permanent-Magnet Motors,” IEEE Transactions on
Magnetics, vol. 48, no. 6, pp. 2080-2091, Jun, 2012.
[19] K. Atallah, D. Howe, P. Mellor, and D. Stone, “Rotor loss in permanent-magnet brushless ac
machines,” Industry Applications, IEEE Transactions on, vol. 36, no. 6, pp. 1612–1618, Nov
2000.
[20] F. Deng, “Commutation-caused eddy-current losses in permanent-magnet brushless dc
motors,” Magnetics, IEEE Transactions on, vol. 33, no. 5, pp. 4310–4318, Sep 1997.
Sheng-yang GAO*, Xian-yang JIANG, Xiang-hong TANG
Vehicle Motion Detection Algorithm based on Novel
Convolution Neural Networks
Abstract: In order to detect forward moving vehicles using a monocular camera mounted on the car, a moving-vehicle detection algorithm based on a novel convolution neural network is proposed in this paper. The novel convolution neural network is first used to classify the image and extract its features. Intelligent pixel-labeling is then applied according to these features, i.e., pixels belonging to the same feature are labeled with the same pixel value. The novel convolution neural network is able to extract vehicle features more accurately. The numerical results show that the average detection time of the new algorithm is 32 ms, which is better than that of other related algorithms. Moreover, the algorithm overcomes a bottleneck of traditional methods in that vehicles can still be detected at night or when the light is insufficient. The target vehicle can be detected accurately in nearly real time.

Keywords: moving vehicle detection; target detection; convolution neural network; feature extraction

1 Introduction

With the further development of automobile driving systems, many different kinds of sensors are used in current driving systems, such as ultrasound, radar, computer vision, laser, infrared and so on. Among these, computer vision is the mainstream option because of its advantages in price, device size and system complexity. In the field of moving-vehicle detection, the use of a monocular camera mounted on the front of the vehicle has become a hot spot; it relies on certain detection algorithms to accurately detect moving vehicles ahead and to track the target vehicle in real time while driving.
Knowledge-based recognition methods are widely used in monocular-camera computer vision. The knowledge includes symmetry, color, shading, geometric features, texture, lights and other prior information. Single-step detection methods were commonly used in the early stages, such as the symmetry template matching detection method [1] and other methods [2-10]; these match features by merging several knowledge-based characteristics and then detect a forward

*Corresponding author: Sheng-yang GAO, School of Communication Engineering, HangZhou Dianzi University, Hangzhou, China, E-mail: 747579583@qq.com
Xian-yang JIANG, Xiang-hong TANG, School of Communication Engineering, HangZhou Dianzi University, Hangzhou, China

moving vehicle by comparing against a threshold set by the algorithm. A large number of false alarms arise when only simple geometric information is used, triggered for example by signposts or constructions beside the road. With the development of machine learning, detection methods based on appearance [11-12] have gradually been proposed. Machine learning is a good way to improve detection accuracy, but it cannot overcome insufficient light and other environmental problems, and the system initialization time needed for feature extraction is also a bottleneck.
Recently, the knowledge-based approach has been further extended to a two-step detection framework, summarized from [13-17] as follows: 1) extract a Region of Interest (ROI) that is likely to contain a vehicle; 2) verify the ROI. However, this method has the following disadvantages: 1) its real-time performance is poor; 2) the detection system works well in good light conditions but not in the dark.
Using convolution neural networks, the novel vehicle detection algorithm proposed in this paper can successfully detect forward vehicles with a single camera. It obtains very accurate vehicle characteristics so as to separate out the target vehicle accurately. Moreover, the detection remains effective in a high-speed environment.

2 System overview

The moving vehicle detection system consists of three modules, as shown in Figure 1. The first part is the video-source input, in which the system pre-processes the images, including decompression, rotation and removal of cross pictures. The real-time traffic video is captured by a single camera mounted on the front windshield, and each image is converted into a proper format.

[Figure 1 block diagram: Video Source (Pictures) → Hypothesis Generation (Feature Extraction, Process of pixel-labeling, Decoding) → Hypothesis Verification (Image Filtering).]

Figure 1. The framework of vehicle detection and tracking.



The second part is the hypothesis-generation module, in which the ROI is refined by the convolution neural network. The third part of the system is hypothesis verification. The second and third parts together constitute the new vehicle detection algorithm. After the hypothesis is verified, misjudgments caused by houses, road signs and other objects can be excluded so that the position of the target vehicle is confirmed. Meanwhile, the detection accuracy is improved by filtering out glitch noise introduced by the system. The output is the final map, which contains the position information of the vehicle.

3 Moving vehicle detection algorithms

In order to improve the detection accuracy, hypothesis generation is divided into two steps. First, the vehicle's features are extracted through convolution neural network encoding, which classifies vehicles into five categories: cars, sport utility vehicles, trucks, buses and vans. Then, the extracted characteristic pixels are assigned a certain value, so that pixels belonging to the same feature share the same pixel-tag value, which effectively extracts the ROI in an image. Finally, median filtering is applied to verify the ROI and determine the location information of the target vehicle.

3.1 Convolution Neural Networks Structure

The convolution neural network structure used in this paper is shown in Figure 2. The network consists of two parts: the convolution feature-extraction layers and the BP neural network. The number of convolution layers is 5, and the input is a single frame of a video or a single image. The initial data amount of the picture is 1024, since it has previously been resized to 32×32. The image is passed to layer S1, where it is convolved with five different convolution kernels. Five feature maps are thereby obtained, each possibly containing relevant characteristics, and each feature map has size (32-5+1)×(32-5+1) = 28×28; the data amount per map is thus reduced from 1024 to 784. Next, layer C2 down-samples S1 with a pooling size of (2, 2), which further compresses the feature maps to 14×14. The compressed feature maps, layer S3, are again convolved with convolution kernels to obtain layer C4 of size (14-5+1)×(14-5+1) = 10×10. The purpose of the convolution is to weaken the effect of the moving vehicles' displacement by blurring the image. Since the data amount of layer S3 at this stage is still too large, layer C4 is pooled again to obtain layer S5, whose size is 5×5. Layer S5 is reshaped into layer F6, which is the output of this part of the system. Since the output results cover five types of vehicles, output layer F6 consists of 10 feature maps representing the corresponding vehicle types; accordingly, the n in Figure 2 is valued at 10×5×5. Finally, the results of the feature extraction are output sequentially.

[Figure 2 diagram: feature extraction in the convolution layers (Input → S1 → C2 → S3 → C4 → S5 → F6, producing n feature outputs), followed by the BP neural network (input layer of N neurons, hidden layer, output layer of Y neurons).]

Figure 2. The chart of convolution neural networks.

Each image pixel input to a convolution layer is computed with the following function,

$y_{ij} = f_{ks}\big(\{x_{si+\delta i,\; sj+\delta j}\}_{0 \le \delta i,\, \delta j \le k}\big)$   (1)

where k is the kernel size, s is the sub-sampling factor, and $f_{ks}$ determines the type of the layer. Since the computation of a convolution layer depends only on relative spatial coordinates, the data at position (i, j) is recorded as a position vector $x_{ij}$.
The features extracted in layers S1 and S3 are computed with the formula below,

$x_j^l = f\Big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\Big)$   (2)

where $x_j^l$ represents the j-th feature map of the l-th layer, $k_{ij}^l$ is the convolution kernel of the l-th layer, and $b_j^l$ is the bias of the l-th layer.
The convolution kernels share the same weighting parameters to extract the local features of the image. The down-sampling process complies with the following equation,

$x_j^l = f\big(\beta_j^l\, \mathrm{down}(x_j^{l-1}) + b_j^l\big)$   (3)

where $x_j^l$ represents the j-th feature map of the l-th layer, $\beta_j^l$ is the coefficient of the j-th feature map of the l-th layer, $\mathrm{down}(\cdot)$ refers to the down-sampling operation, and $b_j^l$ is the bias. The number of features and characteristics obtained from the convolution and down-sampling processes is kept constant through the BP neural network detection stage; only the size of the image changes.
The convolution layers output the feature data to the BP neural network [18]. The hidden layer of this part has 250 neurons, as does the input layer, and the number of neurons in the output layer is 5; that is, N is 250 and Y is 5 in Figure 2. The structure of a single neuron is described in Figure 3, in which the activation function is as follows,

$S(x) = \dfrac{1}{1 + e^{-x}}$   (4)

[Figure 3 diagram: inputs $x_1, \dots, x_n$ are weighted by $\omega_1, \dots, \omega_n$ and summed as $\xi = \sum \omega_i x_i$; the output is $y = \sigma(\xi)$.]

Figure 3. The structure of a neuron.

Throughout the whole network, the output of each layer is the input of the next layer and is convolved with a kernel obtained by training; the result is then computed with Eq. (3) to produce the output feature map.
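As a rough illustration only (not the paper's trained network), the forward pass of the structure just described can be sketched in Python/NumPy as follows. All kernel and weight values are random placeholders, the helper names (conv2d_valid, avg_pool2x2) are invented for this sketch, and mean pooling with a sigmoid activation is assumed where the paper leaves the choices unstated; the layer sizes follow the 32×32 → 28×28 → 14×14 → 10×10 → 5×5 progression given above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                 # activation of Eq. (4)

def conv2d_valid(img, kernel):
    """2-D 'valid' cross-correlation, standing in for the convolution of Eq. (2)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def avg_pool2x2(fmap):
    """Down-sampling of Eq. (3): 2x2 mean pooling (bias and activation omitted)."""
    h, w = fmap.shape
    return fmap.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((32, 32))                        # pre-processed 32x32 grey frame
k1 = rng.standard_normal((5, 5, 5)) * 0.1           # five illustrative 5x5 kernels
k2 = rng.standard_normal((2, 5, 5)) * 0.1           # two further kernels per map

s1 = [sigmoid(conv2d_valid(image, k)) for k in k1]           # 5 maps, 28x28
c2 = [avg_pool2x2(m) for m in s1]                            # 5 maps, 14x14
s3 = [sigmoid(conv2d_valid(m, k)) for m in c2 for k in k2]   # 10 maps, 10x10
c4 = [avg_pool2x2(m) for m in s3]                            # 10 maps, 5x5
f6 = np.concatenate([m.ravel() for m in c4])                 # n = 10*5*5 = 250 features

# BP stage with illustrative (untrained) weights: N = 250 hidden neurons, Y = 5 outputs
W1 = rng.standard_normal((250, f6.size)) * 0.01
W2 = rng.standard_normal((5, 250)) * 0.01
scores = sigmoid(W2 @ sigmoid(W1 @ f6))             # one score per vehicle class
print(scores)
```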

3.2 Hypothesis Generation by Pixel-Encoding

A convolution neural network decoding system is applied in this paper as in [19], which not only encodes the output characteristics of the image but also assigns the pixel tags. In contrast with down-sampling, the up-sampling operation restores the image to its original size, as in Eq. (5),

$\delta_j^l = \beta_j^{l+1}\big(f'(u_j^l) \circ \mathrm{up}(\delta_j^{l+1})\big)$   (5)

where $\mathrm{up}(\cdot)$ is the up-sampling function and $\beta_j^{l+1}$ is the coefficient of the j-th feature map of the (l+1)-th layer. The algorithm restores the output image to its size before down-sampling by applying the Kronecker operator to the input, so that the input image is copied n times in the horizontal and vertical directions; the expression is

$\mathrm{up}(x) = x \otimes \mathbf{1}_{n \times n}$   (6)
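As a quick illustration of Eq. (6), the Kronecker up-sampling takes two lines of NumPy; the 2×2 factor below is an arbitrary example, not a value fixed by the paper.

```python
import numpy as np

def up_sample(x, n=2):
    """Eq. (6): copy every pixel n times horizontally and vertically."""
    return np.kron(x, np.ones((n, n)))

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(up_sample(x))   # 4x4 map in which each value occupies a 2x2 block
```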

Then the classified image is iterated back so as to obtain the feature maps. The entire detection algorithm is constructed from the convolution neural network and the intelligent pixel-tag coding system, as shown in Figure 4. The objects on the road in the image can be detected in real time with different classification marks by the algorithm, in which the same class of objects is represented by the same pixel value.

Figure 4. The structure of convolution neural networks for semantic pixel-wise labeling.

From the classified image in Figure 5, the target vehicle can be extracted by its specified pixel value; the five categories are cars, trucks, vans, sports utility vehicles and buses. Since these vehicle types are labeled with five different pixel values, the location of the target vehicle can be extracted effectively as a region of interest.

Figure 5. The result classified by convolution neural networks with architecture for semantic pixel-
wise labeling.

3.3 Hypothesis Verification - Refining the Results of Detection

Considering the interference noise introduced by the system and possible misjudgments in detection, a filtering process is applied to verify the ROI as well as to refine the detection.
The median filter is a nonlinear signal-processing technique that can effectively suppress noise. Its basic principle is to replace the value of each pixel of a digital image by the median of the values in its neighborhood, so that isolated noise points are eliminated. The operation uses a two-dimensional (2D) sliding template with a certain structure; the pixels inside the template are sorted into a monotonically increasing (or decreasing) 2D data series as follows,

$g(x, y) = \mathrm{med}\{\, f(x-k,\, y-l),\ (k, l) \in W \,\}$   (7)

where $f(x, y)$ is the original image, $g(x, y)$ is the output image, and W is a 2D template, generally a 3×3 or 5×5 region.
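A minimal sketch of this verification step, assuming SciPy's median_filter and a 3×3 window (the paper only states that W is generally 3×3 or 5×5), could look as follows; the label value 7 and the helper name verify_roi are purely illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def verify_roi(label_map, target_value, window=3):
    """Keep only pixels labeled as the target vehicle class and
    suppress isolated noise points with a median filter (Eq. (7))."""
    roi_mask = (label_map == target_value).astype(np.uint8)
    return median_filter(roi_mask, size=window)

# illustrative 8x8 label map with a small 'vehicle' block and one noise pixel
labels = np.zeros((8, 8), dtype=np.uint8)
labels[2:5, 3:6] = 7          # hypothetical pixel value of the car class
labels[0, 7] = 7              # isolated false detection
print(verify_roi(labels, target_value=7))
```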

The result of the hypothesis verification is shown in Figure 6. It is clear that the
location of the target vehicles has been extracted successfully.

Figure 6. The result of hypothesis verification.

4 Training scheme for the convolution kernels

Since convolution neural networks are used to recognize the target vehicles, the relevant kernels should be extracted before detection is carried out. The Hard C-Means (HCM) algorithm, an unsupervised clustering learning algorithm, is the scheme used to train on the five types of vehicles. Assume a sample set of vehicles $X = \{X_i \mid X_i \in R^P, i = 1, 2, \dots, N\}$, which can be divided into five categories. A matrix U (5×N) can be used to represent the classification results; the element $u_{il}$ of U is as follows,

$u_{il} = \begin{cases} 1 & \text{when } X_l \in A_i \\ 0 & \text{when } X_l \notin A_i \end{cases}$   (8)

where $X_l$ represents a vehicle in the sample set and $A_i$ represents the vehicle class, i.e., A1 for cars, A2 for SUVs, A3 for vans, A4 for large trucks and A5 for buses.
The specific steps of the HCM algorithm are as follows (a code sketch is given after this list):
1) Determine the number of vehicle clustering categories c = 5 and the number of samples N = 250;
2) Considering the differences among the five vehicle categories, set the tolerance error $\varepsilon = 0.01$;
3) Specify the initial classification matrix $U^b$, with b = 0;
4) Compute the center vectors $T_i$ in accordance with $U^b$,

$T_i = \dfrac{\sum_{l=1}^{N} u_{il} X_l}{\sum_{l=1}^{N} u_{il}}$   (9)

5) Update $U^b$ to $U^{b+1}$ in accordance with the predetermined rule,

$u_{il}^{b+1} = \begin{cases} 1 & \text{when } d_{il}^{b} = \min\limits_{1 \le j \le c}\{d_{jl}^{b}\} \\ 0 & \text{otherwise} \end{cases} \quad i = 1, \dots, c;\; l = 1, \dots, N$   (10)

where $d_{il} = \lVert X_l - T_i \rVert$;
6) Compare the updated matrix: if $\lVert U^b - U^{b+1} \rVert \le \varepsilon$, go to step 7); otherwise set b = b + 1 and return to step 4);
7) At this step, the features of the vehicles can be extracted effectively. The link weights $\omega_{ij}$ in the hidden layer are adjusted by an iterative least-squares method, which minimizes the energy function (11) for the input samples $\{X_i \mid X_i \in R^P, i = 1, 2, \dots, N\}$ and the corresponding output samples $\{D_i \mid D_i \in R^q, i = 1, 2, \dots, N\}$,

$E = \dfrac{1}{2N} \sum_{j=1}^{N} \sum_{k=1}^{q} e_{jk}^2$   (11)

$e_{jk} = d_{jk} - f_k(X_j) = d_{jk} - \sum_{i=1}^{M} \omega_{ik}\, G(X_j, T_i)$   (12)

so that the link weights $\omega_{ij}$ become suitable; $\omega_{ij}$ complies with formulas (13) and (14),

$\omega_{ij}^{b+1} = \omega_{ij}^{b} - \eta \dfrac{\partial E}{\partial \omega_{ij}}$   (13)

$\dfrac{\partial E}{\partial \omega_{ij}} = -\dfrac{1}{N} \sum_{j=1}^{N} \sum_{k=1}^{q} e_{jk}\, G(X_j, T_i)$   (14)
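A compact NumPy sketch of steps 1)–6) follows; the sample data, feature length and helper name hcm are illustrative assumptions, not the paper's training set.

```python
import numpy as np

def hcm(samples, c=5, eps=0.01, max_iter=100, seed=0):
    """Hard C-Means clustering: returns centers T (c x P) and the
    hard membership matrix U (c x N) of Eqs. (8)-(10)."""
    rng = np.random.default_rng(seed)
    N = samples.shape[0]
    # step 3: initial hard assignment U^0
    U = np.zeros((c, N))
    U[rng.integers(0, c, size=N), np.arange(N)] = 1.0
    for _ in range(max_iter):
        # step 4: class centers, Eq. (9)
        T = (U @ samples) / np.maximum(U.sum(axis=1, keepdims=True), 1e-12)
        # step 5: reassign each sample to the nearest center, Eq. (10)
        d = np.linalg.norm(samples[None, :, :] - T[:, None, :], axis=2)
        U_new = np.zeros_like(U)
        U_new[np.argmin(d, axis=0), np.arange(N)] = 1.0
        # step 6: stop when the assignment matrix no longer changes
        if np.linalg.norm(U - U_new) <= eps:
            return T, U_new
        U = U_new
    return T, U

# illustrative data: N = 250 samples with a hypothetical feature length of 1024
X = np.random.default_rng(1).random((250, 1024))
T, U = hcm(X)
print(T.shape, U.sum(axis=1))   # cluster centers and cluster sizes
```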

5 Numerical results

A computer simulation system is used, and the results are compared with other related algorithms. The MATLAB experiments are run on a computer configured with an Intel Core i3 dual-core processor (3.3 GHz) and 2 GB of memory. The images used in the experiments were recorded on the road by an HP F800G dashboard camera.

5.1 Training Convolution Neural Networks by Sampling Data

With the method proposed in this paper, all vehicles are divided into five categories, i.e., cars, sports utility vehicles, vans, buses and trucks, and each convolution kernel represents one type of feature extraction. The vehicle sample data set is downloaded from the sample gallery of the MIT Artificial Intelligence Laboratory [20]. Part of the vehicle sample set is exhibited in Figure 7; 30 of the samples shown below were used for training. The experiment has 5 categories of training samples, and training took 8.3 hours in total.

Figure 7. Partial samples of vehicles.

5.2 Single Frame Image Detection

As Figure 8 shows, the location of the vehicle extracted by the new neural network algorithm is apparently accurate. There is essentially no background other than the target vehicle inside the bounding box, and the recognition accuracy reaches 95%.

Figure 8. Vehicle detection of single image.



5.3 Detection Performance Comparison of different Algorithms

Figure 9 shows the tests with different types of vehicle motion detection algorithms. The top of the figure is the image to be detected. Figure 9a shows the output of the appearance-based method of [4] (Apparent Method, AM). Figure 9b demonstrates the detection of the template matching used in [17] (Template Matching Method, TMM). The detection by information fusion [16] (Information Fusion Method, IFM) is illustrated in Figure 9c, and Figure 9d displays the result of the convolution neural networks algorithm proposed in this paper (Convolution Neural Networks Method, CNNM). From Figure 9a, the lack of objective information is clearly apparent. The performance in Figure 9b is better than that in Figure 9a, but the result contains some background information, so the accuracy decreases. Information fusion increases the accuracy, but there is still some background information in Figure 9c. The two-step detection framework combines convolution neural network feature extraction with template matching, which makes it possible to achieve a higher detection accuracy, as shown in Figure 9d.

[Figure 9 panels a–d show the bounding boxes produced by AM, TMM, IFM and CNNM, with the detected box corner coordinates overlaid on each panel.]

Figure 9. Tests of four different vehicle detection algorithms.

5.4 Comparison of Different Results Detected at Night

Figure 10 shows the comparison of the four different algorithms when detecting at night. The top of the figure shows the original images; Figure 10a shows the result of AM, Figure 10b demonstrates the detection of TMM, the detection by IFM is illustrated in Figure 10c, and Figure 10d displays the output of CNNM. In Figure 10a the method only detects the lamps, since it takes the vehicle's headlights as its cue. What is worse, the other two methods cannot effectively detect the vehicle at night, as illustrated in Figures 10b and 10c. The novel detection algorithm achieves target detection and vehicle tracking at night, overcoming the bottleneck of traditional vehicle detection and tracking algorithms, as shown in Figure 10d.


Figure 10. Detection renderings of vehicle at night.

5.5 Performance Comparison

Table 1 shows the accuracy, real-time performance and robustness of three classical vehicle detection algorithms and the proposed algorithm. According to the experimental results, the running time of the proposed algorithm for a single frame is 31.9 ms, which is about 46.83% faster than optical flow detection and about 16.05% faster than adaptive detection by Kalman filter, with much higher accuracy and robustness than the others.

Table 1. Comparison of time consumption, accuracy and robustness of different algorithms

Algorithms Running time (ms) Accuracy Robustness

Optical flow detection 60 medium low

Edge detection 30 low medium

Adaptive detection by Kalman filter 38 medium high

The method in this paper 31.9 high high



A video is further tested with the four different methods in Figure 11. The performance of AM is presented by a1, that of TMM by a2, that of IFM by a3, and a4 illustrates the performance of CNNM. The detection times of the different methods over the first 20 frames of the video are shown in Figure 11. The average detection time of CNNM is about 35 ms, compared with about 35 ms for AM, about 40 ms for TMM, and about 46 ms for IFM. When the number of video frames is expanded to 1000, the average times are about 35 ms for AM, about 40 ms for TMM, about 46 ms for IFM and about 32 ms for CNNM.

[Figure 11 plot: detection time (ms) versus frame index for the first 20 frames, with curves a1 - AM, a2 - TMM, a3 - IFM and a4 - CNNM.]

Figure 11. Detection time comparison chart of four different vehicle detection algorithms.

6 Conclusion and future work

It is difficult to satisfy the requirements of real-time operation and accuracy when traditional methods are used to detect a moving vehicle in different environments. To solve these problems, the convolution neural network system improves the performance by using convolution kernels for five different types of vehicle characteristics. The detection accuracy is improved by training on vehicle characteristics, and the problem of detecting at night or in bad light conditions is successfully solved. The algorithm also has better robustness, and its output is much closer to the effect of human-eye classification. Even so, the achievements of the algorithm mainly contribute to driver assistance; if the detection system were used at very high speed, a serious delay would appear. Thus, the whole framework needs further improvement.

Acknowledgment: This work was supported by the Zhejiang Provincial Natural Science Foundation of China under grant LY14010018.

References
[1] G. Marola, “Using symmetry for detecting and locating objects in a picture,” Computer Vision,
Graphics, and Image Processing, vol. 46, no. 2, pp. 179-195, 1989.
[2] X. Hu, Y. Qi, X. Shen, “A Real-Time Anti-Aliasing Shadow Algorithm Based on Shadow Maps,” in
Proc. IEEE Conf. Pattern recognition, pp. 1-5, 2008.
[3] C. A. Pagot, J. D. Comba, M. M. De Oliveira Neto, "Multiple-depth shadow maps," in Proc. IEEE 17th Brazilian Symp. Computer Graphics and Image, pp. 308-315, Oct. 2004.
[4] A. Borza, S. Saito, “Eye Tracked Shadow Maps,” in Proc. 3rd Global Conf., pp. 757-761, Oct.
2014.
[5] N. D. Matthews, et al, “Vehicle detection and recognition in greyscale imagery,” Control
Engineering Practice, vol.4, no.4, pp. 473-479, 1996.
[6] Y. Li, B. Tian, Q. Yao, “Vehicle detection based on the AND–OR graph for congested traffic
conditions,” IEEE Trans. On Intell. Trans. Syst., vol. 14, no.2, pp. 984-993, 2013.
[7] R. K. Satzoda, M. M. Tricedi, “Multipart Vehicle Detection Using Symmetry-Derived Analysis and
Active Learning,” IEEE Trans. On Intell. Trans. Syst., vol. 17, no.4, pp. 926-937, 2016.
[8] L. C. Leon, J. R. Hirata, “Vehicle detection using mixture of deformable parts models: Static and
dynamic camera,” in Proc. IEEE Conf. On Graphics, Pattern and Images, pp. 237-244, 2012.
[9] S. Sun, et al, “Real-time vehicle detection using Haar-SURF mixed features and gentle AdaBoost
classifier,” in Proc. 27th Chinese Control and Decision Conf., pp. 1888-1894, 2015.
[10] C. T. Hsieh, et al, "A real-time mobile vehicle license plate detection and recognition for vehicle monitoring and management," in IEEE Conf. Pervasive Computing, pp. 197-202, Dec. 2009.
[11] Z. Sun, R. Miller, G. Bebis, et al, “A real-time precrash vehicle detection system,” in Proc. IEEE
Workshop Applications of Computer Vision, pp. 171-176, 2002.
[12] Y. Li, B. Tian, et al, “Vehicle detection based on the deformable hybrid image template,” in Proc.
IEEE Int Conf. Vehicular Electronics and Safety, pp. 114-118, 2013.
[13] Z. Sun, G. Bebis, R. Miller, “Monocular precrash vehicle detection: features and classifiers,”
IEEE Trans. on Image Proc. Vol. 15, no. 7, 2019-2034, 2006.
[14] J. Jin, D. Kim, J. H. Song, et al, “Hardware architecture design for vehicle detection using a
stereo camera,” in Proc. IEEE 11th Int. Conf. Control, Automation and Systems, pp. 1761-1765,
2011.
[15] R. Okada, Y. Taniguchi, K. Furukawa, et al, “Obstacle detection using projective invariant and
vanishing lines,” in Proc. IEEE Int. Conf. Computer Vision, pp. 330-337, 2003.
[16] J. Cui, F. Liu, Z. Li, et al, “Vehicle localisation using a single camera,” in IEEE Int. Vehicles Symp.,
vol. 1, no. 1, pp. 871-876, 2010.
[17] N. Yu, P. Jiao, Y. Zheng, “Handwritten digits recognition base on improved lenet5,” in IEEE 27th
Chinese Control and Decision Conf., pp. 4871-4875, May, 2015.
[18] Y. Xu, H. Zhang, “Application of improved BP algorithm in vehicle license plate recognition,” on
IEEE Int. Joint Conf., Artificial Intelligence, pp. 514-517, April, 2009.
[19] V. Badrinarayanan, A. Kendall, R. Cipolla, “Segnet: A deep convolutional encoder-decoder
architecture for robust semantic pixel-wise labelling,” arXiv preprint arXiv:1511.00561, pp. 1-5,
2015.
[20] http://groups.csail.mit.edu/vison/welcom.
Yong LI, En-de WANG, Zhi-gang DUAN, Hui CAO*, Xun-qian LIU
The Bank Line Detection in Field Environment Based
on Wavelet Decomposition
Abstract: This paper addresses the problem of extracting the river bank line from field-environment operation images based on monocular vision, and proposes a detection method based on the wavelet transform. The method uses the edge information of images at different scales obtained by wavelet decomposition: the riparian region is estimated and the river bank line is then fitted with the edge information. Extensive experiments show that this algorithm can detect the river bank line in the field environment and that it is more effective than the Hough transform method.

Keywords: Field environment, river bank line detection, wavelet transform, least
square fitting

1 Introduction

Achieving autonomous detection and localization for water-surface mobile robots has become a new direction and hot spot in robotics research in recent years, and its key technology is the detection method [1]. The water-surface mobile robot, a kind of mobile robot also called an Unmanned Surface Vessel (USV), is an unmanned surface craft intended mainly for tasks that are dangerous or unsuitable for manned boats [2]. This kind of robot has broad application prospects in mobile water-quality inspection, emergency response to unexpected pollution incidents, maritime search and rescue, navigation, and hydrographic surveying. In the military field in particular, the water-surface mobile robot can be used for reconnaissance, search, mine detection and clearance, anti-submarine warfare, counter-special-operations patrols, anti-piracy and counter-terrorism. It can perform a variety of wartime and non-wartime military tasks and will play an important role in possible future conflicts at sea [3].

*Corresponding author: Hui CAO, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning, 110016, China
Yong LI, Zhi-gang DUAN, Hui CAO, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, Liaoning, 110016, China
Yong LI, Xun-qian LIU, College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, 110819, China
Yong LI, En-de WANG, Hui CAO, Key Laboratory of Optical Electrical Image Processing, Chinese Academy of Sciences, Shenyang, Liaoning, 110016, China

Detection in river-bank images based on monocular vision has always been the basis for the multi-information-fusion detection and obstacle avoidance of the water-surface mobile robot [4]. At present, vision-based river and coastline detection methods at home and abroad are generally divided into threshold segmentation methods [5], edge detection methods [6,7], region growing methods [8], texture analysis methods [9] and active contour model methods [10,11]. The effect of those methods on riparian images in the field operation environment is not satisfactory, since such images contain considerable interference; in particular, tree reflections strongly influence region-based methods, threshold segmentation and active contour methods, so that it is difficult to obtain accurate bank line information. In addition, there is little work at home or abroad on bank line detection for the field work environment. In order to obtain bank line information from bank images in the field operation environment more accurately and more robustly, this paper uses the characteristics of the different direction components after wavelet decomposition and obtains an image of horizontal features through three wavelet decompositions. It then determines the shoreline area and finally extracts the river bank line by least-squares fitting, using the edge information of images of different sizes at the large scale.

2 The introduction of the algorithm

As shown in Figure 1, the algorithm of this paper is divided into the following steps: (1) obtain the gray image, (2) decompose the gray image with the wavelet three times, (3) extract the edge images of different sizes at the large scale, (4) obtain the bank line region, (5) extract the accurate bank line.

[Figure 1 flowchart: input image → gray image → wavelet decomposition at the original size and at a small size → edge images → decision rules → edge region of the shoreline → shoreline image.]

Figure 1. The flowchart of the proposed algorithm.



2.1 The three wavelet decomposition

According to the Mallat algorithm, a two-dimensional image is broken by the wavelet transform into a series of sub-images that contain specific scale and direction information. The high-frequency sub-images of different directions can reflect abrupt changes of the image in those directions [12], and texture directions can be analyzed separately [13]: because each sub-image after wavelet decomposition is obtained by a low-pass or high-pass filter, it possesses noise filtering and direction selectivity. Images with directional texture behave differently in the high-frequency components of different directions, which separates them from directionless texture [14,15]. As shown in Figure 1, the input image is a field-operation environment image. The river in the image is of a corrugated type, and the corrugation is mainly in the horizontal direction. The sky region is a flat area, while riparian zones have abrupt changes in both the horizontal and vertical directions. Therefore, in this paper we consider using the wavelet decomposition method to analyze the banks in the images, and then use the resulting image to analyze the position of the bank line.
First of all, this paper uses the Matlab toolbox to construct the decomposition filters and obtain the filter coefficients, that is, the parameters of the wavelet:
The low-pass filter coefficients: Lo_D = [0.0000, 0.0625, 0.2500, 0.3750, 0.2500, 0.0625, 0.0000, 0.0000];
The high-pass filter coefficients: Hi_D = [-0.00008, 0.01643, 0.10872, 0.59261, 0.59261, 0.10872, 0.01643, 0.00008];
According to these parameters, the wavelet function F can be constructed. Then the input image I in Figure 1 is converted to the gray image Ig.
Next, the constructed wavelet function F is used to decompose the gray image to obtain the four components of the first decomposition, that is [LL, LH; HL, HH]. Among them, HL occupies the lower-left corner of the matrix obtained by the row and column wavelet transforms, i.e., the vertical detail part; similarly, LH occupies the upper-right corner of that matrix, i.e., the horizontal detail part. After one level of wavelet decomposition, the four component matrices are defined, from top to bottom and left to right, as the average part, the horizontal details, the vertical details and the diagonal details.
At the size of the original image, the wavelet function F is used to decompose the image Ig to obtain the average-part component LLb, as shown in Figure 2a. F is also used to decompose the down-sampled image Ig to obtain the average-part component LLs of the small-size image, as shown in Figure 2b.

Figure 2. Average component of wavelet decomposition: (a) average component at the original size, (b) average component at the small size.

Then, the wavelet is used to decompose the average-part components of the original size and of the sampled size, obtaining the average part at the next level, and the wavelet decomposition is continued; that is, the image Ig is decomposed three times with the wavelet. The results of the three wavelet decompositions of the sample are shown in Figure 3. In the final display, HL is in the upper-right corner of the layout and LH is in the lower-left corner.

Figure 3. Three wavelet decomposition results

After three wavelet decompositions, different components at the large scale can be obtained, and the river differs considerably from the bank in the horizontal component. Therefore, this paper chooses the large-scale horizontal details for the following research; that is, it processes the large-scale horizontal-detail images of different sizes obtained by decomposing the input image I shown in Figure 1.
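The paper builds its own decomposition filters in the Matlab wavelet toolbox; purely as an illustration of the same three-level decomposition and extraction of the large-scale horizontal detail, the step can be sketched with PyWavelets, using a standard biorthogonal wavelet ('bior2.2') as a stand-in for the custom filter coefficients listed above.

```python
import numpy as np
import pywt  # PyWavelets

def horizontal_detail_level3(gray):
    """Decompose a gray image three times and return the coarsest-level
    horizontal-detail sub-band, analogous to Section 2.1."""
    coeffs = pywt.wavedec2(gray, wavelet='bior2.2', level=3)
    # coeffs[0] is the level-3 average part (LL);
    # coeffs[1] is the (horizontal, vertical, diagonal) detail tuple at level 3.
    h3, v3, d3 = coeffs[1]
    return h3

gray = np.random.default_rng(0).random((280, 680))   # normalized image size used in the paper
small = gray[::2, ::2]                                # down-sampled (small-size) copy
print(horizontal_detail_level3(gray).shape, horizontal_detail_level3(small).shape)
```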

2.2 The extraction of the edge under the large scale

After the three wavelet decompositions, horizontal details of different dimensions are available, as shown in Figure 4a. Since the bank line runs in the horizontal direction, the bank area and the non-riparian area of the horizontal-detail image show obvious gradient changes, so the bank line area can be obtained from the edges of the image, as shown in Figure 4a.
Figure 4b shows the edge images extracted by the Canny operator from the large-scale horizontal-detail images of the original size and the small size; Figure 4c shows the horizontal edge images extracted from the same horizontal-detail images.
In Figures 4b and 4c, at the original size the edge information extracted by the Canny operator is somewhat redundant, whereas the horizontal edge image filters out many edges of the non-riparian zone; at the small size, the horizontal edge image loses part of the boundary information of the lower river area, while the small-size Canny edge information is relatively complete.
Therefore, in this paper we use the horizontal edges of the large-scale original-size horizontal-detail image to obtain the edge image Eh, and for the small-size horizontal-detail image we use the Canny operator to obtain the edge image Ec.
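A rough OpenCV sketch of the two edge images is given below, under the assumption that a thresholded Sobel vertical gradient is an acceptable stand-in for the paper's (unspecified) horizontal-edge operator and that typical Canny thresholds are used; both choices are assumptions, not values from the paper.

```python
import cv2
import numpy as np

def canny_edges(detail):
    """Ec: Canny edges of a (small-size) horizontal-detail image."""
    img = cv2.normalize(detail, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.Canny(img, 50, 150)

def horizontal_edges(detail, thresh=30):
    """Eh: keep only strong horizontal edges (large vertical gradient)."""
    img = cv2.normalize(detail, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)   # derivative along y
    return (np.abs(gy) > thresh).astype(np.uint8) * 255

detail = np.random.default_rng(0).random((280, 680))  # illustrative detail image
print(canny_edges(detail).shape, horizontal_edges(detail).sum())
```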

Figure 4. (a) Horizontal-detail images, (b) Canny edge images, (c) horizontal edge images.

2.3 The extraction of the region of the bank line

From Figure 4, the bank line generally produces a considerable number of horizontal edges, so the area of the bank line can be determined by judging the number of edges:
(1) Obtain the average value of the vertical coordinates of the edge-concentrated area of the small-size edge image.
Since the edge image is a binary image, sum and average the edge image in the vertical direction, as shown in formula (1),

$Y_m = \mathrm{mean}(\mathrm{sum}(E_c))$   (1)

where mean(·) is the mean-value function and sum(·) sums the pixels of the image in the vertical direction.

(2) Obtain the proportional threshold parameter w of the fitting area.
According to formula (2), we obtain the ratio w1 of edge points whose vertical coordinate is greater than the average Ym, and the ratio w2 of edge points whose vertical coordinate is less than Ym,

$w_1 = \mathrm{sum}(E_c > Y_m) / \mathrm{sum}(E_c), \qquad w_2 = \mathrm{sum}(E_c < Y_m) / \mathrm{sum}(E_c)$   (2)

The threshold w is then determined according to the following formula,

$w = \begin{cases} 0.65 & (w_1 - w_2 < 0.2) \\ \max(w_1, w_2) & \text{otherwise} \end{cases}$   (3)
(3) Determine the edge region to be fitted (a code sketch of steps (1)–(3) is given below).
Firstly, according to formula (4), obtain the threshold line Ymm to be fitted,

$Y_{mm} = (1 - w) \times \mathrm{mean}(E_{cY}(E_c > Y_m)) + w \times \mathrm{mean}(E_{cY}(E_c < Y_m))$   (4)

where $E_{cY}(\cdot)$ denotes the vertical coordinates of the edge points that satisfy the condition in the brackets.
Then, restore the threshold line to the scale of the original small-size image, that is,

$Y_{st} = \mathrm{floor}(Y_m \times 2^3)$   (5)

where floor(·) is the integer-truncation function.
Finally, determine the fitting area Ir according to this threshold line; Ir depends on the X and Y axes as follows: the range in the Y direction is from (Yst − a) to (Yst + b), and the range in the X direction is the entire horizontal axis. Because the size of the images used in this paper is normalized to 280×680, this paper takes a = 20, b = 30, and the entire horizontal axis is 680.
Then the edge image Ibr of the fitting area (the edge region of the shoreline shown in Figure 1) is

$I_{br} = I_r \,.\!*\, E_h$   (6)

where .* denotes element-wise multiplication.
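A hedged NumPy sketch of steps (1)–(3) follows. It interprets Ym as the mean vertical coordinate of the edge pixels (consistent with how Ym is used in Eqs. (2) and (4)), and it keeps Eq. (5) exactly as written, scaling Ym rather than Ymm; the synthetic edge images are illustrative only.

```python
import numpy as np

def bank_line_region(Ec, Eh, a=20, b=30):
    """Eqs. (1)-(6): locate the band of rows likely to contain the bank
    line and mask the horizontal-edge image Eh.  Ec is the small-size
    binary Canny edge image, Eh the original-size binary horizontal-edge
    image (both 0/1 arrays)."""
    ys, _ = np.nonzero(Ec)                        # vertical coordinates of edge points
    Ym = ys.mean()                                # Eq. (1), edge-concentration centre
    w1 = np.mean(ys > Ym)                         # Eq. (2)
    w2 = np.mean(ys < Ym)
    w = 0.65 if (w1 - w2) < 0.2 else max(w1, w2)  # Eq. (3)
    Ymm = (1 - w) * ys[ys > Ym].mean() + w * ys[ys < Ym].mean()  # Eq. (4)
    Yst = int(np.floor(Ym * 2 ** 3))              # Eq. (5), as written in the paper
    Ir = np.zeros_like(Eh)
    Ir[max(Yst - a, 0):Yst + b, :] = 1            # fitting band over the full width
    return Ir * Eh                                # Eq. (6): Ibr

rng = np.random.default_rng(0)
Ec = (rng.random((35, 85)) > 0.9).astype(np.uint8)    # small-size edge image (illustrative)
Eh = (rng.random((280, 680)) > 0.9).astype(np.uint8)  # original-size horizontal edges
print(bank_line_region(Ec, Eh).sum())
```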

2.4 The extraction of the bank line

Based on least-squares fitting of the edge point set of the fitting area Ibr, the bank line can be obtained, as shown in Figure 1:

$y = k \times x + d$   (7)
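The least-squares fit of Eq. (7) can then be written in a few lines with NumPy's polyfit, used here purely as an illustration of the fitting step:

```python
import numpy as np

def fit_bank_line(Ibr):
    """Least-squares line y = k*x + d through the edge points of Ibr."""
    ys, xs = np.nonzero(Ibr)
    k, d = np.polyfit(xs, ys, deg=1)
    return k, d

Ibr = np.zeros((280, 680)); Ibr[150, 100:600] = 1   # a perfectly horizontal edge
print(fit_bank_line(Ibr))                           # approximately (0.0, 150.0)
```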

3 The result and analysis of the experiment

In this paper, riparian images of the field environment under 20 different conditions are used as samples, and the images are normalized to a size of 680×280. The simulation experiments were carried out with Matlab 2014(a) on an ordinary 3 GHz PC running Windows 7. To show the performance of this algorithm, Figure 5 presents some of the bank lines extracted from bank images under different field work environments. Among them, the red lines in the second row of Figures 5a, 5b and 5c are the bank lines extracted by our method (least-squares fitting), and the red lines in the third row are the bank lines extracted by the Hough transform method. According to the comparison, the least-squares fitting method is closer to the real condition, and its effect is better than that of the Hough transform method.

Figure 5. The river bank detection results



In Figure 5a, banks with only partial tree cover are shown. In Figure 5b, banks partially covered by trees or buildings, or occupying only a small area, are shown. In Figure 5c, the results when the bank is reflected in water of a similar color are displayed. As Figure 5 shows, the algorithm of this paper can detect the location of the bank line in the field operation environment relatively accurately, and its results suppress the influence of the complex natural environment on target detection.
Above all, the experimental results can be applied to monocular-vision-based landing prediction for the water-surface mobile robot, to collision detection with the river bank during obstacle avoidance, and so on.

4 Conclusion

Riparian images taken during field-environment operation are complex and contain many interference factors; in particular, tree reflections make river bank line extraction difficult. This paper presents a detection method based on three wavelet decompositions. It extracts edges from decomposition images of different scales, and a riparian-region estimation and bank-line fitting method using the edges of images of different sizes is proposed. Simulation experiments show that the method detects the river bank in field-operation-environment images more accurately and robustly, and the extracted river-line results can support landing prediction, collision detection for a mobile robot operating on the water, and similar applications.

Acknowledgment: The authors would like to thank Yangjie Wei of Northeastern University for her dataset and valuable suggestions on this paper. This work was supported by the National Natural Science Foundation of China (No. 61175031) and the Young Teachers' Independent Research Program of Yanshan University (Class A, No. 15LGA014).

References
[1] Liu Qian. Researches of the Target Detective and Locative Methods for the Surface Mobile
Robot. CA, Shenyang: Shenyang Ligong University, 2013.
[2] Yan Rujian, Pang Shuo, and Sun Hanbing,et. al, “Development and Missions of Unmanned
Surface Vehicle”,Journal of Marine Science and Application, vol 09, Mar. 2010, pp.451-457.
doi:10.1007/s11804-010-1033-2
[3] Wu Gongxing,”Control and Simulation of water surface intelligent high speed unmanned craft”.
CA Ha’erbin: Harbin Engineering University,2008. doi:10.7666/d.y1436040.
[4] Wang Bo,Su Yumin,Wan Lei,Zhuang Jiayuan,Zhang Lei. “Sea Sky Line Detection Method of
Unmanned Surface Vehicle Based on Gradient Saliency”, Acta Optica Sinica, 2016, 36(5):
0511002.

[5] Joo-Hyung Ryu,Joong-Sun Wan,Kyung DuckMin.”Water Extraction from Landsat


TM Data in a Tidal Flat :A case Study in Gomso Bay”,Korea.Remote Sensing of
Environment,2002,83(3):442~456.
[6] Liu Hongxing, Jezek K C. Automated extraction of coastline from satellite imagery by integrating
canny edge detection and locally adaptive thresholding methods. Int J Remote Sensing,
2004,25(5):937-958.
[7] XIAO Chuanmin, SHI Zelin, XIA Renbo, WU Wei. Edge-Detection Algorithm Based on Visual
Saliency. Information and control, 2014, 43(1): 9-13. DOI:10.3724/SP.J.1219.2014.00009.
[8] Kevin White,Hesham M,El Asmar.Monitoring changing position of coastlines using Thematic
Mapper imagery, an example from the Nile Delta [J]. Geomorphology 29(1999)93–105.
[9] Zhou Yanan,Zhu Zhiwen,Shen Zhanfeng,Cheng Xi, “Automatic Extraction of Coastline from
TM Image Integrating Texture and Spatial Relationship”, Journal of Peking University( Natural
Science), 2012,48(2):273-279.
[10] Yue Ouyang,Jinsong Chong, Yirong Wu Two coastline detection methods in Synthetic
Aperture Radar imagery based on Level Set Algorithm[J].International Journal of Remote
Sensing,2010,31:4957-4968.
[11] LÜ Yongli, JIANG Bin, BAO Jianrong. Efficient Wavelet Image Inpainting Algorithm Based on
Pixel Weight Values. Information and control, 2015, 44(1): 104-109. DOI: 10.13976/j.cnki.
xk.2015.0104.
[12] Wei Ying, Tong Guofeng, Shi Zelin, et al. A target detection method based on a new multi-scale
fractal feature[J]. Journal of Northeastern University (Natural Science),2005, 26( 11): 1062-1065.
[13] Wei Ying, Wang Xiaozhe et al. Target Detection Method Based on Wavelet Multi-scale Extended
Fractal Feature [J]. Journal of Northeastern University( Natural Science),2006,27(11): 1185-1188.
[14] H. Yang, J. Zhang, Y. Ji, Y. He, and Y. Lee, “Experimental demonstration of multi-dimensional
resources integration for service provisioning in cloud radio over fiber network,” Scientific
Reports, vol. 6, 30678, Jul. 2016.
[15] H. Yang, J. Zhang, Y. Zhao, Y. Ji, J. Han, Y. Lin, and Y. Lee, “CSO: Cross Stratum Optimization for
Optical as a Service,” IEEE Communications Magazine, vol. 53, no. 8, pp. 130-139, Aug. 2015.
Xiao-ming LI, Jia-yue YIN*, Hao-jun XU, Chengqiong BI, Li ZHU,
Xiao-dong DENG, Lei-nan MA

Research on Measurement Method of Harmonics Bilateral Energy
Abstract: With the fast growth of the national economy and the development of science and technology, more and more non-linear power loads are connected to the grid, the harmonic content of grid voltages and currents is increasingly high, and this has become a very acute problem. Multiple harmonic sources in the grid usually act together, so it is very difficult to apportion the pollution liability among harmonic sources. Clarifying the harmonic pollution liability of each harmonic source is closely related to the measurement method for harmonics bilateral energy. To solve this problem, this paper proposes a measurement method for harmonics bilateral energy: by studying each harmonic source working individually in the grid, the harmonic pollution is calculated and analyzed. In addition to measuring the harmonic energy of the source itself, the harmonic energy it generates in the power grid and in other loads is also measured, which can provide a basis for reasonable electricity charges. A digital simulation is created to show the correctness and feasibility of the method. The method has guiding significance for the design and manufacture of new energy meters that can measure harmonics bilateral energy.

Keywords: Harmonic Source; Harmonic Measurement; Bilateral Measurement; Digital Simulation.

1 Introduction

A power system harmonic is a periodic sinusoidal electrical quantity whose frequency is an integer multiple of the fundamental frequency [1]. In the power system, a nonlinear load consumes fundamental-wave power from the system and feeds harmonic power back into it, while linear loads are forced to absorb harmonic power from the system's various harmonic sources. Because there are many harmonic sources in the system, harmonic voltages and harmonic currents pollute the system voltages and currents and reduce the power quality [2,3]. Today's electric energy measurement methods calculate the

*Corresponding author: Jia-yue YIN, Power System and Automation, School of Electrical Engineering,
Wuhan University, Wuhan, China, E-mail: 1162484454@qq.com
Xiao-ming LI, Hao-jun XU, Li ZHU, Cheng-qiong BI, Xiao-dong DENG, Power System and Automation,
School of Electrical Engineering, Wuhan University, Wuhan, China
Lei-nan MA, Substation Attendant, State Grid Hubei Electric Power Company in Huangshi power
company, Huangshi, China

user's active power consumption based on the total waveform distortion [4]. This paper studies the measurement method for harmonics bilateral energy and puts forward a method that can accurately calculate the harmonic power that each source contributes to every branch. This method is shown to be feasible and effective.

2 Research on Model of harmonics bilateral energy

Bilateral refers to the direction of power flow: power flowing from the power system to the load is taken as the positive direction, and power flowing from the load to the grid as the negative direction. If both the power flowing into and out of a load are measured, the metering mode is called the measurement method of harmonics bilateral energy. In this method, not only is the fundamental-wave power consumption measured, but for a harmonic-source branch the harmonic power generated in its own branch as well as the harmonic power it injects into other load branches is also calculated; for an ordinary load branch, in addition to its fundamental-wave power consumption, the harmonic power it generates and the harmonic power produced in the branch by external harmonic sources are measured. Figure 1 represents such a power system.

[Figure 1 diagram: the system voltage source Us feeds several harmonic-source branches and linear-load branches connected in parallel.]

Figure 1. Equivalent diagram of the power system

US represents the system voltage source; the load can be taken as equivalent linear loads and nonlinear loads connected in parallel, and the nonlinear loads are the harmonic sources in the power system.
Figure 2 is equivalent to Figure 1 and is the circuit diagram of the bilateral energy measurement model of the power system.

[Figure 2 circuit: the source Us with impedance Zs feeds nodes a–b, from which x load branches with impedances Z1 … Zx hang in parallel; branches 1 … m contain harmonic-source voltages U1k … Umk, and Pj(k) and Ij denote the power and current of branch j.]

Figure 2. The model of bilateral energy measurement



Suppose n is the highest harmonic order in the system and k represents a particular harmonic, k = 2, 3, …, n; the system has x load branches and j represents a particular load branch, j = 1, 2, …, x; the number of harmonic sources in the system is m and i represents a particular harmonic source, i = 1, 2, …, m. When the harmonics act, the superposition theorem can be applied: the harmonic power of each branch corresponds to the superposition of the effects of each harmonic voltage source acting alone [5,6].

$U_{ab(k)} = \dfrac{\sum_{j=1}^{m} \dfrac{U_{j(k)}}{Z_{j(k)}}}{\dfrac{1}{Z_{s(k)}} + \sum_{j=1}^{x} \dfrac{1}{Z_{j(k)}}}$   (1)

In Figure 1, $U_{ab(k)}$ represents the voltage between points a and b at the k-th harmonic, $U_{j(k)}$ represents the harmonic-source voltage of load branch j at the k-th harmonic, $Z_{j(k)}$ represents the impedance of load branch j at the k-th harmonic, $Z_{j(k)} = R_{j(k)} + jX_{j(k)}$, and $Z_{s(k)}$ represents the equivalent impedance of the system source at the k-th harmonic, $Z_{s(k)} = R_{s(k)} + jX_{s(k)}$.

$I_{j(k)} = \dfrac{U_{ab(k)} - U_{i(k)}}{Z_{j(k)}}$   (2)

$I_{j(k)}$ represents the harmonic current of load branch j at the k-th harmonic, and $U_{i(k)}$ represents the voltage of the harmonic source contained in the branch, i = 1, 2, …, m.

$P_{j(k)} = I_{j(k)}^2 R_j$   (3)

$P_{j(k)}$ represents the active power consumed by the load of branch j at the k-th harmonic.
According to the superposition theorem, when the system voltage source Us works alone, Figure 1 can be modeled as shown in Figure 3.

[Figure 3 circuit: the branches Z1 … Zx in parallel across nodes a–b, driven by the system source Us through Zs, with the harmonic sources inactive.]

Figure 3. The circuit of the system source working alone

In the same way as for Figure 2,

$\dfrac{U_s}{Z_{s(1)}} = U_{ab(1)} \left( \dfrac{1}{Z_{s(1)}} + \sum_{j=1}^{x} \dfrac{1}{Z_{j(1)}} \right)$   (4)

The system voltage source Us produces only the fundamental. $Z_{s(1)}$ represents the equivalent impedance of the system voltage source at the fundamental frequency, $Z_{s(1)} = R_{s(1)} + jX_{s(1)}$, $Z_{j(1)}$ is the impedance of load branch j at the fundamental frequency, $Z_{j(1)} = R_{j(1)} + jX_{j(1)}$, and $U_{ab(1)}$ represents the fundamental voltage between points a and b.
$I_{j(1)} = \dfrac{U_{ab(1)}}{Z_{j(1)}}$   (5)

$I_{j(1)}$ represents the current of load branch j under the fundamental-wave source.

$P_{j(1)} = I_{j(1)}^2 R_j$   (6)

$P_{j(1)}$ represents the active power consumption of load branch j under the fundamental source.

Whether a load branch contains a harmonic source can be determined one branch at a time by open-circuit tests [7,8]. By observing whether the current $I_{j(k)}$ flows into or out of the load branch, harmonics bilateral energy measurement is achieved.
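To make the bookkeeping of Eqs. (1)–(3) concrete, the following Python sketch computes, for one harmonic, the node voltage and the per-branch currents and powers under purely illustrative source voltages and impedances; it is not the paper's simulation, and the sign of the inflow power is used here only to flag an emitting (harmonic-source) branch.

```python
def node_voltage(U_sources, Z_branches, Z_s):
    """Eqs. (1)/(4): node voltage U_ab for one harmonic when the listed branch
    sources act, obtained from the nodal equation of Figure 2."""
    num = sum(U / Z for U, Z in zip(U_sources, Z_branches))
    den = 1.0 / Z_s + sum(1.0 / Z for Z in Z_branches)
    return num / den

def branch_measurement(U_sources, Z_branches, Z_s, R_branches):
    """Eqs. (2)-(3): branch currents, the resistive power of Eq. (3), and the
    power flowing from node a-b into each branch (negative = branch emits)."""
    U_ab = node_voltage(U_sources, Z_branches, Z_s)
    I = [(U_ab - U) / Z for U, Z in zip(U_sources, Z_branches)]
    P_R = [R * abs(i) ** 2 for R, i in zip(R_branches, I)]        # Eq. (3)
    P_in = [(U_ab * i.conjugate()).real for i in I]               # direction of flow
    return I, P_R, P_in

# illustrative 3rd-harmonic case: branch 1 contains a 10 V harmonic source
U3 = [10 + 0j, 0j]                      # branch harmonic-source voltages
Z3 = [1 + 3j, 5 + 0j]                   # branch impedances at this harmonic
I, P_R, P_in = branch_measurement(U3, Z3, Z_s=0.2 + 0.6j, R_branches=[1.0, 5.0])
print(P_in)   # the source branch shows negative inflow power (it emits harmonics)
```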

3 The Digital Simulation of Harmonics Bilateral Energy Measurement

The simplified model is shown in Figure 4.

[Figure 4 circuit diagram: source US = 100 sin(2πft) with inductance L = 0.01 H, resistors R1 = 0.5 Ω, R2 = 10 Ω, R3 = 15 Ω, R4 = 1 Ω, branch currents IR1–IR4 and the voltage Ux.]

Figure 4. The circuit of bilateral energy measurement

Us represents the system voltage source, Us = 100 sin(2πft), f = 50 Hz, R1 = 0.5 Ω, R2 = 10 Ω, R3 = 15 Ω, R4 = 1 Ω, L = 0.01 H. The results are shown in Table 1, which gives the magnitudes of the electrical quantities, and Table 2, which gives their phase angles. Table 3 gives the active power consumed by the model at the fundamental and at each harmonic component.
The active power emitted by the fundamental-wave source is equal to the active power consumed by all linear and non-linear elements. The harmonic active power emitted by the nonlinear element is equal to the harmonic active power consumed by the linear elements.

Table 1. Magnitudes of the electrical quantities

Harmonic 0 1 2 3 4 5 6 7

Us(V) 0.00 100.00 0.00 0.00 0.00 0.00 0.00 0.00


Ux(V) 23.84 54.84 16.83 1.97 3.56 1.95 1.68 0.96
UR4(V) 16.31 25.85 11.52 1.34 2.44 1.33 1.15 0.66
UL(V) 7.53 80.44 5.32 0.62 1.13 0.61 0.53 0.30
IR1(A) 15.06 39.20 10.63 1.24 2.25 1.23 1.06 0.61
IR2(A) 0.75 8.04 0.53 0.06 0.11 0.06 0.05 0.03
IR3(A) 0.50 5.36 0.35 0.04 0.08 0.04 0.04 0.02
IR4(A) 16.31 25.85 11.52 1.34 2.44 1.33 1.15 0.66

Table 2. Phase angles of the electrical quantities (rad)

Harmonic 0 1 2 3 4 5 6 7

Us 0.00 -1.57 0.00 0.00 0.00 0.00 0.00 0.00
Ux 3.14 -1.50 -0.27 -3.07 -0.25 -2.86 1.03 0.38
UR4 0.00 -1.67 2.87 0.07 2.89 0.28 -2.12 -2.76
UL 3.14 -1.56 -0.27 -3.07 -0.25 -2.86 1.03 0.38
IR1 3.14 1.51 -0.27 -3.07 -0.25 -2.86 1.03 0.38
IR2 3.14 -1.56 -0.27 -3.07 -0.25 -2.86 1.03 0.38
IR3 3.14 -1.56 -0.27 -3.07 -0.25 -2.86 1.03 0.38
IR4 0.00 -1.67 2.87 0.07 2.89 0.28 -2.12 -2.76

Table 3. Active power consumed at the fundamental and at each harmonic component

Harmonic 0 1 2 3 4 5 6 7 Sum(W)

PR1(W) 113.36 384.21 28.25 0.39 1.27 0.38 0.28 0.09 528.22

PR2(W) 5.67 323.56 1.41 0.02 0.06 0.02 0.01 0.00 330.76

PR3(W) 3.78 215.71 0.94 0.01 0.04 0.01 0.01 0.00 220.51

PUS(W) 0.00 -1956.49 0.00 0.00 0.00 0.00 0.00 0.00 -1956.49

PR4(W) 266.08 334.20 66.31 0.90 2.97 0.89 0.66 0.22 672.23

PL(W) -388.89 698.82 -96.91 -1.32 -4.34 -1.30 -0.97 -0.32 204.77

Add(W) 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

At the fundamental frequency, the harmonic source absorbs fundamental-wave power from the system source and then converts it into DC energy and harmonic energy [9,10]. As a result, the energy cost actually borne by the harmonic source is less than the cost it should bear, while the cost borne by the linear elements is higher than they should bear, which leads to unfair energy billing. With the measurement method of harmonics bilateral energy, the power emitted or absorbed by each branch at the fundamental and at each harmonic is detected, the harmonic source is made to bear the responsibility for its harmonic pollution, and harmonics bilateral energy metering is achieved.
Therefore, the results of this analysis verify the correctness of the measurement method of harmonics bilateral energy and deepen the study of harmonics energy metering. This method not only separates the fundamental-wave power and the harmonic power, but also refines the analysis of each harmonic load power. Meanwhile, it identifies the harmonic sources in the power system and traces each harmonic power to its source [11,12], quantifies the harmonic power that each harmonic source feeds back to the power system, and clarifies the harmonic power that each harmonic source delivers to each branch.

4 Conclusion

From the analysis results of this paper, the following conclusions can be drawn:
1. In the process of harmonics bilateral energy measurement, the existing measurement method that calculates the user's active power from the total waveform distortion is abandoned. The fundamental wave and the harmonics are separated, the power of each harmonic load is analyzed in detail, the harmonic sources in the system are identified, each harmonic power is traced back to its source, and the accuracy of energy measurement under harmonic influence is improved.
2. In the measurement method of harmonics bilateral energy, when the power consumption of a measured branch at some harmonic is negative, that branch contains a harmonic source.
3. At each harmonic, the algebraic sum of the power consumption of all elements over all branches is 0.
4. The energy emitted by a harmonic source is absorbed and transformed from part of the energy of the system voltage source. Non-linear loads consume fundamental power from the system and feed harmonic power back into it; linear loads are forced to absorb this harmonic power, which comes from the various harmonic sources of the system.

Acknowledgment: This research is financially supported by the Scientific and Technological Project of State Grid Hubei Electric Power Company, Xiangyang Power Company, State Grid ERP code 5215D01502QM.

References
[1] Chen Haozhong. Power Quality. Beijing: Tsinghua University Press, 2006.
[2] George J. W. Power System Harmonics: Basic Principles, Analysis and Filter Design. Beijing: Mechanical Industry Press, 2010.
[3] Yang Jintao, Le Jian, Wang Ni, Liu Kaipei. Measurement error of energy measurement systems under harmonic background. Automation of Electric Power Systems, 2015, 13: 144-150.
[4] Luo Yaqiao, Hu Chong. Analysis of the influence of harmonics on electric energy metering. Electric Power Automation Equipment, 2009, 05: 130-132.
[5] Wen He, Ten Saosheng, Hu Xiaoguang, Wang Yong, Zen Bo. The improvement and application of energy measurement methods in the presence of harmonics. Journal of Scientific Instrument, 2011, 01: 157-162.
[6] Yao Li. Energy metering under the influence of harmonics. Electrical Measurement and Instrumentation, 2005, 10: 24-27+14.
[7] Tong Xiangqian, Xue Junyi. Electricity measurement for users considering harmonic pollution. Automation of Electric Power Systems, 2002, 22: 53-55.
[8] Zhao Wei, Peng Hongliang, Sun Weiming, Le Jian. A power metering scheme under harmonic conditions based on quantitative analysis of measurement error. Automation of Electric Power Systems, 2015, 12: 1-125+151.
[9] Sun Hongwei, Yu Xijuan, Wang Dawei, Gao Shuping. The influence of harmonics on electric energy metering and its simulation. Power Electronics Technology, 2006, 04: 123-126.
[10] Shen Saodong, Wei Xing. A simulation study of the impact of harmonics on active power measurement. Electric Power Automation Equipment, 2008, 02: 54-57.
[11] Qiu Taoxin. Research on the influence of power harmonics on electric energy metering. South China University of Technology, 2012.
[12] Li Xiang. Research on power harmonic source identification methods. Chongqing University, 2014.
Long-fei WANG*, Wei ZHANG, Xiang-dong CHEN
Traffic Supervision System Using Unmanned Aerial
Vehicle based on Image Recognition Algorithm
Abstract: As traffic pressure keeps increasing, the Intelligent Transportation System (ITS) has been developing rapidly, and the Traffic Supervision System (TSS) is a crucial part of the ITS. In this article, an improved TSS using an unmanned aerial vehicle (UAV) with a fixed camera is proposed. The real-time traffic video captured by the camera and transferred over WiFi is first processed on a personal computer with the Adaboost algorithm, which recognizes vehicles automatically; then, according to the density of the vehicles, different strategies are taken to relieve the congestion. At the same time, mobile phones and personal computers can receive the accurate location and flight envelope of the aerial vehicle via GPS. In a test under real conditions, this proposal improves the existing mode of transferring traffic information and shows clear advantages in monitoring traffic and in preventing and controlling traffic congestion.

Keywords: Intelligent Transportation System, Adaboost algorithm, Unmanned aerial vehicle

1 Introduction

With the rapid development of society, traffic flow has increased to the extent that traffic accidents significantly affect the safety of public traffic. The ITS has been proposed to address this problem, and it includes four parts: the Vehicle Control System (VCS), the Vehicle Management System (VMS), the Traffic Supervision System (TSS) and the Travel Information System (TIS). Among them, the TSS is particularly important because it is the foundation of the other systems and the source of traffic data. At present, although road monitoring systems using fixed cameras are widely used for TSS, they have some serious deficiencies:
1. The camera angle is fixed, which greatly limits the field of view.
2. Comprehensive supervision and management come at a high cost.
3. Full coverage of the traffic conditions along a road is hard to achieve.
4. A TSS based on fixed cameras lacks intelligent processing and needs manual intervention.
5. Delays in time and space cause the optimal moment for intervention to be missed.

*Corresponding author: Long-fei WANG, College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China, E-mail: awanglf@163.com
Wei ZHANG, College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China
Xiang-dong CHEN, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China

Unmanned aerial vehicles, non-manned aircraft operated by radio remote control and onboard control programs, have already achieved great success in military and civil fields. In the military, owing to their high survivability and all-day detection capability, UAVs are used in electronic interference, anti-radiation attacks and other electronic warfare missions. In civilian areas, owing to their flexible flight, precise positioning, small size and high security, UAVs are widely applied in agricultural condition supervision [1], express delivery systems [2], natural disaster monitoring [3], intelligent transportation and other fields. Specific examples are shown in Figure 1 below.

Figure 1. Specific examples in using the UAV

Professor Coifman used an aircraft system manufactured by MLB and a GIS platform to monitor the road's level of service (LOS), annual average daily vehicle traffic (AADT) and other parameters [4]. Professor Hickman's research team used aerial digital video to acquire intersection queue lengths and, in turn, examined the aerial video processing results [5]. Researcher Liu Xiaofeng carried out corresponding research on path planning, traffic information collection and traffic incident detection [6].
All of the studies above achieved notable results; however, none of them combined image recognition technology with an intelligent monitoring system while using the platform provided by the UAV. The significance of this article is to realize low cost and a deeper degree of intelligence, making the system efficient and effective. Practice has proved that the system is conducive to the timely prevention of traffic congestion and to real-time response.
In this paper, we first introduce the overall design scheme, including the target functions and the system's workflow. Then we state the principles and steps of the core algorithm. Finally, we evaluate the merits and demerits of the system.

1.1 The Overall Design Scheme

Restricted by the power and energy capacity of the UAV, the system is currently unable to achieve all-day cruising flight. An alternative is to use the UAV during rush hours or over the main roads of a city. In addition, the system is suited to special natural terrain and sparse road networks, compensating for the absence of a TSS there [7].
1) Target functions
a) A UAV with a fixed camera cruises along a scheduled path, and receives and transfers the real-time traffic condition.
b) The Haar-like feature based Adaboost algorithm [8] computes the number of vehicles (m).
c) The traffic density (n) is computed as the ratio of the total area occupied by the vehicles (m × S0) to the total area (S) of the vehicle lane in the video frame:

n = (m × S0) / S,  0 ≤ n ≤ 1

A schematic diagram is shown in Figure 2, where S represents the total area of the lane in the video (assuming the aircraft flies at a constant height, so the lane area in the video is constant) and S0 is the average area of a vehicle.

Figure 2. The schematic diagram



1.2 Work flow of The System

The traffic supervision system using an unmanned aerial vehicle based on an image recognition algorithm consists of two parts: data acquisition and intelligent management. The whole system is pictured in Figure 3.

Figure 3. The workflow of the system

During data acquisition, the camera fixed on the UAV shoots real-time traffic video, and the video is simultaneously transferred to the personal computer. This function relies on a wireless image sensor, which adopts Orthogonal Frequency Division Multiplexing (OFDM) and is based on IEEE 802.11a; its maximum transmission rate can reach 54 Mbit/s.
In the intelligent management part, the system first computes the vehicle density (n) of each video frame. Then, according to the value of n, different countermeasures are taken: for example, when there is slight congestion (N1 < n < N2), a warning is sent to the traffic police department; when the traffic condition worsens (N2 < n < 100%), an emergency aircraft is dispatched to the location to shoot details and the traffic police receive a notification on their personal equipment to proceed there.

To set the values of N1 and N2, many factors have to be considered, including the average traffic flow, the road condition, the climate and the topography. Consequently, further research is needed to obtain accurate values of N1 and N2 for different road configurations. The concise process of data acquisition and intelligent management is shown in Figure 4.

Figure 4. The process of data acquisition and intelligent management
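The decision logic described above can be summarized in a short sketch. The following Python fragment is an illustrative sketch only, not the authors' implementation; the numeric values of N1, N2 and of the areas are placeholders, since the paper leaves their calibration to further study.

def traffic_density(m, s0, s):
    # Density n = m * S0 / S, clipped to the range [0, 1]
    return max(0.0, min(1.0, m * s0 / s))

def choose_action(n, n1=0.4, n2=0.7):
    # Map a density value to one of the countermeasures of Figure 4
    if n < n1:
        return "normal: keep cruising"
    elif n < n2:
        return "slight congestion: warn the traffic police department"
    else:
        return "severe congestion: dispatch emergency aircraft and notify the police"

# Hypothetical values for one video frame
n = traffic_density(m=18, s0=250.0, s=9000.0)
print(n, choose_action(n))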

2 The core algorithm of the system

2.1 Concise Ideas of Adaboost

Adaboost improves the Boosting algorithm based on the PAC model. The algorithm points out that, for a weak classifier, as long as enough samples are given, it can be boosted into a strong classifier. Adaboost adjusts the error rate automatically according to the feedback of the weak learner, which means that Adaboost does not require prior knowledge about the performance of the weak learner.

2.2 Feature Extraction

For real-time identification of vehicles, a feature-based method is much more efficient than a pixel-based one. The Haar-like feature is a typical feature representation, with the five rectangle features shown in Figure 5.

Figure 5. The five types of Haar-like rectangle features

The characteristic value is the difference between the sum of the pixels in the white parts and the sum of the pixels in the black parts. To compute it efficiently, the integral image is used: the integral image at a point is defined as the sum of all pixels in the rectangle between the origin and that point. For example, in Figure 6, the integral image of point A is the sum of all pixels in the grey rectangle region.

Figure 6. Diagram of one integral image

Take the rectangle feature of Figure 5a as an example. As shown in Figure 7, the rectangle characteristic value equals the sum of the pixels in the white part minus the sum of the pixels in the black part, i.e. [(i4 + i1) − (i2 + i3)] − [(i6 + i3) − (i5 + i4)], where ij denotes the integral image value at point j. Therefore, once the coordinates of the corresponding points are known, the characteristic value of a rectangle feature of any type can be obtained by simple additions and subtractions.
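As an illustration of the integral-image computation just described, the following Python sketch builds a summed-area table and evaluates one two-rectangle Haar-like feature of the type shown in Figure 5a; the window size and coordinates are arbitrary examples rather than values from the paper.

import numpy as np

def integral_image(img):
    # Summed-area table: ii[y, x] = sum of img over rows 0..y and columns 0..x
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, height, width):
    # Sum of the pixels inside a rectangle from four integral-image lookups
    def at(y, x):
        return 0 if y < 0 or x < 0 else ii[y, x]
    bottom, right = top + height - 1, left + width - 1
    return at(bottom, right) - at(top - 1, right) - at(bottom, left - 1) + at(top - 1, left - 1)

def haar_two_rect(ii, top, left, height, width):
    # White half minus black half of a horizontal two-rectangle feature
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return white - black

window = np.random.randint(0, 256, (20, 20)).astype(np.int64)  # hypothetical 20x20 child window
ii = integral_image(window)
print(haar_two_rect(ii, 2, 2, 6, 8))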
For a 20×20 child window of the detector, there are 78,460 rectangle features, and the characteristic distributions of vehicles and non-vehicles are shown in Figure 8. The most important step is to choose suitable rectangle features which can distinguish vehicles from non-vehicles.

Figure 7. Computing a rectangle characteristic value with the integral image

(a)

(b)
Figure 8. The characteristic distribution of the vehicles in(a) and the non-vehicles in(b)

Through the Adaboost algorithm, the most discriminative value for separating vehicles from non-vehicles is the point where the curve intersects the x-axis in Figure 8. In Figure 8a, the value marked by the short yellow segment is the optimal value for identifying vehicles; likewise, the best value for recognizing non-vehicles is highlighted in Figure 8b. The strong classifier works according to these key characteristic values.

2.3 The Strong Classifier

Step 1: Given a set of training samples (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn), where yi = 1 denotes a vehicle and yi = 0 denotes a non-vehicle (i = 1, 2, 3, ..., n).

Step 2: Weight initialization: each sample receives the average weight, w_{i,1} = 1/n.

Step 3: For k = 1, 2, ..., N:
a) Normalize the weights: q_{i,k} = w_{i,k} / Σ_{j=1}^{n} w_{j,k}.
b) Select the optimal weak classifier h_k(x) with error rate ε_k.
c) Adjust the weights: w_{i,k+1} = w_{i,k} · ε_k / (1 − ε_k).

The final strong classifier is
H(x) = 1 if Σ_{k=1}^{N} λ_k h_k(x) ≥ (1/2) Σ_{k=1}^{N} λ_k, and H(x) = 0 otherwise.

Adaboost adjusts the error rate through feedback. After a large number of iterations, a strong classifier with good classification performance is obtained.

2.4 The Optimal Weak Classifier


A weak classifier based on the ith feature is defined as
h_i(x, f_i, p_i, θ_i) = 1 if p_i f_i(x) < p_i θ_i, and 0 otherwise,
where p_i is the inequality direction indicator, f_i(x) is the value of the ith feature and θ_i is the threshold. Its weighted error rate is
ε_{f_i} = Σ_{j=1}^{N} q_j | h_i(x_j, f_i, p_i, θ_i) − y_j |.
The optimal weak classifier is the one with ε_t = min_{f_i} {ε_{f_i}}, and its coefficients f_t, p_t, θ_t are found by exhaustive search. As a result, the optimal weak classifier is h_t = h(x, f_t, p_t, θ_t). The overall algorithm flow chart is shown in Figure 9; a code sketch combining this search with the boosting loop of Section 2.3 follows the figure.

Figure 9. The algorithm flow chart
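The following Python sketch combines the boosting loop of Section 2.3 with the exhaustive stump search of Section 2.4. It is only a minimal illustration: the weight update follows the common Viola-Jones convention of down-weighting correctly classified samples, lambda_k is taken as log((1 - eps_k)/eps_k), and random numbers stand in for the Haar feature values; these details are assumptions where the text leaves them implicit.

import numpy as np

def best_stump(F, y, q):
    # Exhaustive search over features, thresholds and polarity for the
    # weighted-error-minimising weak classifier h(x) = [p * f(x) < p * theta]
    best = (np.inf, 0, 0.0, 1)                    # (error, feature index, theta, p)
    for j in range(F.shape[1]):
        for theta in np.unique(F[:, j]):
            for p in (1, -1):
                h = (p * F[:, j] < p * theta).astype(int)
                err = np.sum(q * np.abs(h - y))
                if err < best[0]:
                    best = (err, j, theta, p)
    return best

def adaboost(F, y, rounds=10):
    w = np.full(len(y), 1.0 / len(y))             # Step 2: uniform initial weights
    stumps, alphas = [], []
    for _ in range(rounds):
        q = w / w.sum()                           # weight normalisation
        err, j, theta, p = best_stump(F, y, q)
        err = min(max(err, 1e-10), 1 - 1e-10)
        beta = err / (1.0 - err)
        h = (p * F[:, j] < p * theta).astype(int)
        w = w * np.where(h == y, beta, 1.0)       # shrink weights of correctly classified samples
        stumps.append((j, theta, p))
        alphas.append(np.log(1.0 / beta))
    return stumps, np.array(alphas)

def strong_classify(F, stumps, alphas):
    votes = sum(a * (p * F[:, j] < p * theta).astype(int)
                for (j, theta, p), a in zip(stumps, alphas))
    return (votes >= 0.5 * alphas.sum()).astype(int)

rng = np.random.default_rng(0)                    # random stand-in for Haar feature values
F = rng.normal(size=(200, 30))
y = (F[:, 0] + 0.2 * rng.normal(size=200) > 0).astype(int)
stumps, alphas = adaboost(F, y)
print((strong_classify(F, stumps, alphas) == y).mean())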

3 Test of the system

The UAV is set to cruise mode and flies along the specified path. On the personal computer, the vehicle density and the corresponding countermeasures are displayed. Finally, the location is locked based on GPS.

3.1 Test of the Data Transmission

Using the Mission Planner software, the status and malfunction information can be displayed. The operation interface is shown in Figure 10. At the bottom left of the interface, the real-time status of the UAV is shown, including height, speed, angle, signal intensity and target distance; on the right, the GPS positioning information is shown.

Figure 10. Operation interface in personal computer

3.2 Identification Interface of the Vehicles

An interface that can adjust the video format, programmed in Visual Studio, is shown in Figure 11. From the interface, the density and number of vehicles can be clearly read. Then, according to the density, different actions are taken.

Figure 11. Identification interface of the vehicles

3.3 Orientation Interface of Mobile Phones

Using Droid Planner, the traffic police department can learn the accurate location where severe congestion or an accident has happened, which is convenient and timely. The interface is shown in Figure 12.

Figure 12. The orientation interface

4 Conclusion

The intelligent transportation supervision system, designed for monitoring traffic conditions and using a quad-rotor UAV with an image recognition algorithm for vehicles, improves the intelligence of the previous system. The improved system is of great significance for enhancing the efficiency of monitoring traffic conditions, relieving traffic pressure and preventing traffic congestion. After being debugged and improved, the system is now able to realize the expected functions and actions, so that heavy traffic can be greatly ameliorated.
There are still some deficiencies in this system. (1) The parameters of different roads are hard to represent with a uniform formula because of differing road conditions and traffic capacities. (2) Inadequate positive and negative samples restrict the accuracy of the Adaboost algorithm. (3) In rainy, snowy or other inclement weather, the system will be affected.
In order to further enhance the reliability and security of the system, the next steps will focus on the following issues. First of all, a unified standard to measure road traffic conditions needs to be established, considering the differences in flight altitude and the influence of different roads. Secondly, the training sample database needs to be broadened for better detection accuracy. Thirdly, attention should be paid to the quality of videos on rainy, snowy and foggy days. Finally, the transmission distance of information and the scope of supervision should be studied further.

Acknowledgements: This research was supported by the Project of Natural Science Research of Jiangsu University under grant No. 15KJB510023.

References
[1] Jia Pengyu, Feng Jiang, Yu Libao. The application of small UAVs in crop condition monitoring[J]. Journal of Agricultural Mechanization Research, 2015(4): 261-264.
[2] Deng Yang, He Jun, Li Qi. The research and design of an automated UAV express system. Computer CD Software and Application, 2014(12): 102-104.
[3] Li Yun, Xu Wei, Wu Wei. The application and research of UAV technology in disaster monitoring[J]. Science of Disaster, 2012, 29(3).
[4] Coifman B, McCord M, Mishalani R G, Iswalt M. Roadway traffic monitoring from an unmanned aerial vehicle[J]. IEE Proceedings - Intelligent Transport Systems, 2006, 153(1): 11-20.
[5] Agrawal A, Hickman M. Automated extraction of queue lengths from airborne imagery[C]. International IEEE Conference on Intelligent Transportation Systems, 2004: 297-302.
[6] Liu Xiaofeng, Chang Yuntao, Wang Xun. A UAV allocation method for traffic surveillance in sparse road networks[J]. Journal of Highway and Transportation Research and Development, 2012, 29(3).
[7] Li Yun, Xu Wei, Wu Wei. Research and application of disaster monitoring UAV technology[J]. Disaster Science, 2011, 26(1): 138-143.
[8] Wen Xuezhi, Fang Wei, Zheng Yuhui. An algorithm based on Haar-like features and improved AdaBoost classifier for vehicle recognition[J]. ACTA ELECTRONICA SINICA, 2011, 39(5): 1121-1126.
Yu-fei LI*, Ya-yong LIU, Lu-ning XU, Li HAN, Rong SHEN, Kun-quan LU
Improving the Durability of Micro ER Valves for
Braille Displays Using an Elongational Flow Field
Abstract: The restrictions of the Braille standard give the micro ER valves for Braille displays the characteristics of a micro channel and micro flux, which leads to poor durability of the micro electro-rheological (ER) valves. To address this problem, several comparison experiments were designed and the effect of an elongational flow field on the durability of micro ER valves was studied in this paper. In the experiment without the elongational flow field, when the external electric field was removed, the nano-particles that had aggregated under the external electric field over a long period in the polar molecule ER fluid could not restore their original dispersed state by their Brownian motion alone, which led to blockage of the micro ER valve, and the independent recovery time of the valve became longer. Moreover, the pressure drop of the valve decreased greatly after several cycles of experiments, indicating operational instability of the micro ER valve. In the experiment with the elongational flow field, the micro ER valve ran normally. It was proved that the action of the elongational flow field keeps micro ER valves operating stably and meets the durability demand of Braille displays.

Keywords: Micro ER Valve; Durability; Elongational Flow Field; Convergent Channel; Nano-particles Redispersion

1 Introduction

The limited content and high price of existing Braille displays cannot meet the self-learning demand of visually impaired groups. Electro-rheological (ER) fluids are special suspensions consisting of large amounts of solid particles dispersed in certain carrier liquids. Their viscosities change with the external electric field strength. Their liquid nature makes them suitable for application in micro valves. The micro channel and small volume of micro ER valves [1-7], and the high yield strength of polar molecule ER fluids, make it possible to realize a full-page Braille display at low cost [8]. Each dot is controlled by a micro ER valve, so

*Corresponding author: Yu-fei LI, Nano-micro Fabrication Technology Department, Institute of


Electrical Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences,
Beijing 100190, China, E-mail: liyufei@mail.iee.ac.cn
Ya-yong LIU, Lu-ning XU, Li HAN, Nano-micro Fabrication Technology Department, Institute of Electri-
cal Engineering, Chinese Academy of Sciences, Beijing 100190, China
Yu-fei LI,Ya-yong LIU, University of Chinese Academy of Sciences, Beijing 100190, China
Rong SHEN, Kun-quan LU, Beijing National Laboratory for Condensed Matter Physics, Key Laboratory
of Soft Matter Physics, Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China

there are nearly 6 thousand valves compactly placed within the size of an A4 page [9-11]. The stability of the micro ER valves affects the durability and the user experience of Braille displays. The stability of a micro ER valve comprises a short recovery time and a stable pressure drop. The recovery time determines the refresh speed of the dots of the Braille display, and without stability the micro ER valves tend to get out of control.
The restrictions of the Braille standard limit the flux of ER fluid to only 1-2 microliters for displaying one Braille dot. It is necessary, at such a small flow, to eliminate the effect of nano-particle aggregation on the recovery time and on the instability of the pressure drop of the valves. Compared with a shear flow field, an elongational flow field is more efficient in dispersing nano-particles [12-17]. In this paper, the effect of the elongational flow field on the durability of micro ER valves is studied experimentally. It was proved that micro ER valves subjected to the elongational flow field have good stability and meet the durability demand of Braille displays.

2 Materials, instruments and equipment

The Ca-Ti-O polar molecule ER fluid used in the experiment was developed by the Institute of Physics of the Chinese Academy of Sciences. The ER fluid is composed of Ca-Ti-O nano-particles and hydraulic oil, with a nano-particle volume fraction of 30%. The operational electric field of the micro ER valves is provided by a high-voltage DC power supply (DW-P303-1AC) made by the Tianjin Dongwen high-voltage power supply factory, with an output voltage range of 0 to 30 kV. The inlet and outlet pressures of the valves are measured by pressure sensors (JYB-KO-PW2GZG) produced by Beijing Kunlun Coast Sensing Technology Company, with a measuring range of 0 to 500 kPa. Real-time data acquisition is performed by a data acquisition device (NI PXIe-6361, X Series) produced by National Instruments.

3 Experimental device and testing method

Figure 1 shows the micro ER valve test device, in which a piston arranged at the bottom of the device serves as the pressure source. The device is filled with ER fluid and divided by the valve channel into two chambers, which are connected to the upper and lower pressure sensors respectively.
The pressure drop (Δp) of the valve is obtained as the difference between the upper and lower pressure values (Δp = pi − po), and a DC electric field is supplied by two electrodes.
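A minimal sketch of how the pressure drop and the recovery time could be extracted from the two sensor channels logged by the data acquisition device is given below; the sampling rate, the synthetic record and the 10% settling criterion are assumptions made for illustration only.

import numpy as np

def pressure_drop(p_in, p_out):
    # Delta-p time series from the upper (inlet) and lower (outlet) sensor channels
    return np.asarray(p_in) - np.asarray(p_out)

def recovery_time(dp, t_off, fs=100.0, settle_fraction=0.1):
    # Time in seconds, after the field is removed at sample index t_off, until the
    # pressure drop first falls below settle_fraction of its value at switch-off
    target = settle_fraction * dp[t_off]
    for k in range(t_off, len(dp)):
        if dp[k] <= target:
            return (k - t_off) / fs
    return None        # the valve did not recover within the record (e.g. blocked channel)

# Hypothetical record: a 100 kPa drop decaying exponentially after switch-off at t = 5 s
fs = 100.0
t = np.arange(0, 20, 1 / fs)
dp = np.where(t < 5.0, 100.0, 100.0 * np.exp(-(t - 5.0) / 1.5))
print(recovery_time(dp, t_off=500, fs=fs))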

Figure 1. Schematic of micro ER valve testing device

Figure 2 shows the geometric parameters of the convergent flow channel of the ER valve used to generate an elongational flow field. The inlet and outlet of the flow channel have the same angle (θ1 = θ2 = 118°) and the same height (h1 = h2); in addition, the electrode gap (h = 1 mm) equals the width of the flow channel, and the electrode length (L) is 11 mm. The elongational flow field is produced at the inlet and outlet through the change of cross-sectional areas.

Figure 2. Geometric parameters of convergent flow channel of the micro ER valves

Two valve devices with the same parameters were prepared, named the No. 1 valve and the No. 2 valve, and the experimental procedure is as follows:
1. The upper sensor of the No. 2 valve was removed and an elongational flow field was applied to its flow channel through the reciprocating motion of the piston, while the No. 1 valve was left untreated. The upper sensor was then remounted on the device.
2. The chamber pressures were adjusted to the standard atmospheric pressure.

3. An external electric field (Ee = 2 kV/mm) was applied to both valves for 12 hours.
4. A pressure drop (Δp) of 100 kPa was applied to each valve separately.
5. The electric field was removed and the recovery process of each valve was recorded.
6. Procedures 1 to 5 were repeated for several cycles of experiments.

4 Experimental results and discussion

Ideally, the dispersed phase quickly restores its original dispersed state by Brownian motion shortly after the electric field is removed. In fact, the recovery of the micro ER valve without the action of the elongational flow field is more complex, as shown by the dashed lines in Figure 3. At the beginning of the experiments, the recovery time of the No. 1 valve became longer after several cycles, and the recovery trend of the valve gradually transformed from a step style (the dashed line in Figure 3a) to a slope style (the dashed line in Figure 3d), indicating that the No. 1 valve had become blocked. In contrast, the recovery trend of the valve subjected to the elongational flow field (the No. 2 valve) showed a step style in all cycles, and its recovery time was short and stable. The blockage occurred in the condition without the elongational flow field because the nano-particles agglomerated under the electric field and could not return to their initial dispersed state by Brownian motion alone after the field was removed, whereas the elongational flow field is able to re-disperse the agglomerated nano-particles, so the No. 2 valve ran normally and did not block.
In the later experiments, however, as shown in Figure 4a, the recovery trend of the No. 1 valve gradually transformed back into a step style as the number of cycles increased, and its recovery time even became shorter than that of the No. 2 valve (arrowed in Figure 4). In addition, the pressure drop of the No. 1 valve was very unstable, as shown in Figure 5: it decreased greatly under the same operational electric field strength during several cycles of experiments, dropping to less than 60 kPa (the dashed line in Figure 5), while the pressure drop of the No. 2 valve under a 2 kV/mm electric field stayed consistently around 100 kPa and was very stable (the solid line in Figure 5).
There are two possible explanations for these phenomena in the later stage of the experiments. Firstly, sedimentation of the agglomerated nano-particles reduces the volume fraction of nano-particles in the channel, which weakens the pressure drop and shortens the recovery time of the micro ER valve under the operational electric field strength. Secondly, nano-particles adhere to the electrodes, which generates a reverse electric field (Er) and diminishes the total electric field (Et = Ee − Er), so the pressure drop of the valve goes down. There are at present no suitable methods to further confirm these microscopic changes. In any case, this experiment shows that micro ER valves without the action of the elongational flow field have a durability problem after working for a long period and cannot continue to work normally. The positive effect of the elongational flow field on the durability of the micro ER valves is obvious: micro ER valves subjected to the elongational flow field have good stability and meet the durability demand of Braille displays.

Figure 3. Recovery of micro ER valves after removing the electric field in several cycles at the
beginning of experiments

Figure 4. Recovery of micro ER valves after removing the electric field in the later stage of the experiment (a: No. 1 valve; b: No. 2 valve)

Figure 5. The variations of the pressure drops of the valves in the later stage of experiment

5 Conclusions

In this study, a valve testing device was designed and fabricated, and the effects of the elongational flow field on the stability and restorability of micro ER valves were investigated. Nano-particles agglomerated under the external electric field cannot restore their initial dispersed state by Brownian motion alone, and micro ER valves without the action of the elongational flow field showed poor durability, while micro ER valves with the action of the elongational flow field remained stable after working for a long period, keeping good operational durability. It can be concluded that the introduction of a convergent flow channel, which produces an elongational flow field, has a positive effect on the stability of micro ER valves and solves the problem of their operational durability. The effects of the geometric parameters of convergent flow channels on the strength of the elongational flow field will be studied in the future.

Acknowledgements: This work was funded by the Chinese Academy of Sciences (KGCX2-YW-619) and the National Natural Science Foundation of China (Grant No. 11574355).

References
[1] Tu F. Recent progress and application of electrorheological fluids[J]. Materials Review, 2014, 28(6): 66-8.
[2] Qu X, Zhao H, Xu G, Liu J, Zhao H. The research of an ER single-channel flat valve[J]. Hydraulic and Pneumatic, 2010(01): 7-9.
[3] Wei K, Meng G, Zhu S. Application of electrorheological fluids to fluid power control[J]. Journal of Functional Materials and Devices, 2005(01): 97-102.
[4] Zhu S, He Z, Qu L. A comparison study of the flow control performance of two types of ER valve[J]. Hydraulic Pneumatic and Sealing, 2011(09): 11-3+70.
[5] Xize N, Weijia W, Yi-Kuen L. Micro valves using nanoparticle-based giant electrorheological fluid. Solid-State Sensors, Actuators and Microsystems, Digest of Technical Papers, TRANSDUCERS '05, The 13th International Conference, 5-9 June 2005.
[6] Yokota S, Kondoh Y, Ishihara K, Otsubo Y, Edamura K. A fluid control valve based on the viscosity increase effects of dielectric fluids caused by electrodes planted with hair-like short fibers (proposition of a fiber planted ER valve)[J]. JSME Int J Ser C-Mech Syst Mach Elem Manuf, 2003, 46(4): 1538-46.
[7] Park JH, Yoshida K, Yokota S. Micro ER valve using homogeneous ER fluids and its application to micro fluid control systems. Industrial Electronics Society, IECON 2000, 26th Annual Conference of the IEEE, 2000.
[8] Lu K, Shen R, Wang X, Sun G, Cao Z, Liu J. Polar molecule type electrorheological fluids[J]. Physics, 2007(10): 742-9.
[9] Xu L, Liu J, Han L. A matrix of valves based on electro-rheological fluid and its application on a multi-line Braille displayer. 2012 International Conference on Applied Materials and Electronics Engineering, AMEE 2012, January 18-19, 2012, Hong Kong: Trans Tech Publications.
[10] Xu L, Han L, Li Y, Shen R, Lu K. A test of a giant ER valve in DC and square wave AC fields with different frequencies. 15th International Conference on Electrorheological Fluids and Magnetorheological Suspensions, Incheon, Korea, 2016.
[11] Xu L, Han L, Li Y, Shen R, Lu K. Operational durability of a giant ER valve for Braille display. 15th International Conference on Electrorheological Fluids and Magnetorheological Suspensions, Incheon, Korea, 2016.
[12] Huang L, Ou X, Wu S. Effect of elongational flow field on dispersion mixing of filled polymer[J]. China Plastics, 2006(09): 53-8.
[13] He G, Yin X, Qu J. Research progress of dispersion mixing of filled polymer based on elongational flow[J]. Plastics, 2010(03): 8-10.
[14] May PA, Moore JS. Polymer mechanochemistry: techniques to generate molecular force via elongational flows[J]. Chem Soc Rev, 2013, 42(18): 7497-506.
[15] Janssen JMH, Meijer HEH. Droplet breakup mechanisms: stepwise equilibrium versus transient dispersion[J]. J Rheol, 1993, 37(4): 597-608.
[16] Kang J, Smith TG, Bigio DI. Study of breakup mechanisms in cavity flow[J]. AIChE J, 1996, 42(3): 649-59.
[17] Bu H, Xiao F, Ou X, Wu S. Influence of elongational flow field on dispersion mixing of PP/nano-CaCO3 filled system[J]. Plastics Industry, 2008(11): 35-8.
Jinxia WU, Fei SONG, Shiyin QIN*
Aircraft Target Detection in Remote Sensing Images
towards Air-to-Ground Dynamic Imaging
Abstract: In this paper, a method for detecting aircraft targets in remote sensing images (RSIs) is presented for air-to-ground (A/G) dynamic imaging. In order to reduce time complexity and achieve high-performance detection with high speed and high accuracy, an approach to extracting candidate regions is proposed based on a circle frequency filter (CFF) with parameter optimization and adaptive hierarchical clustering. Moreover, dense SIFT features are extracted from the candidate regions to build the corresponding BOW model of candidate regions, a detection strategy based on classifying targets from candidate regions is employed, and a classifier is designed and trained with a support vector machine, so that a comprehensive detection algorithm is constructed. A series of experimental results demonstrates the performance advantages of the proposed detection method.

Keywords: aircraft target detection, air-to-ground dynamic imaging, adaptive hierarchical clustering, circle frequency filter

1 Introduction

The detection of valuable geospatial targets in remote sensing images is one of the most fundamental tasks in understanding RSIs, covering ships [1], urban areas [2], airports [3], buildings [4], etc. Aircraft target detection, in particular, has attracted much research and exploration in both military and civil applications. In fact, the image sizes of aircraft targets during A/G dynamic imaging vary continuously with the imaging altitude: they become smaller and their backgrounds become more complex as the imaging height increases. Consequently, aircraft target detection in remote sensing images under A/G dynamic imaging becomes more challenging.
So far, many practical methods have been researched and applied to aircraft target detection. In order to describe complex aircraft objects precisely, object detection is treated as a binary classification problem between target and
*Corresponding author: Shiyin QIN, School of Automation Science and Electrical Engineering, Beihang
University, Beijing, China, 100191, E-mail: qsy@buaa.edu.cn
Jinxia WU, School of Automation Science and Electrical Engineering, Beihang University, Beijing,
China, 100191
Fei SONG, Chinese Institute of Electronics, Beijing, China, 100036

background. Until now, many successful classifiers have been applied, such as boosting [5], template matching [6], support vector machines [7] and neural networks [8]. Among these classifiers, the SVM has also been successfully used in other applications to RSIs [9,10], which inspires us to employ the SVM in our detection algorithm. In addition to the selection of classifiers, how to choose suitable feature descriptors is another key issue. Some robust features have been widely used in the field of target detection in RSIs, such as SIFT [11], HOG [12], gradient features [13], invariant moments [7] and so on. However, a single low-level feature cannot yield satisfactory detection results. Hence, feature fusion, middle-level and deep feature extraction methods have all been promoted and applied to the target detection problem. For example, Zhang et al. presented a generic discriminative part-based model for geospatial object detection by combining appearance features with spatial deformation features and rotation deformation features [14]. A bag-of-words (BOW) model with spatial sparse coding was proposed for target detection [15], in which the SIFT descriptor and corner detectors are employed to generate the dictionary of objects for describing targets with middle-level features. However, that method is slow because of its feature representation based on sparse coding and its detection by sliding window. Afterwards, Zhou et al. improved the detection performance by exploiting a convolutional neural network (CNN) to explore and extract intrinsic features of aircraft targets [16], but deep learning methods perform well only with a large number of samples, whose collection and annotation are tedious and time-consuming.
Therefore, to handle the changing size of aircraft targets and to enhance the detection speed, a candidate region extraction approach is proposed based on the CFF and adaptive hierarchical clustering. Moreover, dense SIFT features are extracted to build the corresponding BOW model of candidate regions, a detection strategy based on classifying targets from candidate regions is employed, and a classifier is designed and trained with a support vector machine, so that a comprehensive detection algorithm is constructed.

2 Analysis of aircraft target characteristics under A/G dynamic imaging
2.1 Resolutions and Target Sizes for Different Altitudes

The A/G imaging procedure is shown in Figure 1. The visual range of the imaging device shrinks and the image sizes of aircraft targets grow as the imaging altitude decreases. In this work, the image size is 320×256 pixels and the spatial resolution is 3.28 m per pixel when the imaging height is 6 km. Our detection problem focuses on detecting aircraft in RSIs generated at imaging altitudes between 2 km and 4 km. Since the spatial resolution is proportional to the height of the dynamic imaging, according to the simple geometric relationship shown in Figure 2, the spatial resolution at different heights can be calculated during the dynamic imaging process.

Figure 1. Comparison of geometric sizes for A/G dynamic imaging

Figure 2. Geometric relationship between spatial resolution and imaging altitude



According to the geometric relationship shown in Figure 2, the following equation can be obtained:

N / 6 = (320 × M) / (320 × 3.28)    (1)

where M represents the spatial resolution of the RSIs when the imaging altitude is N kilometers. From this equation, the spatial resolutions at different imaging altitudes can be obtained, as given in Table 1.

Table 1. Spatial resolution for different altitudes

Imaging altitude (km)          2      3      4      5      6
Spatial resolution (m/pixel)   1.09   1.64   2.18   2.73   3.28

Since the size information of typical aircraft such as transports, bombers and early-warning aircraft is known, the pixel sizes of some typical aircraft at different imaging altitudes can be calculated from the data listed in Table 1. The actual fuselage and wingspan sizes and the pixel sizes of several aircraft targets at an imaging height of 2 km are given in Table 2; size information at other imaging heights can be calculated from Tables 1 and 2. Knowing the size information of multiple kinds of aircraft is beneficial for the rational design of the candidate region extraction algorithm (a brief computational sketch follows Table 2).

Table 2. Size metric for imaging altitude of 2km

Aircraft style Size of fuselage (m) Pixel size Size of wingspan (m) Pixel size

B-1 44.81 41 41.67 38


B-2 21.03 19 52.43 48
B-52 49.59 45 50.45 46
C-5 75.74 69 67.88 62
C-17 53.04 49 51.81 48
C-130 29.79 27 40.41 37
E-2 17.54 16 24.56 23
E-3 46.61 43 44.42 41
A-50 46.59 43 50.0 46
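The following small Python sketch (illustrative only) reproduces the pixel sizes of Table 2 from the physical aircraft dimensions and the spatial resolution of Eq. (1); rounding to whole pixels is an assumption.

def spatial_resolution(altitude_km):
    # Eq. (1): the resolution scales linearly with altitude, 3.28 m/pixel at 6 km
    return 3.28 * altitude_km / 6.0

def pixel_size(length_m, altitude_km):
    # Approximate on-image size of a physical length, rounded to whole pixels
    return round(length_m / spatial_resolution(altitude_km))

# B-1 fuselage (44.81 m) and wingspan (41.67 m) at 2 km: about 41 and 38 pixels, as in Table 2
print(pixel_size(44.81, 2), pixel_size(41.67, 2))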

2.2 Characteristic Analysis of Targets with Different Size

In the process of air-to-ground dynamic imaging, the pixel sizes of aircraft targets and the complexity of the background change as the imaging height varies, as shown in Figure 3. As the imaging height increases, the shape, texture and edges of the aircraft become less apparent, and the background may contain more forests, dwellings and other buildings. To select suitable features, the descriptive capacities of the BOW model and the HOG feature are compared in Table 3, evaluated by the detection rate (DR) on the same aircraft dataset.
In fact, the BOW model based on dense SIFT features has an outstanding descriptive capacity for key points. It is easy to extract key points from large aircraft targets but very hard, or impossible, from small ones. Owing to the prominent performance of the SIFT feature, the BOW model represents large targets better than the HOG feature, but it cannot describe small targets precisely. For small aircraft, the gradient information is predominant, so the HOG feature performs better at higher imaging altitudes. Hence, in this paper, the BOW model is used for feature representation for the detection problem at imaging altitudes between 2 km and 4 km; it is also robust to rotation, size and illumination.

(a) imaging altitude is 4km (b) imaging altitude is 3km (c) imaging altitude is 2km

Figure 3. Illustrative comparison of different size aircraft targets in RSIs

Table 3. Comparison of feature description capacity

Imaging altitude (km)   BOW model   HOG
2                       96.32%      82.8%
3                       94.44%      85.2%
4                       92.37%      88.5%
5                       59%         73.17%
6                       34.1%       70.5%

3 Preprocessing and extraction of candidate regions towards target detection

Target detection methods for remote sensing images based on image segmentation, geometric information or template matching cannot effectively detect complex aircraft targets, especially in the process of air-to-ground dynamic imaging: their detection accuracy declines sharply when the background is complex and the sizes of the aircraft change. Therefore, the aircraft target detection problem is usually regarded as a classification problem between target and background and is solved by learning-based methods, which can achieve satisfactory results. These methods can be realized with a sliding-window approach, but its high computational complexity limits its practical application. In order to accelerate detection, a candidate region extraction method based on a proper CFF and an adaptive hierarchical clustering method is proposed for target detection, as shown in Figure 4.

Figure 4. Candidate region extraction

3.1 Image preprocessing

In the process of dynamic imaging, remote sensing images inevitably contain noise due to the effects of light, clouds, atmospheric refraction and the motion of the imaging equipment. In addition, there are plaques, guide lines and tags in the apron area, and these distractors also affect the detection accuracy. Therefore, the remote sensing images need to be preprocessed to improve the detection performance. Compared with the traditional Gaussian filter, the bilateral filter [17] considers not only the spatial distribution of image intensities but also the similarity between neighbouring pixels. It can smooth the background to reduce noise while preserving the edge information of the original image.
Figure 5 compares the results of median filtering, Gaussian filtering and bilateral filtering. The median filter does not noticeably improve the original image, and Gaussian filtering smooths not only the background but also the aircraft targets. The bilateral filter can smooth the disturbances on the apron effectively, such as guide lines and plaques, and it also mitigates the uneven intensity distribution of the background. Besides, the edge information of the target aircraft is preserved in a way tuned to human perception.

(a) Original image (b) median filtering

(c) Gaussian filtering (d) bilateral filtering

Figure 5. Comparison of smoothing results
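As an illustration of this preprocessing step, the sketch below applies the three filters with OpenCV; the filter parameters and the synthetic test image are placeholders, since the paper does not report the settings that were used.

import numpy as np
import cv2

img = np.random.randint(0, 256, (256, 320), dtype=np.uint8)   # stand-in for one grayscale RSI frame

median    = cv2.medianBlur(img, 5)
gaussian  = cv2.GaussianBlur(img, (5, 5), 1.5)
bilateral = cv2.bilateralFilter(img, d=9, sigmaColor=50, sigmaSpace=50)
# The bilateral result smooths the background while preserving edges, as in Figure 5(d)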

3.2 Proper CFF towards target detection

Cai et al. [18] use a circle frequency filter to locate aircraft target centres; the method distinguishes aircraft from the background based on the shape features of aircraft, but it depends strongly on the radius parameter. A fixed radius can only establish the centres of aircraft of the corresponding scale. In order to overcome this radius dependency, the pixel size information of different aircraft at different altitudes given in Table 2 is applied, and a group of radius parameters is set empirically for each imaging height. The settings of the radius parameters are given in Table 4, where Ri denotes the radius set used when the imaging height is i km.

Table 4. Setting of the radius parameters r

Imaging altitude (km)   R (pixels)
2                       R2 = {10, 12, 14, 15}
3                       R3 = {4, 6, 8, 10}
4                       R4 = {3, 4, 5, 6}

The CFF image is obtained by processing the original image with the circle frequency filter and normalizing the pixel intensities to [0, 1]. The CFF response at aircraft centres is much higher than elsewhere, so a threshold segmentation is applied to the CFF image to obtain aircraft centre candidate points; in this work, the threshold is set to 0.65. The CFF images mentioned in the following are the images after threshold segmentation.
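A rough sketch of this step is given below. It assumes, following the circle-frequency idea of [18], that the response at a pixel is the magnitude of one low-order DFT coefficient of the gray values sampled on a circle of radius r; the harmonic order, the number of samples and all implementation details are assumptions here rather than specifications from the paper.

import numpy as np

def cff_response(img, r, n_samples=32, harmonic=4):
    # Per-pixel circle-frequency response, normalised to [0, 1]
    h, w = img.shape
    angles = 2 * np.pi * np.arange(n_samples) / n_samples
    dy = np.round(r * np.sin(angles)).astype(int)
    dx = np.round(r * np.cos(angles)).astype(int)
    basis = np.exp(-2j * np.pi * harmonic * np.arange(n_samples) / n_samples)
    resp = np.zeros(img.shape, dtype=float)
    for y in range(r, h - r):
        for x in range(r, w - r):
            ring = img[y + dy, x + dx].astype(float)   # gray values sampled on the circle
            resp[y, x] = abs(np.dot(ring, basis))
    m = resp.max()
    return resp / m if m > 0 else resp

def center_candidates(img, radii, threshold=0.65):
    # Union of the thresholded CFF responses over one altitude's radius set (Table 4)
    mask = np.zeros(img.shape, dtype=bool)
    for r in radii:
        mask |= cff_response(img, int(r)) > threshold
    return np.argwhere(mask)     # (row, col) aircraft-centre candidate points

# e.g. pts = center_candidates(gray_frame, radii=[10, 12, 14, 15]) for a 2 km image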

3.3 Extraction of Candidate Regions based on Adaptive Hierarchical Clustering

A candidate region is defined as an image patch whose centre is an aircraft centre candidate point. If regions around all aircraft centre candidate points were extracted as candidate regions, the detection speed could not be improved significantly. If, instead, the candidate points are grouped into several classes by a clustering method and the class centres are taken as new aircraft centres, the detection is accelerated. The k-means algorithm, the k-medoids algorithm and hierarchical clustering are commonly used in cluster analysis. K-means attempts to minimize the total squared error, while k-medoids minimizes the sum of dissimilarities between the points assigned to a cluster and the point designated as the centre of that cluster. In contrast to k-means, k-medoids chooses data points as centres and can be more robust to noise and outliers. Both methods need to select initial cluster centres randomly and to set the number of clusters beforehand. Owing to the random selection of the initial cluster centres, two adjacent points may be clustered into different classes, as shown in Figure 6, which is inconsistent with the actual situation.


(a) Original image (b) CFF result (c) clustering result

Figure 6. Results of k-medoids clustering algorithms

In order to avoid this issue, hierarchical clustering [19] is applied in this paper to cluster the aircraft centre candidate points; it groups the data step by step based on the nearest distance among all pairwise distances between the data points. This method also requires the number of clusters to be set in advance, but the number of aircraft in each RSI is unknown, so the number of clusters cannot be given beforehand. Hence, an adaptive hierarchical clustering method is proposed to establish the proper number of clusters. The change in the within-class distance becomes gradually smoother as the number of clusters increases; based on this fact, the appropriate number of clusters can be determined by Algorithm 1.
If the number of clusters is cNum and its second-order within-class distance difference is a local minimum, or the ratio of its backward within-class distance difference to its forward within-class distance difference is greater than 5, then cNum is taken as the proper number of clusters. Moreover, if the distance between some two classes is too small when the number of clusters is k, the proper number of clusters should be less than k. Hence, β is used as a threshold, set to 25 pixels in this paper, and label counts the class pairs whose distance is less than β; when the number of clusters is k and label > 1, the proper number of clusters should be less than k. cNumMax is the maximum number of clusters and is set to 20 in this paper, and cNum is the resulting number of clusters. After cNum is obtained, the areas around the cluster centres are regarded as candidate regions. The final candidate regions are obtained by combining the candidate regions for each parameter in Ri; the result when the imaging altitude is 2 km is shown in Figure 7, where Ri(j) denotes the jth element of the set Ri.

Figure 7. Result of candidate region extraction

The candidate region extraction method can locate candidate regions precisely when the aircraft have distinctive shapes and the RSIs show sharp contrast between aircraft and background. However, some false alarms appear when the background is complex and the intensity distribution is uneven. To detect aircraft precisely, learning methods are required to screen out the targets from the candidate regions. In the next section, a detection method based on the BOW model of candidate regions is presented for aircraft target detection towards air-to-ground dynamic imaging.

Algorithm 1 Adaptive Hierarchical Clustering

Input: CFF image I, maximum cluster number cNumMax, cluster variable j
Output: proper number of clusters cNum
Initialization: cNumMax = 20, j = 0

1. for j = 0 : cNumMax
   1) conduct hierarchical clustering on I with j clusters
   2) calculate the average within-class distance and the distances between classes for j clusters
   3) calculate label
   4) store the average within-class distance in the array withArray
   end
2. Calculate the first-order difference of withArray and denote it diffArray
3. Judge label and obtain the maximum admissible value of cNum, denoted ind:
   if (label) then ind = find(label, 1) else ind = cNumMax − 1
4. Obtain the proper cNum:
   while (ind)
   {
       bd = diffArray(ind − 1) − diffArray(ind)
       fd = diffArray(ind) − diffArray(ind + 1)
       if (bd > 0 && fd < 0 || bd / fd >= 5) break;
       else ind = ind − 1;
   }
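A simplified sketch of this adaptive selection, using SciPy's single-linkage (nearest-distance) hierarchical clustering, is shown below. It keeps the spirit of Algorithm 1 (sweep the cluster count, stop where the within-class distance curve flattens, and reject counts whose class centres are closer than beta) but is not a line-by-line translation of the pseudocode.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def mean_within_class_distance(points, labels):
    # Average pairwise distance inside each cluster, averaged over the clusters
    per_cluster = []
    for c in np.unique(labels):
        members = points[labels == c]
        per_cluster.append(pdist(members).mean() if len(members) > 1 else 0.0)
    return float(np.mean(per_cluster))

def adaptive_cluster_count(points, max_clusters=20, beta=25.0, ratio=5.0):
    Z = linkage(points, method="single")          # nearest-distance (single-link) merging
    within, upper = [], max_clusters
    for k in range(1, max_clusters + 1):
        labels = fcluster(Z, t=k, criterion="maxclust")
        centres = np.array([points[labels == c].mean(axis=0) for c in np.unique(labels)])
        if len(centres) > 1 and pdist(centres).min() < beta:
            upper = min(upper, k - 1)             # centres too close: true count is below k
        within.append(mean_within_class_distance(points, labels))
    for k in range(2, min(upper, max_clusters - 1) + 1):
        drop_before = within[k - 2] - within[k - 1]   # decrease when moving to k clusters
        drop_after = within[k - 1] - within[k]        # decrease when adding one more cluster
        if drop_before >= ratio * max(drop_after, 1e-9):
            return k                              # curve flattens here: take k clusters
    return max(upper, 1)

# points: an (n, 2) array of (row, col) candidate centre points from the CFF step
# k = adaptive_cluster_count(points)
# labels = fcluster(linkage(points, "single"), t=k, criterion="maxclust")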

4 Aircraft target detection based on BOW model of candidate regions
4.1 Feature Extraction and Building of BOW Model of Candidate Regions

In this section, an aircraft target detection method based on the BOW model of candidate regions is proposed to detect aircraft precisely. The bag-of-words (BOW) model, initially proposed for text categorization, has been introduced for object detection in images [20,21]. The model treats an image as a collection of unordered appearance descriptors extracted from local patches, quantizes them into discrete "visual words", and then computes a compact histogram representation.

Given an image, feature description represents the local patches of the image as numerical vectors using feature descriptors. A good descriptor should be able to handle intensity, rotation, scale and affine variations to some extent. The SIFT descriptor has been shown to outperform a set of existing descriptors such as HOG, textural features, etc. Since the maximum size of the aircraft targets is only 40×40 pixels, dense SIFT features are used as the local feature descriptor to avoid the case where no key points are detected at all.
Figure 8 shows the procedure of generating a visual-word codebook and representing features based on the BOW model for candidate regions. First, a set of feature descriptors is obtained by dense SIFT, and a visual-word codebook is formed by the k-means algorithm. With the codebook, each image patch can be represented as a histogram vector.

Figure 8. Procedures of codebook generating and feature representation
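A compact sketch of this codebook-and-histogram pipeline is given below, assuming OpenCV's SIFT implementation on a dense grid and scikit-learn's k-means; the grid step, descriptor size and codebook size are illustrative values rather than those used in the paper.

import cv2
import numpy as np
from sklearn.cluster import KMeans

def dense_sift(gray, step=4, size=8):
    # Dense SIFT: descriptors computed on a regular grid instead of detected key points
    keypoints = [cv2.KeyPoint(float(x), float(y), size)
                 for y in range(step, gray.shape[0] - step, step)
                 for x in range(step, gray.shape[1] - step, step)]
    _, descriptors = cv2.SIFT_create().compute(gray, keypoints)
    return descriptors

def build_codebook(patches, n_words=200):
    # Visual-word codebook: k-means over the pooled dense-SIFT descriptors
    all_desc = np.vstack([dense_sift(p) for p in patches])
    return KMeans(n_clusters=n_words, n_init=4, random_state=0).fit(all_desc)

def bow_histogram(patch, codebook):
    # Normalised histogram of visual-word assignments for one candidate region
    words = codebook.predict(dense_sift(patch))
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / (hist.sum() + 1e-9)

# patches: a list of 40x40 grayscale (uint8) candidate-region crops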

4.2 Aircraft Target Detection based on BOW Model

SVMs deliver excellent performance in real-world applications such as text categorization, hand-written character recognition and image classification, and are now established as one of the standard tools for machine learning and data mining.

In this paper, an SVM with a Gaussian kernel is used as the classifier because of its excellent characteristics in detection tasks. The aircraft target detection method in RSIs towards air-to-ground dynamic imaging is shown in Algorithm 2. For each test image, the candidate region extraction method is applied to obtain candidate regions and reduce the detection time. Then dense SIFT features are extracted from the candidate regions and encoded into histogram vectors based on the BOW model. Finally, the histogram vectors are fed into the trained Gaussian-kernel SVM to discriminate whether they are targets or not.

Algorithm 2 Aircraft target detection in RSIs towards A/G dynamic imaging

Input: Image I, imaging altitude i km


Output: Location of aircraft targets
Conduct image preprocessing on I
Extract candidate regions from I
Conduct feature representation for the candidate regions based on the BOW model
Classify the candidate regions with the kernel SVM to detect aircraft targets

In order to achieve better generalization of the SVM, cross-validation is used to optimize the SVM penalty parameter and kernel parameter, so that the prediction accuracy and robustness are effectively improved. Because multiple radii may be selected as CFF parameters, multiple overlapping candidate regions can result. To ensure detection accuracy, multiple overlapping detections are fused based on the distance between detection boxes; Figure 9 shows the result of this fusion. For convenience, a square window is used to mark the detections.
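The sketch below illustrates these two steps with scikit-learn: a Gaussian-kernel SVM whose penalty and kernel parameters are chosen by cross-validated grid search, and a simple distance-based fusion of overlapping detections; the parameter grid and the fusion distance are placeholder values, not those reported in the paper.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def train_detector(X, y):
    # X: BOW histograms of training patches (one row per sample); y: 1 = aircraft, 0 = background
    grid = GridSearchCV(SVC(kernel="rbf"),
                        param_grid={"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]},
                        cv=5)
    grid.fit(X, y)
    return grid.best_estimator_

def fuse_detections(centres, min_dist=20.0):
    # Merge detections whose box centres lie closer than min_dist pixels
    centres = list(centres)
    fused = []
    while centres:
        cy, cx = centres.pop(0)
        group = [(cy, cx)] + [c for c in centres if np.hypot(c[0] - cy, c[1] - cx) < min_dist]
        centres = [c for c in centres if np.hypot(c[0] - cy, c[1] - cx) >= min_dist]
        fused.append(tuple(np.mean(group, axis=0)))
    return fused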

4.3 Performance Analysis of Detection Algorithm

The proposed method is applicable not only to aircraft target detection in RSIs for air-to-ground dynamic imaging but also to RSIs with a fixed spatial resolution. It can cope with the size changes that occur in air-to-ground dynamic imaging and can also detect small aircraft targets in complex backgrounds at higher imaging altitudes, which is a challenging problem. The method is robust across multiple aircraft types, such as transport aircraft, early-warning aircraft and bombers, but it may not be suitable for fighters because of their very small size. Our method focuses on RSIs generated from 2 km to 4 km imaging height, so the detection accuracy will not be satisfactory when the imaging height exceeds 4 km. This work also provides a promising way to detect complex airplanes at different pixel resolutions.


(a) Multiple overlapping detection (b) Fused detection results

Figure 9. Fusion of multiple detections

5 Comprehensive experiments and comparative analysis

5.1 Datasets and Experiment Environment

Due to the lack of standard remote sensing image datasets for the target detection task, especially ones generated during dynamic imaging, the performance of our method is evaluated on 82 images collected from Google Earth according to the imaging conditions calculated in Section 2, with imaging altitudes from 2 km to 4 km. Each image contains several aircraft, which may belong to different types and have different orientations or sizes. The training set is built by choosing 400 image patches, each with an aircraft at its centre, as positive samples, and 1300 image patches selected randomly from the background of the airport images as negative samples. Since the proposed algorithm is robust to varying size, shape and direction, a sample size of 40×40 pixels is chosen and the samples do not need to be rotated to ensure rotation invariance. Some samples from the training dataset are illustrated in Figure 10. The database is challenging: some aircraft are not clear and the spatial resolution varies, so the size of the aircraft and the complexity of the background change. The proposed method was run on a PC with an Intel i5-5200U 2.20-GHz CPU and 8 GB of RAM, implemented in MATLAB.

(a) Positive training samples (b) Negative training samples


Figure 10. Training set

5.2 Experiment Results and Comparative Analysis

Algorithm 2 is applied to detect aircraft targets towards air-to-ground dynamic imaging. In testing, the goal is to predict the bounding boxes of all objects located in a given image. Since the spatial resolution varies during dynamic imaging, the size of the aircraft decreases as the imaging altitude increases; the bounding box sizes are therefore set to 40×40, 36×36 and 28×28 pixels for imaging heights of 2 km, 3 km and 4 km respectively.
Figure 11 (shown at the end of this paper) presents the detection results on part of the test images; the rows show the detection results for imaging altitudes of 2 km, 3 km and 4 km respectively. The experimental dataset contains aircraft targets of different sizes, various orientations and all sorts of shapes, which guarantees the validity and robustness of our method.

Figure 11. Detection results towards air-to-ground dynamic imaging

In our experiments, to quantify the detection results, an aircraft is considered to be detected correctly if more than 70% of it is detected. The detection rate (DR) and the false alarm rate (FAR) are used to evaluate the performance of our method and are defined in Eq. (2):

DR = \frac{\text{number of correctly detected aircraft}}{\text{number of all aircraft in the image}} \times 100\%, \qquad
FAR = \frac{\text{number of false detections}}{\text{number of all aircraft in the image}} \times 100\%. \qquad (2)

The accuracy assessment is given in Table 5, where CRL denotes candidate region localization and Accuracy of CRL is the accuracy rate of candidate region localization.
Table 5. Detection results

Imaging altitude (km)   Number of test images   Number of aircraft   Accuracy of CRL   DR       FAR
2                       35                      163                  100%              96.32%   3.7%
3                       28                      162                  99.38%            94.44%   4.9%
4                       19                      131                  98.47%            92.37%   7.6%

To validate the speed of the proposed algorithm, the average computation time of our method is compared in Table 6 with that of detection without candidate region extraction. The comparison shows that our method greatly improves detection speed, which is significant for practical applications.

Table 6. Average running time comparison

                              Candidate regions   Whole image
Average running time (s)      15.61               416.86

Moreover, the proposed method can detect aircraft in some special cases, as shown in Figure 12. One aircraft is in a state of optical camouflage, yet the algorithm is still able to detect it. Furthermore, aircraft can be located precisely even when the background is quite complex and the gray-level difference between the aircraft and the airport is not obvious.

Figure 12. Detection results for special cases: (a) original image; (b) detection result

6 Conclusion

This paper presents a new approach for aircraft detection in remote sensing images for air-to-ground dynamic imaging. A proper CFF and hierarchical clustering are applied to obtain candidate regions for aircraft in order to achieve rapid detection. To obtain a better representation of aircraft, the BOW model is introduced to encode dense SIFT features of the candidate regions into middle-level features. Finally, a well-trained SVM is used to detect aircraft accurately. The experiments show that our method is an effective and fast algorithm that can detect targets of different directions, sizes and shapes. The method is also effective in some special cases, such as optical camouflage and complex backgrounds. In future work, detection of aircraft at higher imaging altitudes, where the RSIs have more complex backgrounds and smaller aircraft targets, will be studied further.

Acknowledgment: This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 61273350 and U1435220) and the Beijing Science and Technology Project of China (Grant No. D16110400130000-D161100001316001).

References
[1] TELLO M, LOPEZ-MARTINEZ C, MALLORQUI J J. A novel algorithm for ship detection in SAR
imagery based on the wavelet transform [J]. IEEE Geoscience & Remote Sensing Letters, 2005,
2(2): 201-5.
[2] SIRMACEK B, UNSALAN C. Urban-Area and Building Detection Using SIFT Keypoints and Graph
Theory [J]. Geoscience & Remote Sensing IEEE Transactions on, 2009, 47(4): 1156-67.
[3] ZHU D, WANG B, ZHANG L. Airport Target Detection in Remote Sensing Images: A New Method
Based on Two-Way Saliency [J]. IEEE Geoscience & Remote Sensing Letters, 2015, 12(5):
1096-100.
[4] WANG J, YANG X, QIN X, et al. An Efficient Approach for Automatic Rectangular Building
Extraction From Very High Resolution Optical Satellite Imagery [J]. IEEE Geoscience & Remote
Sensing Letters, 2015, 12(3): 487-91.
[5] AN Z, SHI Z, TENG X, et al. An automated airplane detection system for large panchromatic
image with high spatial resolution ☆ [J]. Optik - International Journal for Light and Electron
Optics, 2014, 125(12): 2768-75.
[6] LIU G, SUN X, FU K, et al. Aircraft Recognition in High-Resolution Satellite Images Using Coarse-
to-Fine Shape Prior [J]. IEEE Geoscience & Remote Sensing Letters, 2013, 10(3): 573-7.
[7] HUANG J, ZHANG H. Plane detection based on support vector machine and information fusion
[J]. Electronics Optics & Control, 2008, 75(495–504.
[8] HAN J, ZHANG D, CHENG G, et al. Object Detection in Optical Remote Sensing Images Based
on Weakly Supervised Learning and High-Level Feature Learning [J]. IEEE Transactions on
Geoscience & Remote Sensing, 2015, 53(6): 3325-37.
[9] MALLINIS G, KOUTSIAS N, TSAKIRI-STRATI M, et al. Object-based classification using Quickbird
imagery for delineating forest vegetation polygons in a Mediterranean test site [J]. Isprs Journal
of Photogrammetry & Remote Sensing, 2008, 63(2): 237-50.
[10] XU S, FANG T, LI D, et al. Object Classification of Aerial Images With Bag-of-Visual Words [J]. IEEE
Geoscience & Remote Sensing Letters, 2010, 7(2): 366-70.
[11] LOWE D G. Distinctive Image Features from Scale-Invariant Keypoints [J]. International Journal of
Computer Vision, 2004, 60(60): 91-110.
[12] DALAL N, TRIGGS B. Histograms of oriented gradients for human detection; proceedings of the
IEEE Conference on Computer Vision & Pattern Recognition, F, 2013 [C].

[13] LIU L, SHI Z. Airplane detection based on rotation invariant and sparse coding in remote
sensing images [J]. Optik - International Journal for Light and Electron Optics, 2014, 125(18):
5327-33.
[14] ZHANG W, SUN X, WANG H, et al. A generic discriminative part-based model for geospatial
object detection in optical remote sensing images [J]. Isprs Journal of Photogrammetry &
Remote Sensing, 2015, 99(30-44.
[15] SUN H, SUN X, WANG H, et al. Automatic Target Detection in High-Resolution Remote Sensing
Images Using Spatial Sparse Coding Bag-of-Words Model [J]. IEEE Geoscience & Remote
Sensing Letters, 2012, 9(1): 109-13.
[16] ZHOU P, ZHANG D, CHENG G, et al. Negative Bootstrapping for Weakly Supervised Target
Detection in Remote Sensing Images; proceedings of the IEEE International Conference on
Multimedia Big Data, F, 2015 [C].
[17] TOMASI C, MANDUCHI R. Bilateral Filtering for Gray and Color Images; proceedings of the
International Conference on Computer Vision, F, 1998 [C].
[18] CAI A H, SU Y. Airplane detection in remote sensing image with a circle-frequency filter [J].
Proceedings of SPIE - The International Society for Optical Engineering, 2005, 5985(529-34.
[19] JOHNSON S C. Hierarchial Clustering Schemes [J]. Psychometrika, 1967, 32(3): 241-54.
[20] LI F F, PERONA P. A Bayesian Hierarchical Model for Learning Natural Scene Categories;
proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, F, 2005 [C].
[21] CAO Y, WANG C, LI Z, et al. Spatial-bag-of-features; proceedings of the IEEE Conference on
Computer Vision & Pattern Recognition, F, 2010 [C].
Xiao-mei LIU*, Shuai ZHU
A New Third-order Explicit Symplectic Scheme for
Hamiltonian Systems
Abstract: Symplectic geometric algorithms are superior to other standard methods
in preserving structures and long-time tracking ability when solving Hamiltonian
systems, but the phase errors do accumulate. A fractional-step symmetric symplectic
method (FSJS), composing low order schemes, is presented for separable Hamiltonian
systems. A new third-order scheme derived from FSJS, is appropriate for engineering
application, and its phase error is smaller than the well-known third-order symplectic
Runge-Kutta (SRK3) scheme. Some numerical examples are given to show the
efficiency of our scheme.

Keywords: Hamiltonian system; Duffing equation; Symplectic partitioned Runge-


Kutta algorithm; Runge-Kutta algorithm

1 Introduction

The most important feature of Hamiltonian systems, which describe all real physical processes without loss of energy, is that their dynamical evolution obeys conservation laws. In the late 1980s, Kang Feng first proposed and developed symplectic algorithms for solving ODEs in Hamiltonian form [1]. These methods, designed from symplectic geometric structures, are superior to other standard methods in preserving invariants and in long-term tracking ability. They have been widely used in studies of the atmosphere, earth sciences, GPS, the dynamic processes of gas turbines [2] and so on. Wan-xie Zhong, a well-known academician, proposed a structure-preserving algorithm by modifying the semi-analytical method of elasticity [3].
Although symplectic algorithms, unlike Runge-Kutta algorithms, are not dissipative, phase errors do exist, and their accumulation may distort the dynamic response. Peter Görtz analyzed the single-step phase errors of symplectic methods [4]. Yu-feng Xing showed that the phase accuracy of the third-order explicit symplectic method is higher than that of the fourth-order explicit and implicit symplectic integration algorithms [5-6].
In this paper, we designed a way to construct symplectic methods by composing
low order schemes for separable systems, and derived a new 3rd order scheme. And

*Corresponding author: Xiao-mei LIU, Department of Mathematics, Shanghai Second Polytechnic


University, Shanghai, China, E-mail: Liuxiaomei5@sina.com
Shuai ZHU, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China

its phase error is smaller than the well-known third-order symplectic Runge-Kutta
method (SRK3).

2 Hamiltonian Systems and Symplectic P-R-K Algorithms

2.1 Separable Hamiltonian Systems

In mechanics, many problems are formulated as special second-order ODEs (Newtonian equations)

\ddot{x} = f(x),

where x is an n-dimensional vector. Assume y = \dot{x}; then we have

\dot{x} = y, \qquad \dot{y} = f(x), \qquad (1)

which are called the Hamiltonian canonical equations. This is a class of special separable Hamiltonian systems, meaning that the system can be written in the form

H(x, y) = \tfrac{1}{2} y^{T} y - u(x),

where u(x) = \int f(x)\, \mathrm{d}x. The corresponding Hamiltonian equations are

\frac{\mathrm{d}x}{\mathrm{d}t} = \frac{\partial H}{\partial y} = y, \qquad \frac{\mathrm{d}y}{\mathrm{d}t} = -\frac{\partial H}{\partial x} = f(x).

For notational simplicity, we introduce the symbol z = (x, y)^{T}, so that the Hamiltonian equations can be further simplified to

\dot{z} = J^{-1} H_{z}, \qquad (2)

where J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix} is the standard symplectic matrix and I_n is the n \times n identity matrix.

2.2 Symplectic Algorithms

Generating function methods, symplectic Runge-Kutta methods and composition methods are the main ways to construct symplectic difference schemes. The generating function method [7], a generalization of Hamilton-Jacobi theory, is not well suited to engineering applications because complicated higher-order derivatives of the Hamiltonian function H need to be calculated. The composition method is a good way to obtain schemes of arbitrary even order by “composing” self-adjoint schemes of low order [8].
F. Lasagni, J. M. Sanz-Serna, Y. B. Suris et al. developed general conditions under which traditional numerical methods for ODEs, such as Runge-Kutta (RK) methods, preserve the symplectic structure of Hamiltonian systems, and thereby proposed symplectic Runge-Kutta (SRK) methods [9-11].
Definition 1. A symplectic R-K method is an R-K method whose transformation of (2), i.e., the Jacobian matrix \partial z^{k+1} / \partial z^{k}, is everywhere symplectic:

\left( \frac{\partial z^{k+1}}{\partial z^{k}} \right)^{T} J \left( \frac{\partial z^{k+1}}{\partial z^{k}} \right) = J.
3 Fractional-Step Symmetric Symplectic Scheme

In this section, a new way to construct symplectic schemes by composing low-order schemes is proposed, whose coefficients can be obtained more easily.

3.1 The Construction of the Method

It is well known that a first-order scheme for (1) is

A: \quad x^{k+1} = x^{k} + \tau y^{k+1}, \qquad y^{k+1} = y^{k} - \tau f(x^{k}),

and its symmetric counterpart is

B: \quad x^{k+1} = x^{k} + \tau y^{k}, \qquad y^{k+1} = y^{k} - \tau f(x^{k+1}),

where \tau is the step size, and x^{k} and y^{k} are the numerical solutions at t_k.
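Viewed as update maps, the two sub-steps are easy to implement. The following is a minimal C# sketch (an illustration, not part of the original derivation) that realizes schemes A and B as small step functions, keeping the sign convention used above.

using System;

// First-order symplectic sub-steps A and B of (1), written as explicit updates.
// They follow the sign convention used above: y is advanced by -tau*f(x).
static class FirstOrderSchemes
{
    // Scheme A: advance y with the old x, then advance x with the new y.
    public static (double x, double y) StepA(double x, double y, Func<double, double> f, double tau)
    {
        double yNew = y - tau * f(x);
        double xNew = x + tau * yNew;
        return (xNew, yNew);
    }

    // Scheme B (the symmetric counterpart): advance x with the old y, then advance y with the new x.
    public static (double x, double y) StepB(double x, double y, Func<double, double> f, double tau)
    {
        double xNew = x + tau * y;
        double yNew = y - tau * f(xNew);
        return (xNew, yNew);
    }
}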


Theorem 1. Scheme A and scheme B are symplectic.
Proof. The Jacobian matrix of scheme A is

M_A = \begin{pmatrix} 1 - \tau^{2} f'(x^{k}) & \tau \\ -\tau f'(x^{k}) & 1 \end{pmatrix},

and the Jacobian matrix of scheme B is

M_B = \begin{pmatrix} 1 & \tau \\ -\tau f'(x^{k+1}) & 1 - \tau^{2} f'(x^{k+1}) \end{pmatrix}.

Obviously,

M_A^{T} J M_A = J, \qquad M_B^{T} J M_B = J,

so scheme A and scheme B are symplectic by Definition 1. This completes the proof.
Definition 2. An s-stage fractional-step symmetric scheme (FSJS_s) is the method composed as

\prod_{i,j} A(a_i \tau)\, B(b_j \tau), \qquad i = 1, 2, \dots, [(s+1)/2], \quad j = 1, 2, \dots, [s/2].

If s is odd, the Jacobian matrix of FSJS_s is

M_{\Pi} = \frac{\partial z^{\,k + (a_1 + b_1 + \cdots + b_{[s/2]} + a_{[(s+1)/2]})}}{\partial z^{k}}
= \frac{\partial z^{\,k + (a_1 + b_1 + \cdots + a_{[(s+1)/2]})}}{\partial z^{\,k + (a_1 + b_1 + \cdots + b_{[s/2]})}} \cdots \frac{\partial z^{\,k + a_1}}{\partial z^{k}}
= M_{A(a_{[(s+1)/2]})}\, M_{B(b_{[s/2]})} \cdots M_{A(a_1)}.

Theorem 2. An s-stage fractional-step symmetric method (FSJS_s) is a symplectic scheme.
Proof. If s is odd,

M_{\Pi}^{T} J M_{\Pi}
= \bigl( M_{A(a_{[(s+1)/2]})} M_{B(b_{[s/2]})} \cdots M_{A(a_1)} \bigr)^{T} J \bigl( M_{A(a_{[(s+1)/2]})} M_{B(b_{[s/2]})} \cdots M_{A(a_1)} \bigr)
= M_{A(a_1)}^{T} \cdots M_{B(b_{[s/2]})}^{T} \bigl( M_{A(a_{[(s+1)/2]})}^{T} J M_{A(a_{[(s+1)/2]})} \bigr) M_{B(b_{[s/2]})} \cdots M_{A(a_1)}
= \cdots = M_{A(a_1)}^{T} J M_{A(a_1)} = J.

For the case that s is even, a similar argument applies. Hence, FSJS_s is a symplectic algorithm.

3.2 The Third-order Symplectic Schemes

Yu-feng Xing showed that the phase accuracy of the third-order explicit symplectic method is higher than that of the fourth-order explicit and implicit symplectic integration algorithms. Taking both efficiency and accuracy into account, the third-order explicit symplectic method is therefore superior to the others for linear and nonlinear separable systems, and we construct third-order FSJS_s schemes accordingly [13].
Next we take a 5-stage symplectic method (FSJS_5) as an example. To construct the third-order scheme FSJS_5^3, the sub-steps are

A(a\tau): \quad x^{k+1/5} = x^{k} + a\tau\, y^{k+1/5}, \qquad y^{k+1/5} = y^{k} - a\tau f(x^{k}),
B(b\tau): \quad x^{k+2/5} = x^{k+1/5} + b\tau\, y^{k+1/5}, \qquad y^{k+2/5} = y^{k+1/5} - b\tau f(x^{k+2/5}),
A(c\tau): \quad x^{k+3/5} = x^{k+2/5} + c\tau\, y^{k+3/5}, \qquad y^{k+3/5} = y^{k+2/5} - c\tau f(x^{k+2/5}),
B(d\tau): \quad x^{k+4/5} = x^{k+3/5} + d\tau\, y^{k+3/5}, \qquad y^{k+4/5} = y^{k+3/5} - d\tau f(x^{k+4/5}),
A(e\tau): \quad x^{k+1} = x^{k+4/5} + e\tau\, y^{k+1}, \qquad y^{k+1} = y^{k+4/5} - e\tau f(x^{k+4/5}).

Obviously, this scheme is simple enough for engineering application. Combining the sub-steps, we get

x^{k+1} = x^{k} + (a+b)\tau\, y^{k+1/5} + (c+d)\tau\, y^{k+3/5} + e\tau\, y^{k+1},
y^{k+1} = y^{k} - a\tau f(x^{k}) - (b+c)\tau f(x^{k+2/5}) - (d+e)\tau f(x^{k+4/5}).

By Taylor expansion,

x^{k+1} = x^{k} + (a+b)\tau\, y^{k+1/5} + (c+d)\tau\, y^{k+3/5} + e\tau\, y^{k+1}
= x^{k} + (a+b+c+d+e)\tau\, y^{k}
- \bigl[ a(a+b) + a(c+d) + ae + (b+c)(c+d) + (b+c)e + (d+e)e \bigr] \tau^{2} f(x^{k})
+ \bigl[ (a+b)(b+c)(c+d) + (a+b)(b+c)e + (a+b)(d+e)e + (c+d)(d+e)e \bigr] \tau^{3} f'(x^{k})\, y^{k} + O(\tau^{4}),

y^{k+1} = y^{k} - a\tau f(x^{k}) - (b+c)\tau f(x^{k+2/5}) - (d+e)\tau f(x^{k+4/5})
= y^{k} - (a+b+c+d+e)\tau f(x^{k})
- \bigl[ (a+b)(b+c) + (a+b)(d+e) + (c+d)(d+e) \bigr] \tau^{2}\, y^{k} f'(x^{k})
+ \bigl[ a(a+b)(b+c) + a(a+b)(d+e) + a(c+d)(d+e) + (b+c)(c+d)(d+e) \bigr] \tau^{3} f(x^{k}) f'(x^{k}) + O(\tau^{4}).

Let p = a + b, q = b + c, r = c + d, s = d + e; then the order conditions are


a + q + s = 1                          (a)
p + r + e = 1                          (b)
pq + ps + rs = 1/2                     (c)
ap + ar + qr + (a + q + s)e = 1/2      (d)          (3)
apq + aps + ars + qrs = 1/6            (e)
pqr + pqe + pse + rse = 1/6            (f)
qp^2 + s(p + r)^2 = 1/3                (g)

We thus have 7 equations in 5 unknowns (system (3)). However, one can deduce that (a)(b)(c) ⇒ (d); (a) ⇔ (b); (b)(f) ⇒ (g). Therefore system (3) can be simplified to

a + q + s = 1                          (a)
pq + ps + rs = 1/2                     (c)
apq + aps + ars + qrs = 1/6            (e)
pqr + pqe + pse + rse = 1/6            (f)

with 5 unknowns, so we have one free variable.

Set s = 1; from (a) we have q = -a. From (e), if a ≠ 1,

p = \frac{1}{6(a - a^{2})},

and from (c),

r = \frac{1}{2} - \frac{1}{6a},

so that

e = \frac{1}{2} - \frac{1}{6(1 - a)}.

Substituting into (f) gives 3a^{3} - 4a + 1 = 0, whose roots are

a = 1, \qquad a = \frac{-3 + \sqrt{21}}{6}, \qquad a = \frac{-3 - \sqrt{21}}{6}.

We have thus found the following solutions:

a = \gamma, \quad b = \frac{-6\gamma^{3} + 6\gamma^{2} - 1}{6\gamma(\gamma - 1)}, \quad c = \frac{1}{6\gamma(\gamma - 1)}, \quad d = \frac{3\gamma - 4}{6(\gamma - 1)}, \quad e = \frac{3\gamma - 2}{6(\gamma - 1)},

where \gamma = \frac{-3 \pm \sqrt{21}}{6}; and, if a = 1,

a = 1, \quad b = -\frac{25}{24}, \quad c = \frac{9}{24}, \quad d = \frac{9}{24}, \quad e = \frac{7}{24}.

Therefore the compositions

A(\gamma\tau)\, B\!\left(\frac{-6\gamma^{3} + 6\gamma^{2} - 1}{6\gamma(\gamma - 1)}\tau\right) A\!\left(\frac{1}{6\gamma(\gamma - 1)}\tau\right) B\!\left(\frac{3\gamma - 4}{6(\gamma - 1)}\tau\right) A\!\left(\frac{3\gamma - 2}{6(\gamma - 1)}\tau\right), \qquad \gamma = \frac{-3 \pm \sqrt{21}}{6},

and

A(\tau)\, B\!\left(-\frac{25}{24}\tau\right) A\!\left(\frac{9}{24}\tau\right) B\!\left(\frac{9}{24}\tau\right) A\!\left(\frac{7}{24}\tau\right)

are third-order 5-stage fractional-step symmetric symplectic methods FSJS_5^3.

Notation 1. If e = 1, the composition is A\!\left(\frac{7}{24}\tau\right) B\!\left(\frac{9}{24}\tau\right) A\!\left(\frac{9}{24}\tau\right) B\!\left(-\frac{25}{24}\tau\right) A(\tau), i.e., the well-known third-order symplectic method (SRK3) [12]. The scheme reads

x_{1} = x_{k} + \tfrac{7}{24}\tau y_{k}, \qquad y_{1} = y_{k} - \tfrac{2}{3}\tau f(x_{1}),
x_{2} = x_{1} + \tfrac{3}{4}\tau y_{1}, \qquad y_{2} = y_{1} + \tfrac{2}{3}\tau f(x_{2}),
x_{k+1} = x_{2} - \tfrac{1}{24}\tau y_{2}, \qquad y_{k+1} = y_{2} - \tau f(x_{k+1}).

Notation 2. The scheme

A(\gamma\tau)\, B\!\left(\frac{-6\gamma^{3} + 6\gamma^{2} - 1}{6\gamma(\gamma - 1)}\tau\right) A\!\left(\frac{1}{6\gamma(\gamma - 1)}\tau\right) B\!\left(\frac{3\gamma - 4}{6(\gamma - 1)}\tau\right) A\!\left(\frac{3\gamma - 2}{6(\gamma - 1)}\tau\right), \qquad \gamma = \frac{-3 \pm \sqrt{21}}{6},

is denoted FSJS3ZL.

Notation 3. The phase accuracy of FSJS3ZL with \gamma = \frac{-3 + \sqrt{21}}{6} is higher than that of SRK3.

4 Numerical Examples

Example 1. Consider an undamped single-degree-of-freedom system:

\dot{x} = y, \qquad \dot{y} = -100x, \qquad x(0) = 0, \quad y(0) = 10.

Exact solution: x = \sin 10t, \; y = 10\cos 10t. Hamiltonian function: H = 100x^{2} + y^{2}.

Figure 1 illustrates that SRK3 and FSJS3ZL (\gamma = \frac{-3 + \sqrt{21}}{6}) have the same third-order accuracy, but the point-wise numerical precision of FSJS3ZL is higher than that of SRK3, which verifies Notation 3. Figure 2 shows that the energy computed by FSJS3ZL is more concentrated than that computed by SRK4; in other words, although FSJS3ZL is a third-order scheme, it is superior to the fourth-order method in preserving the conservation law.
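For readers who wish to reproduce Example 1, the following is a minimal C# sketch (an illustration, not the authors' code) of the composed scheme A(aτ)B(bτ)A(cτ)B(dτ)A(eτ) with the γ-based coefficients derived above. Here f(x) = 100x is taken as the restoring force, so that the sub-steps written with −τ f(x) reproduce ẏ = −100x; this sign interpretation is an assumption made to match Example 1.

using System;

// Sketch: the composition A(a*tau) B(b*tau) A(c*tau) B(d*tau) A(e*tau) with the
// gamma-based coefficients, applied to Example 1 (x' = y, y' = -100x).
class Fsjs3zlDemo
{
    static double F(double x) => 100.0 * x;   // restoring force, so y -= tau*F(x) gives y' = -100x

    static (double x, double y) StepA(double x, double y, double tau)
    { y -= tau * F(x); x += tau * y; return (x, y); }

    static (double x, double y) StepB(double x, double y, double tau)
    { x += tau * y; y -= tau * F(x); return (x, y); }

    static void Main()
    {
        double g = (-3.0 + Math.Sqrt(21.0)) / 6.0;                       // gamma
        double a = g;
        double b = (-6 * g * g * g + 6 * g * g - 1) / (6 * g * (g - 1));
        double c = 1.0 / (6 * g * (g - 1));
        double d = (3 * g - 4) / (6 * (g - 1));
        double e = (3 * g - 2) / (6 * (g - 1));

        double tau = 0.01, x = 0.0, y = 10.0;                            // x(0)=0, y(0)=10
        int steps = 3000;                                                // integrate to T = 30 s
        for (int k = 0; k < steps; k++)
        {
            (x, y) = StepA(x, y, a * tau);
            (x, y) = StepB(x, y, b * tau);
            (x, y) = StepA(x, y, c * tau);
            (x, y) = StepB(x, y, d * tau);
            (x, y) = StepA(x, y, e * tau);
        }
        double T = steps * tau;
        // Compare with the exact solution x = sin(10t) and check H = 100 x^2 + y^2 (= 100 at t = 0).
        Console.WriteLine($"x error = {x - Math.Sin(10 * T):E2}, H = {100 * x * x + y * y:F6}");
    }
}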

Example 2. Consider the nonlinear Duffing oscillator

\dot{x} = y, \qquad \dot{y} = -x - x^{3}, \qquad x(0) = 2, \quad y(0) = 0.

Hamiltonian function: H = x^{2} + \tfrac{1}{2}x^{4} + y^{2}.

Figure 3 shows how FSJS3ZL and SRK3 preserve the Hamiltonian function for different step sizes. Figures 3a and 3b show that FSJS3ZL is better than SRK3 in conserving energy, and that neither scheme is dissipative. As the step size increases, SRK3 diverges faster than FSJS3ZL. In short, the FSJS3ZL method is superior to SRK3 in stability.
Figure 1. (a) The absolute error of x by SRK3 and FSJS3ZL

Figure 1. (b) The absolute error of y by SRK3 and FSJS3ZL


Figure 2. Hamiltonian function by SRK4 and FSJS3ZL


Figure 3. Hamiltonian function by FSJS3ZL and SRK3: (a) τ = 0.1; (b) τ = 0.3; (c) τ = 0.5; (d) τ = 0.8

5 Conclusion

In this paper, we proposed the FSJS method, constructed by composing the first-order schemes A and B, and derived the order conditions of the third-order five-step FSJS scheme A(a\tau)B(b\tau)A(c\tau)B(d\tau)A(e\tau) by Taylor expansion. From this theory, the well-known scheme A\!\left(\frac{7}{24}\tau\right) B\!\left(\frac{9}{24}\tau\right) A\!\left(\frac{9}{24}\tau\right) B\!\left(-\frac{25}{24}\tau\right) A(\tau) (the SRK3 scheme) was recovered, and some other new third-order symplectic schemes were presented. In particular, the new scheme

A(\gamma\tau)\, B\!\left(\frac{-6\gamma^{3} + 6\gamma^{2} - 1}{6\gamma(\gamma - 1)}\tau\right) A\!\left(\frac{1}{6\gamma(\gamma - 1)}\tau\right) B\!\left(\frac{3\gamma - 4}{6(\gamma - 1)}\tau\right) A\!\left(\frac{3\gamma - 2}{6(\gamma - 1)}\tau\right), \qquad \gamma = \frac{-3 + \sqrt{21}}{6},

has higher phase accuracy than the conventional SRK3 scheme. Numerical results show that our scheme is superior to the conventional methods, such as SRK3 and SRK4, in preserving the Hamiltonian function and in numerical precision in the time domain. With the advantages of an explicit scheme and long-term stability, it is a good choice for engineering applications.

Acknowledgment: The work is supported by the National Natural Science Foundation


of China (No.50876066), Foundation of Shanghai Second Polytechnic University
(No.EGD15XQD14), Foundation for University New Teacher by the Commission of
Education of Shanghai (No.ZZZZEGD15007) and Foundation for key subjects by
Shanghai Second Polytechnic University (No.XXKPY1604)

References
[1] K. Feng, “On difference schemes and symplectic geometry,” Proceedings of the 1984 Beijing
Symposium on Differential Geometry and Differential Equations, pp.42–58, 1985.
[2] Yonghong Wang, Shilie Weng, Gang Zhou, Weirong Sun, Xiaomei Liu, “Research on theory and application of dynamic process of gas turbine engines under Hamilton systems,” Chinese Journal of Engineering Thermophysics, vol.10, 2011, pp.1655-1660.
[3] Zhong Wanxie, “Symplectic solution methodology in applied mechanics,” Beijing: Higher Education Press, 2006, pp.50-56.
[4] Peter Görtz, “Backward error analysis of symplectic integrators for linear separable Hamiltonian systems,” J. Comput. Math., vol.20, 2002, pp.449-460.
[5] Xing Yufeng, Yang Rong, “Phase errors and their correction in symplectic implicit single-step
algorithm,” Chinese Journal of Theoretical and Applied Mechanics, vol.39,2007,pp.668-671.
[6] Xing Yufeng, Feng Wei, “Phase analysis of Lie series algorithm and explicit symplectic
algorithm,” Chinese Journal of Computational Mechnics, vol.26,2009,pp.167-171.
[7] Kang Feng, Mengzhao Qin, “Symplectic geometric algorithms for Hamiltonian systems,”
Springer Press, 2010,pp.358-380.
[8] H. Yoshida, “Construction of higher order symplectic integrators,” Physics Letters A, vol.150,
1990,pp.262–268.
[9] F.M. Lasagni, “Canonical Runge–Kutta methods,” Z. Angew. Math. Phys.,
vol.39,1988,pp.952–953.
[10] J. M. Sanz-Serna, “Runge–Kutta schemes for Hamiltonian systems,” BIT, vol.28,
1988,pp.877–883.
[11] Y.B. Suris, “On the conservation of the symplectic structure in the numerical solution of
Hamiltonian systems (in Russian),” In: Numerical Solution of Ordinary Differential Equations,
ed. S.S. Filippov, Keldysh Institute of Applied Mathematics. USSR Academy of Sciences,
Moscow, Second edition, 1988,pp.20-25.
[12] Ruth, R. D., “A canonical integration technique,” IEEE Transaction Nuclear Sciecce.,vol.30,
1983,pp.2669-2671.
[13] Xiaomei Liu, Gang Zhou, Wang Yonghong, Sun Weirong,“Rectifying drifts of symplectic
algorithm,” Chinese Journal of Beijing University of Aeronautics and Astronautics,vol.39,
2013,pp.22-26.
Qi YANG*, Yu-liang QIN, Bin DENG, Hong-qiang WANG
Research on Terahertz Scattering Characteristics of
the Precession Cone
Abstract: Micro-motion is one of the most important features of midcourse ballistic targets, but the scattering characteristics of micro-motion targets in the terahertz band remain unclear. The scattering characteristics of a precession cone in the terahertz band are analyzed in this paper, and the basic law of the sliding-type scattering model is obtained. In addition, a series of cones with different surface roughness parameters is built and the impact of the rough surface of the target is analyzed. The results show that the scattering characteristics of micro-motion targets in the terahertz band differ from those in the microwave band, and that traditional parameter estimation algorithms are no longer applicable to terahertz radar.

Keywords: Terahertz; Micro-Doppler; Sliding-type scattering model; Rough surface


target; Scattering characteristics

1 Introduction

Terahertz (THz) waves usually refer to electromagnetic waves with frequencies


between 0.1-10 THz. The terahertz band lies between the millimeter wave and infrared,
which is a transitional band from electronics to photonics. Its position in the spectrum
confers special properties and applications on terahertz waves that differ from other
bands [1,2]. In recent years, the main research objects of the terahertz radar systems at
home and abroad are stationary or turntable targets, while the micro-motion targets
are rarely involved. In usual cases, radar target may have small vibrations or rotations
or other high order motions in addition to mass translation, which is called micro-
motion. Micro-motion may induce additional frequency modulations on the returned
radar signal which generate sidebands about the target’s Doppler frequency, called
the micro-Doppler effect, and it extends the traditional feature extraction and target
recognition fields [3]. Since micro-Doppler phenomenon is observed in radar target
detection, micro-motion, as one of significant target characteristics with promising
applications in radar target feature extraction and recognition, incur attentions
and research interests. Terahertz waves are sensitive to the micro-Doppler of the
targets due to their short wave lengths, which makes them especially suitable for the

*Corresponding author: Qi YANG, College of Electronic Science and Engineering, National University of
Defense Technology, Changsha, China, e-mail: yangqi_nudt@163.com
Yu-liang QIN, Bin DENG, Hong-qiang WANG, College of Electronic Science and Engineering, National
University of Defense Technology, Changsha, China

detection and recognition of micro-motion targets. However, new scattering characteristics of micro-motion targets emerge in the terahertz band and lead to the failure of the traditional methods used in the microwave band. Consequently, feature extraction and target recognition of micro-motion targets in the terahertz band is a very meaningful research field.
To achieve feature extraction and target recognition in the terahertz band, the first step is to analyze the terahertz scattering characteristics of micro-motion targets. The scattering of radar targets, especially artificial targets, is caused by a series of reflection and scattering mechanisms that do not conform to the ideal scattering-center model. Non-ideal scattering models are summarized and their effects on imaging are analyzed in [4]. In [5], an expression for the micro-Doppler based on the sliding-type scattering model is proposed, and it is pointed out that it is not the sinusoidal form often assumed in other literature. In the terahertz band, studies of micro-motion targets based on traditional methods (e.g. time-frequency analysis and the Radon/iRadon transform) have been reported [6,7]; however, they are mainly based on the ideal scattering model and do not consider the new problems that appear in the terahertz band.
Taking the midcourse target recognition in the terahertz band as research
background, this paper chooses a precession cone as target, and analyzes the
movement of the scattering centers on the target, as well as the effect of surface
roughness. The theoretical analyses are verified by a 3D electromagnetic simulation
software named Computer Simulation Technology (CST).

2 Theory analysis

2.1 The theoretical model of a precession cone

Precession is considered as the midcourse target’s own micro-motion and includes


coning circling the precession axis and spinning circling the symmetry axis of the
target. For convenience, we only consider the micro-motion of the target rather than
translation. The intended target in this paper is a precession rotational symmetric
cone, which is a simplified model of the ballistic missile target during the midcourse.
The simulation data of this paper come from CST. The diagram of the cone observed
by radar is shown in Figure 1.
A reference coordinate system O-XYZ with the mass center of the model as the origin is established. The model precesses around the O-Z axis; the spin angular velocity is Ω, the precession angular velocity is ω, and the precession angle is θ. The azimuth and pitch angles of the line of sight (LOS) are α and β, and the unit vector of the LOS in the reference coordinate system is n = [cosβ cosα, cosβ sinα, sinβ]^T. Meanwhile, a target coordinate system O-xyz with the symmetry axis as the O-z axis is established. In this case, the precession of the target can be divided into three
Figure 1. Diagram of the precession cone in CST

parts: a rotation from the target coordinate system to the reference coordinate system, a coning circling the precession axis O-Z, and a spinning circling the symmetry axis of the target O-z, which correspond to the initial transform matrix R_init, the coning transform matrix R_coni and the spinning transform matrix R_spin, respectively. R_spin in this paper may be replaced by a unit matrix because the cone is rotationally symmetric. The other two transform matrices are

R_{init} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}, \qquad (1)

R_{coni}(t) = \begin{pmatrix} \cos\omega t & -\sin\omega t & 0 \\ \sin\omega t & \cos\omega t & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (2)

Therefore, the distance between the scattering center P located at (x, y, z) and the radar can be expressed as

r(t) = R_0 + [R_{coni}(t) \cdot R_{init} \cdot r_0]^{T} \cdot n
     = R_0 + \sin\beta\,(y\sin\theta + z\cos\theta)
       + [x\cos\alpha + y\sin\alpha\cos\theta - z\sin\alpha\sin\theta]\cos\beta\cos\omega t
       + [x\sin\alpha - y\cos\alpha\cos\theta + z\cos\alpha\sin\theta]\cos\beta\sin\omega t
     = R_0 + \sin\beta\,(y\sin\theta + z\cos\theta) + A\cos(\omega t + \varphi), \qquad (3)

where R_0 is the initial distance between the target and the radar, r_0 = (x, y, z)^{T} is the position of the scattering center in the target coordinate system, A is the coefficient of amplitude modulation (the micro-motion amplitude), and \varphi is the initial phase. Assuming that the transmitting signal is a single-frequency continuous wave

with frequency f_0, the micro-Doppler of the precession scattering center can be obtained according to its definition:

f_{md}(t) = -\frac{2 f_0}{c} \frac{\mathrm{d}r(t)}{\mathrm{d}t} = \frac{2 f_0}{c} A\omega \sin(\omega t + \varphi) = A_{\omega} \sin(\omega t + \varphi_{\omega}). \qquad (4)
The expression of micro-Doppler in Equation (4) is obtained based on the ideal
scattering model, and the micro-Doppler of the precession scattering center is
sinusoid modulated.
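As an illustration of Eqs. (3)-(4), the following minimal C# sketch (not part of the original paper; all parameter values are illustrative only) builds R_init and R_coni(t), projects the rotated scatterer position onto the LOS, and obtains the micro-Doppler of an ideal scattering center by numerically differentiating r(t).

using System;

// Sketch of Eqs. (3)-(4): r(t) for a fixed (ideal) scattering center at r0 = (x, y, z)
// in the target frame, and its micro-Doppler via numerical differentiation of r(t).
class IdealMicroDoppler
{
    const double c = 3e8;

    static double R(double t, double[] r0, double theta, double omega,
                    double alpha, double beta, double R0)
    {
        // Rinit: rotation by the precession angle theta (Eq. (1)).
        double xi = r0[0];
        double yi = Math.Cos(theta) * r0[1] + Math.Sin(theta) * r0[2];
        double zi = -Math.Sin(theta) * r0[1] + Math.Cos(theta) * r0[2];
        // Rconi(t): coning rotation about O-Z (Eq. (2)).
        double xc = Math.Cos(omega * t) * xi - Math.Sin(omega * t) * yi;
        double yc = Math.Sin(omega * t) * xi + Math.Cos(omega * t) * yi;
        double zc = zi;
        // Projection onto the LOS unit vector n = [cosb cosa, cosb sina, sinb].
        double[] n = { Math.Cos(beta) * Math.Cos(alpha), Math.Cos(beta) * Math.Sin(alpha), Math.Sin(beta) };
        return R0 + xc * n[0] + yc * n[1] + zc * n[2];
    }

    static void Main()
    {
        double f0 = 340e9, theta = 10 * Math.PI / 180, omega = 2 * Math.PI;   // 1 Hz coning
        double alpha = Math.PI / 2, beta = 30 * Math.PI / 180, R0 = 1000;
        double[] r0 = { 0.0, 0.013, -0.04 };     // illustrative point in the target frame
        double dt = 1e-4;
        for (double t = 0; t < 1.0; t += 0.1)
        {
            double fmd = -2 * f0 / c *
                (R(t + dt, r0, theta, omega, alpha, beta, R0) - R(t, r0, theta, omega, alpha, beta, R0)) / dt;
            Console.WriteLine($"t = {t:F1} s, micro-Doppler = {fmd:F1} Hz");
        }
    }
}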

2.2 The sliding-type scattering model

In the analysis above, it is assumed by default that the location of the scattering center on the precession target is fixed and moves with the target; this is known as the ideal scattering model. The assumption is approximately valid when the precession angle is relatively small and the carrier frequency is relatively low. However, we find that in the terahertz band the time-frequency distributions are seriously distorted and the ideal scattering model is no longer valid for large precession angles.
The scattering center is a local electromagnetic scattering phenomenon, and in many situations it cannot accurately describe the real behaviour of the target. In reality, the locations of the scattering centers on the precession target depend on the relative relationship between the incident wave and the target, and the movements of the scattering centers do not exactly correspond to the movement of the target. Without considering the surface roughness of the target, theoretical calculation and experimental measurement show that every scattering center corresponds to a discontinuity in the Stratton-Chu integral [8], i.e., geometrically, a discontinuity of the curvature or of the surface. For the model shown in Figure 1, the scattering centers include the cone-top A and the discontinuities B and C on the intersection line of the plane Γ (composed of the LOS and the symmetry axis O-z of the model) with the model, which are called the cone-rear scattering centers.
The backscattering coefficient of the cone top is generally very small according to electromagnetic scattering theory. The movement of the cone-top scattering center coincides with the movement of the target because it lies on the symmetry axis of the target, and its micro-Doppler is sinusoidally modulated. However, the movements of the cone-rear scattering centers B and C differ from the motion of the target itself, and their micro-Doppler curves are no longer simply sinusoidally modulated. The micro-Doppler expression of scattering centers B and C obtained through space analytic geometry is (supposing that the azimuth angle of the LOS is α = π/2) [5]:
f_{md,i}(t) = -\frac{2\omega f_0 \sin\theta \cos\beta}{c} \left[ l_1 \pm \frac{r F(t)}{\sqrt{1 - F(t)^2}} \right] \cos\omega t, \qquad i = B, C, \qquad (5)

where F(t) = \sin\theta\cos\beta\sin\omega t + \cos\theta\sin\beta.


By comparison with the ideal scattering model in Equation (4), the locations of B and C slide along the bottom edge of the cone, which is called the sliding-type scattering model. The result shows that the micro-Doppler of the sliding-type scattering model is in fact a combination of the micro-Doppler of the ideal scattering model and a modulation term, which we call the extra Doppler term:

f_{extra}(t) = \frac{2\omega f_0 r \sin\theta \cos\beta \cos\omega t \cdot F(t)}{c\sqrt{1 - F(t)^2}}. \qquad (6)
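A minimal C# sketch of Eq. (5) is given below for illustration (not part of the original paper). Here r is taken as the cone base radius and l_1 as the axial distance from the mass center to the base plane; these interpretations and all numerical values are assumptions made only for this example.

using System;

// Sketch of the sliding-type micro-Doppler of Eq. (5) for cone-rear scattering centers B and C.
class SlidingMicroDoppler
{
    static void Main()
    {
        const double c = 3e8;
        double f0 = 340e9, theta = 10 * Math.PI / 180, beta = 30 * Math.PI / 180;
        double omega = 2 * Math.PI;           // 1 Hz coning; alpha = pi/2 as assumed in Eq. (5)
        double r = 0.013, l1 = 0.02;          // illustrative target dimensions in metres

        for (double t = 0; t < 1.0; t += 0.1)
        {
            double F = Math.Sin(theta) * Math.Cos(beta) * Math.Sin(omega * t)
                     + Math.Cos(theta) * Math.Sin(beta);
            double common = -2 * omega * f0 * Math.Sin(theta) * Math.Cos(beta) / c * Math.Cos(omega * t);
            double slide = r * F / Math.Sqrt(1 - F * F);   // the extra (sliding) term of Eq. (6)
            Console.WriteLine($"t = {t:F1} s: f_B = {common * (l1 + slide):F1} Hz, " +
                              $"f_C = {common * (l1 - slide):F1} Hz");
        }
    }
}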

2.3 The effect of the surface roughness

The effect of target surface roughness on scattering characteristics exists widely in the infrared, millimeter-wave and terahertz bands. In a statistical sense, two main parameters commonly measured when studying the roughness of a surface are the root mean square (rms) roughness, s, and the correlation length, L. The rms roughness is the root mean square average of the heights above or below a mean reference line. The correlation length gives a measure of how far two points must be separated along the mean reference line before their heights are considered uncorrelated [9]. Research shows that the impact on the scattering characteristics depends on the roughness relative to the wavelength: the roughness of the object is an important factor in determining the scattering characteristics of laser radar in the infrared band, while in the microwave band its effect is very small. The position of terahertz waves in the electromagnetic spectrum means that the effect of surface roughness is transitional in this band.
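The paper does not state how its rough surfaces are generated; as one common choice, the following C# sketch (an assumption-laden illustration, not the authors' procedure) produces a 1-D random rough profile with prescribed rms height s and correlation length L by smoothing white Gaussian noise with a Gaussian kernel of width about L and rescaling to the target rms.

using System;

// Sketch: generate a 1-D Gaussian-correlated rough profile with rms height s and correlation length L.
class RoughProfile
{
    static double[] Generate(int n, double dx, double s, double L, Random rng)
    {
        // Uncorrelated Gaussian heights (Box-Muller).
        var noise = new double[n];
        for (int i = 0; i < n; i++)
            noise[i] = Math.Sqrt(-2 * Math.Log(1 - rng.NextDouble())) *
                       Math.Cos(2 * Math.PI * rng.NextDouble());

        // Smooth with a Gaussian kernel so the correlation length is about L.
        int half = (int)Math.Ceiling(3 * L / dx);
        var h = new double[n];
        for (int i = 0; i < n; i++)
        {
            double sum = 0, wsum = 0;
            for (int k = -half; k <= half; k++)
            {
                int j = Math.Clamp(i + k, 0, n - 1);
                double w = Math.Exp(-(k * dx) * (k * dx) / (L * L));
                sum += w * noise[j];
                wsum += w;
            }
            h[i] = sum / wsum;
        }

        // Rescale to zero mean and rms height s.
        double mean = 0; foreach (var v in h) mean += v; mean /= n;
        double rms = 0; for (int i = 0; i < n; i++) { h[i] -= mean; rms += h[i] * h[i]; }
        rms = Math.Sqrt(rms / n);
        for (int i = 0; i < n; i++) h[i] *= s / rms;
        return h;
    }

    static void Main()
    {
        double lambda = 3e8 / 340e9;   // wavelength at 340 GHz
        var h = Generate(512, lambda / 20, lambda, 0.1 * lambda, new Random(1));
        Console.WriteLine($"generated {h.Length} samples, first height = {h[0]:E3} m");
    }
}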

3 Simulation results and analysis

3.1 The validation of the sliding-type scattering model

In order to validate the sliding-type scattering model in section 2, detailed numerical


simulations and analysis are presented in this paper. The simulation target is a
precession cone which has a height of 0.08 m and a radius of 0.013 m. The distance
change curves of the cone-rear scattering centers on the target based on the ideal
scattering model and the sliding-type scattering model are shown in Figure 2. The
time-frequency distributions based on the ideal scattering model and the sliding-type
scattering model are shown in Figure 3 and Figure 4 respectively while the CST results
are shown in Figure 5.
Figure 2. The distance change curves of the cone-rear scattering centers: (a) the distance under the ideal scattering model; (b) the distance under the sliding-type scattering model

Figure 3. The time-frequency distributions of the cone-rear scattering centers based on the ideal scattering model: (a) the result with carrier frequency 110 GHz; (b) the result with carrier frequency 340 GHz
Figure 4. The time-frequency distributions of the cone-rear scattering centers based on the sliding-type scattering model: (a) the result with carrier frequency 110 GHz; (b) the result with carrier frequency 340 GHz
Figure 5. The time-frequency distributions of the cone-rear scattering centers based on CST data: (a) the result with carrier frequency 110 GHz; (b) the result with carrier frequency 340 GHz

According to the distance change curves in Figure 2a and the time-frequency distributions based on the ideal scattering model in Figure 3, the micro-Doppler based on the ideal scattering model is obviously sinusoidally modulated; however, according to Figure 2b and Figure 4, it is no longer sinusoidally modulated under the sliding-type scattering model. The electromagnetic simulation results shown in Figure 5 confirm the sliding-type scattering model. Accordingly, taking the sliding-type scattering model into consideration agrees better with actual conditions. Furthermore, it is easy to see from Equation (6) that, under the same conditions, the higher the carrier frequency, the more obvious the non-ideal scattering phenomenon. Consequently, the sliding-type scattering phenomenon in the terahertz band is more obvious than in the microwave band, which brings difficulties to parameter estimation.
At present, there are no parameter estimation algorithms based on the sliding-type scattering model. However, every factor leading to distortion of the time-frequency distribution is known, even though the locations of the scattering centers slide on the target. In other words, the analytical expressions of the distorted micro-Doppler curves can be obtained, and several parameters in these expressions are associated with the micro-motion of the target. The generalized Hough transform is an effective curve detection algorithm in image processing and is widely used in image recognition, target detection, feature extraction, etc. Its basic idea is to search for and match the parameters of the target curves and map them into the parameter space. Accordingly, the generalized Hough transform may be considered for micro-motion parameter estimation based on the sliding-type scattering model.

3.2 The effect of surface roughness

To verify the effect of surface roughness on the scattering characteristics, we built a series of cones with different surface roughness parameters (see Table 1; λ is the wavelength of the signal) in Matlab and CST, and obtained the scattering data at 110 GHz and 340 GHz. An example of the rough plane constituting the rough surface models is shown in Figure 6.

Table 1. Parameters of the rough surface cones

No. 1 2 3 4

s 5λ 2λ λ 0.5 λ
L 0.01 λ 0.05 λ 0.1 λ 0.2 λ

Figure 6. The rough plane (s = λ, L = 0.1λ)

The time-frequency distributions of the rough-surface cones are shown in Figure 7. The results show that the sliding-type scattering model is valid when the roughness is relatively small, but it gradually becomes inappropriate as the surface roughness increases; in the time-frequency distributions, the Doppler curves gradually turn into block-like areas, which brings new difficulties to feature extraction. However, the periodicity of the time-frequency distributions is not damaged by the surface roughness, so period estimation algorithms are still valid. For the other parameters, new estimation algorithms that take the surface roughness into account are urgently required.
Figure 7. The time-frequency distributions of the rough surface cones: (a) result of cone 1; (b) result of cone 2; (c) result of cone 3; (d) result of cone 4

4 Conclusions

The relatively short wavelengths in the terahertz band give terahertz radar the advantage of high Doppler resolution, but they also make some factors that are negligible in the microwave band much more significant and bring new problems to parameter estimation algorithms. Theoretical analysis and simulation results show that the cone-top scattering center of the precession cone satisfies the ideal scattering model, while the cone-rear scattering centers slide with the relative position of the line of sight and the target, and the relatively high carrier frequencies in the terahertz band make this sliding effect more significant. The sliding of the scattering centers seriously distorts the sinusoidally modulated time-frequency curves, and there is no efficient parameter estimation method for this case yet. In addition, the relatively short wavelengths also make the influence of surface roughness more obvious and turn curves into blocks in the time-frequency distributions. This means that many targets that can be treated as smooth in the microwave band are no longer smooth in the terahertz band. These are urgent and important problems that micro-motion parameter estimation in the terahertz band must face and try to solve.

References
[1] P. H. Siegel, “Terahertz technology in biology and medicine,” IEEE Transactions on Microwave
Theory and Techniques. 2004; 52(10): 2438-2447.
[2] P. H. Siegel, “Terahertz technology and applications,” Asia Pacific Microwave Conference, Kyoto, Japan, 2002.
[3] V. C. Chen, F. Li. S. S. Ho, H. Wechsler, “Analysis of micro-Doppler signatures,” Radar, Sonar,
and Navigation, IEEE Proceedings. 2003; 150(4): 271-276.
[4] J. Liu, F. Zhao, Y. Zhang, X. Ai, “Micro-Doppler of non-ideal scattering centers,” International
Radar Conference. 2013.
[5] QL. Ma, WH. Lu, CQ. Feng, J. Zhang, Z. Wang, “Precession characteristics analysis of ballistic target based on non-ideal scattering centers,” Advanced Materials Research. 2012, 580: 329-333.
[6] J. Li, Y. Pi, X. Yang, “Micro-doppler signature feature analysis in terahertz band,” Journal of
Infrared, Millimeter, and Terahertz Waves. 2010; 31(3): 319-328.
[7] Z. Xu, J. Tu, J. Li, “Research on micro-feature extraction algorithm of target based on terahertz
radar,” Journal on Wireless Communications and Networking. 2013; 2013(1): 1-9.
[8] G. Jin, X. Gao, X. Li, Y. Chen, “Inverse synthetic aperture radar echo decomposition of ballistic
target based on chirplet,” Journal of Electronics & Information Technology. 2010; 32(10):
2353-2358..
[9] DA. Digiovanni, AJ. Gatesman, RH. Giles, WE. Nixon, “Backscattering of ground terrain and
building materials at millimeter-wave and terahertz frequencies,” The International Society for
Optical Engineering. 2013.
Chun-yu CHENG, Meng-lin SHENG, Zong-min YU, Wen-xuan ZHANG,
An-qi LI, Kai-yu WANG*

Application Development of 3D Gesture Recognition
and Tracking Based on the Intel Real Sense
Technology Combining with Unity3D and WPF
Abstract: Intel real sense technology makes the man-machine interaction easier. In
this paper, the Intel real sense technology and the F200 camera are adopted. API and
some functions in Intel SDK are used. In addition, they are tentatively combined with
WPF application framework and Unity3D on traditional PC. A 3D gesture recognition
and tracking application is realized with C # programming language. This paper is a
meaningful attempt to combine Intel real sense technology with some other platforms
except Intel SDK. It has a guiding significance for the popularization and flexible
application of the Intel real sense technology.

Keywords: Intel real sense technology; 3D gesture recognition; windows presentation


foundation (WPF)

1 Introduction

3D gesture recognition and tracking is an important application of the real sense


technology [1,2].
Current research on 3D gesture recognition focuses on algorithm level [3-6].
However, the research on the perfect fusion of the real sense technology and various
experimental platforms on traditional PC is rare. This kind of fusion can make the
application development easier. And it can make the man-machine interaction more
friendly.
WPF is the next-generation display system that Microsoft promotes vigorously. As a subset of the .NET framework, it is easy for most .NET developers to learn quickly, and it is widely applicable [7-8]. The Unity platform is a comprehensive development tool for constructing three-dimensional models; it has a large user community in 3D animation, 3D modeling, game development and other fields [9-11]. At the same time, Intel real sense technology has good support for Unity, and Unity3D interoperates well with the Visual Studio platform.

*Corresponding author: Kai-yu WANG, College of Electronic Science and Technology, Dalian University
of Technology, Dalian, China, E-mail: wkaiyu@dlut.edu.cn
Chun-yu CHENG, Meng-lin SHENG, Zong-min YU, Wen-xuan ZHANG, An-qi LI, College of Electronic
Science and Technology, Dalian University of Technology, Dalian, China

Based on the above analysis, this paper combines Microsoft's WPF application framework and the Unity development platform with Intel real sense technology, and a new real sense application is realized based on the gesture recognition and tracking module. Three visualization projects are realized: 1) a 3D gesture recognition system based on the WPF framework that responds differently to different gestures; 2) a simple 3D visual gesture tracking system based on the Unity3D platform; and 3) a final system in which users use their hands to control 3D models to hit balls, which is a fusion of 1) and 2).
2 The design idea and the overall design framework

Figure 1 shows the overall design diagram and Figure 2 shows the structure of the system. First, a new project integrating the Intel real sense SDK examples is created, and the SDK API layer is invoked to realize a simple gesture recognition application that is not constrained by the samples shipped with the SDK itself. Secondly, a 3D scene and three-dimensional objects are created on the Unity development platform. Finally, Intel real sense technology is connected with the objects to realize movement control using gestures. A ping-pong interactive application is realized: users control the two rackets in the scene, and the ball bounces back and forth; once it touches something it travels to the opposite side, and if one racket misses the ball the other side gets a point.

Figure 1. The overall design diagram: the WPF branch (create the WPF project, add the SDK library reference, create the user interface, create utility classes) and the Unity branch (create the Unity project, connect Visual Studio with Unity, create objects and scenes, configure the physical properties, write the script files, import the real sense SDK reference) are integrated with WPF, Unity3D and Intel real sense technology


Figure 2. The structure of the system: the application layer (C# interface and Unity interface) is built on the SDK interfaces, which are supported by the SDK core (module management, algorithm and I/O modules, pipeline execution, and interoperability)

3 Software development environment

3.1 Intel F200 Real Sense Camera

The basic principle of the Intel F200 real sense camera is structured light. The structured light emitted by an infrared projector is received by the infrared sensor after reflection from the object. Because the distance between the infrared equipment and different points on the surface of the object differs, the position and shape of the “structured light” caught by the receiver change, and from these changes the spatial information of the object can be calculated [12].

3.2 Software Development Tools

3.2.1 Intel real sense SDK


Intel real sense SDK is an Intel official development suite. It contains a variety of
functions such as hand tracking and recognition, facial analysis, speech recognition,
augmented reality, background segmentation and so on. Intel Real Sense SDK is the
only bridge from hardware layer to software layer. All the interface functions in this
paper are from the SDK.

3.2.2 Microsoft visual studio 2015


Visual Studio is the most popular integrated development environment of Windows
applications. The source code in the project routine from Intel Real Sense SDK can be
changed with the Microsoft Visual Studio development platform to obtain the data
flow needed. The design based on Windows Presentation Foundation (WPF) is also
based on Microsoft Visual Studio development platform.

3.2.3 Unity3D
Unity3D developed by Unity Technologies is a multi-platform integrated game
development tool. It allows players to create such as 3D video games, visual
architectures, real-time 3D animations and other types of interactive content easily.
It is a fully integrated professional game engine. It supports different programming
languages.

4 The design and implementation of the 3D gesture recognition application based on WPF and Intel real sense technology

Most of the SDK C# example applications are based on the Windows Forms user interface model; this article explores another way of developing the user interface with WPF. The focus is the fusion of Intel real sense technology and the WPF application framework, in which creating utility classes is the most important step.

4.1 The Intel Real Sense SDK API Initialization

In the Intel real sense SDK framework, all cameras and data streams are managed by PXCMCaptureManager. The interface classes PXCMSenseManager and PXCMCaptureManager are used in this paper. Figure 3 shows the flow of invoking the SDK and the camera to initialize the environment.

Figure 3. The flow of invoking the SDK and the camera to initialize the environment: initialize the multimodal pipeline and create a SenseManager instance; activate the required function modules in the corresponding pipeline; use QueryHand to create and return the hand module instance; create a hand (gesture) configuration instance and enable the corresponding functions
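The initialization flow of Figure 3 can be sketched in C# as follows. The class and method names follow the SDK samples (PXCMSenseManager, EnableHand, QueryHand, CreateActiveConfiguration), but this is only an outline that assumes the libpxcclr.cs.dll reference of Section 4.2; exact signatures should be checked against the installed SDK version, and error handling is omitted for brevity.

using System;

// Sketch of the pipeline initialization and a simple "wave" gesture check.
class HandPipelineSketch
{
    static void Main()
    {
        PXCMSenseManager sm = PXCMSenseManager.CreateInstance();   // multimodal pipeline
        sm.EnableHand();                                           // activate the hand module
        sm.Init();

        PXCMHandModule handModule = sm.QueryHand();                // hand module instance
        PXCMHandConfiguration config = handModule.CreateActiveConfiguration();
        config.EnableGesture("wave");                              // gesture used by the demo
        config.ApplyChanges();
        PXCMHandData handData = handModule.CreateOutput();

        while (sm.AcquireFrame(true) >= pxcmStatus.PXCM_STATUS_NO_ERROR)
        {
            handData.Update();
            PXCMHandData.GestureData gesture;
            if (handData.IsGestureFired("wave", out gesture))
                Console.WriteLine("wave detected - show the label");
            sm.ReleaseFrame();
        }
        sm.Dispose();
    }
}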

4.2 Add the SDK Reference

Two dynamic link libraries (DLLs) are needed when an Intel real sense application is created in C#:
–– libpxcclr.cs.dll – the managed C# interface DLL
–– libpxccpp2c.dll – the unmanaged C++ P/Invoke DLL

The Intel real sense SDK reference manual gives two ways to add the required SDK libraries (DLLs) to a project. One is to create references to the necessary external DLLs installed with the Intel real sense SDK at the system level; the other is to copy the required DLLs into the project and create local references. This article uses the first approach. Its advantage is that the project automatically picks up the latest DLLs when the SDK is upgraded to a new version; its disadvantage is that if the SDK installation path changes, or a different target platform is built, the references may need to be modified.

4.3 Create the User Interface

This application displays the video stream from the color camera and performs gesture recognition. It uses three kinds of WPF controls:
–– an Image control used for hosting the video
–– a Label that is shown on the screen when the user waves a hand in front of the camera
–– the StackPanel container holding the other controls

4.4 Create a Utility Class

First, a ConvertBitmap.cs utility class is created to convert the color video stream into bitmaps. This can be done by right-clicking the project name in the Solution Explorer, selecting “Add Item”, selecting “Class” and clicking OK. Gesture detection, obtaining color images from the camera and related logic are then implemented in the MainWindow.xaml.cs class.

5 The design and implementation of the gesture tracking application based on Unity3D and Intel real sense technology

It is very important to give the laws of physics to objects in a game, since this enhances the playability and authenticity of the game. Similar to the development of the 3D gesture recognition application with the WPF framework, the development of the gesture tracking application on the Unity platform can be understood as the basic development process of a three-dimensional scene application on Unity. “Writing the script file” is the most critical step of the entire application, and it is where Intel real sense technology and Unity are fused.

5.1 Create 3D Objects

In a table tennis game, some essential game objects such as the rackets, the table tennis ball, the rebound walls, the console table and so on are needed. In this article, however, the objects are not modeled entirely according to real-life table tennis; the model is simplified and only the rackets, the ball and the walls are kept. This is because the emphasis of the Unity3D development lies in how to use the Intel real sense SDK API to design a new and interesting game.

Figure 4. The Unity 3D scene view

5.2 The Perfect Fusion of Visual Studio, Unity3D and Intel Real Sense Technology

The most important task is to edit the script file of the “racket” object. The Intel
RealSense camera captures the hand data, which is expressed in detail in 3D
coordinates and passed to the Y component of the velocity vector of the “racket”.
This means that the relevant interface classes of the Intel RealSense SDK and the
camera need to be initialized and object instances need to be created. This article
then creates an instance of the HandModel interface class, uses it to invoke a member
function, and outputs the hand model data to objects of the HandData interface class.
By invoking the HandData member functions, the palm data of the hand can be
extracted and used. This is the basic process of the core module that fuses Unity and
the Intel RealSense technology; a minimal sketch of such a racket script is given below.
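The member functions themselves are not listed in the paper, so this sketch only
assumes the PXCM-prefixed managed classes that the RealSense Unity Toolkit plug-in
exposes (the paper's HandModel and HandData correspond to the SDK's hand module
and hand data interfaces); the mapping from palm height to racket velocity, the class
name and all numeric factors are illustrative.

using UnityEngine;

// Minimal sketch of a "racket" script (assumed names and factors). The racket
// object is assumed to carry a Rigidbody component.
public class RacketController : MonoBehaviour
{
    public float heightScale = 10f;   // illustrative: camera metres -> scene units
    public float speed = 5f;          // illustrative: how fast the racket follows

    private PXCMSenseManager senseManager;
    private PXCMHandData handData;

    void Start()
    {
        // Initialise the camera pipeline and the hand-tracking module
        senseManager = PXCMSenseManager.CreateInstance();
        senseManager.EnableHand();
        handData = senseManager.QueryHand().CreateOutput();
        senseManager.Init();
    }

    void Update()
    {
        // Grab the next frame without blocking on all streams
        if (senseManager.AcquireFrame(false) < pxcmStatus.PXCM_STATUS_NO_ERROR)
            return;

        handData.Update();
        PXCMHandData.IHand hand;
        if (handData.QueryHandData(
                PXCMHandData.AccessOrderType.ACCESS_ORDER_NEAR_TO_FAR, 0, out hand)
            >= pxcmStatus.PXCM_STATUS_NO_ERROR)
        {
            // Palm centre in camera coordinates; its height drives the
            // Y component of the racket's velocity, as described above
            PXCMPoint3DF32 palm = hand.QueryMassCenterWorld();
            float targetY = palm.y * heightScale;
            var rb = GetComponent<Rigidbody>();
            rb.velocity = new Vector3(0f, (targetY - transform.position.y) * speed, 0f);
        }
        senseManager.ReleaseFrame();
    }

    void OnDisable()
    {
        // Release the SDK resources when the script is disabled or destroyed
        if (handData != null) handData.Dispose();
        if (senseManager != null) senseManager.Dispose();
    }
}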

5.2.1 The Fusion of Visual Studio and Unity3D

The project code is written in the C# programming language. The default cross-
platform IDE provided by Unity is MonoDevelop, but working with Unity and
Microsoft Visual Studio together is often preferred. To set this up, Edit > Preferences
is selected from the drop-down menu. After the Unity Preferences dialog opens,
External Tools is selected on the left, MonoDevelop is clicked in the combo box, and
the Browse button is used to locate the Visual Studio executable.

5.2.2 Intel RealSense SDK with Unity

Unity3D supports third-party plug-ins, and such a plug-in is provided for the Intel
RealSense SDK. To integrate the Intel RealSense SDK, its DLL files must be added to
the project. The external package and function libraries are imported into Unity
through the UnityToolkit.unitypackage file in the Intel RealSense SDK installation
directory; only the plug-in and the plug-in management directories need to be
selected in the pop-up dialog window. For 64-bit game development, the DLL files
used to have to be replaced manually; in the latest version of the Intel RealSense SDK,
however, only the x86_64 plug-in management directory needs to be imported.

5.3 The Final System Effect

The final effect of the system is shown in Figure 5. Users control the two rackets by
gestures so that the ball is returned to the opposite side after it touches a racket. If
one player misses the ball, the other player gets one point; a hypothetical scoring
sketch is given below. The successful development of the 3D gesture recognition and
tracking application confirms that the Intel RealSense technology can collaborate
and fuse with traditional PC programs.
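The paper does not show how scoring is implemented. One hypothetical way, sketched
below under the assumption that a trigger volume sits behind each racket, is to award
the opposing player a point whenever the ball reaches that volume; the tag name and
the reset position are assumptions.

using UnityEngine;

// Hypothetical scoring sketch: attach one instance to a trigger volume placed
// behind each racket.
public class MissDetector : MonoBehaviour
{
    public int opponentScore;                       // points for the other player

    void OnTriggerEnter(Collider other)
    {
        if (!other.CompareTag("Ball")) return;

        opponentScore++;                            // the ball got past this racket
        other.transform.position = Vector3.zero;    // put the ball back in play
        if (other.attachedRigidbody != null)
            other.attachedRigidbody.velocity = Vector3.zero;
    }
}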

Figure 5. Final system effect



6 Conclusion

This paper is based on the Intel RealSense technology and the F200 camera. The Intel
RealSense SDK development platform is successfully combined with the WPF
application framework and with Unity3D, and the C# programming language is used
to realize the 3D gesture recognition and tracking application “Table Tennis
Interactive”. Every detail, from invoking the SDK API to connecting it with the
application layer, is worked through in practice. It is hoped that this attempt to
combine the Intel RealSense technology with other development platforms can bring
inspiration to the readers.

Acknowledgment: This work is supported by the National Natural Science
Foundation of China (No. 61003175), the Fundamental Research Funds for the Central
Universities of China, and the Liaoning Provincial Natural Science Foundation of
China (No. 201601523).

