
FUZZY SYSTEMS AND DATA MINING II

Frontiers in Artificial Intelligence and Applications
The book series Frontiers in Artificial Intelligence and Applications (FAIA) covers all aspects of
theoretical and applied Artificial Intelligence research in the form of monographs, doctoral
dissertations, textbooks, handbooks and proceedings volumes.
The FAIA series contains several sub-series, including ‘Information Modelling and Knowledge
Bases’ and ‘Knowledge-Based Intelligent Engineering Systems’. It also includes the biennial
European Conference on Artificial Intelligence (ECAI) proceedings volumes, and other EurAI
(European Association for Artificial Intelligence, formerly ECCAI) sponsored publications. An
editorial panel of internationally well-known scholars is appointed to provide a high quality
selection.

Series Editors:
J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras,
R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong

Volume 293
Recently published in this series

Vol. 292. H. Jaakkola, B. Thalheim, Y. Kiyoki and N. Yoshida (Eds.), Information Modelling
and Knowledge Bases XXVIII
Vol. 291. G. Arnicans, V. Arnicane, J. Borzovs and L. Niedrite (Eds.), Databases and
Information Systems IX – Selected Papers from the Twelfth International Baltic
Conference, DB&IS 2016
Vol. 290. J. Seibt, M. Nørskov and S. Schack Andersen (Eds.), What Social Robots Can and
Should Do – Proceedings of Robophilosophy 2016 / TRANSOR 2016
Vol. 289. I. Skadiņa and R. Rozis (Eds.), Human Language Technologies – The Baltic
Perspective – Proceedings of the Seventh International Conference Baltic HLT 2016
Vol. 288. À. Nebot, X. Binefa and R. López de Mántaras (Eds.), Artificial Intelligence Research
and Development – Proceedings of the 19th International Conference of the Catalan
Association for Artificial Intelligence, Barcelona, Catalonia, Spain, October 19–21,
2016
Vol. 287. P. Baroni, T.F. Gordon, T. Scheffler and M. Stede (Eds.), Computational Models of
Argument – Proceedings of COMMA 2016
Vol. 286. H. Fujita and G.A. Papadopoulos (Eds.), New Trends in Software Methodologies,
Tools and Techniques – Proceedings of the Fifteenth SoMeT_16
Vol. 285. G.A. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum and
F. van Harmelen (Eds.), ECAI 2016 – 22nd European Conference on Artificial
Intelligence, 29 August–2 September 2016, The Hague, The Netherlands – Including
Prestigious Applications of Artificial Intelligence (PAIS 2016)

ISSN 0922-6389 (print)
ISSN 1879-8314 (online)
Fuzzy Systems and Data Mining II
Proceedings of FSDM 2016

Edited by
Shilei Sun
International School of Software, Wuhan University, China

Antonio J. Tallón-Ballesteros
Department of Languages and Computer Systems, University of Seville, Spain

Dragan S. Pamučar
Department of Logistics, University of Defence in Belgrade, Serbia
and
Feng Liu
International School of Software, Wuhan University, China

Amsterdam • Berlin • Washington, DC


© 2016 The authors and IOS Press.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-61499-721-4 (print)
ISBN 978-1-61499-722-1 (online)
Library of Congress Control Number: 2016958585

Publisher
IOS Press BV
Nieuwe Hemweg 6B
1013 BG Amsterdam
Netherlands
fax: +31 20 687 0019
e-mail: order@iospress.nl

For book sales in the USA and Canada:


IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
USA
Tel.: +1 703 830 6300
Fax: +1 703 830 2300
sales@iospress.com

LEGAL NOTICE
The publisher is not responsible for the use which might be made of the following information.

PRINTED IN THE NETHERLANDS



Preface
Fuzzy Systems and Data Mining (FSDM) is an annual international conference devoted
to four main groups of topics: a) fuzzy theory, algorithm and system; b) fuzzy applica-
tion; c) the interdisciplinary field of fuzzy logic and data mining; and d) data mining.
Following the great success of FSDM 2015, held in Shanghai, the second edition in the
FSDM series was held in Macau, China, where experts, researchers, academics and
participants from the industry were introduced to the latest advances in the field of
Fuzzy Sets and Data Mining. Macau was declared a UNESCO World Heritage Site in
2005 by virtue of its cultural importance. The historic centre of Macau is of particular
interest because of its mixture of traditional Chinese and Portuguese cultures. Macau
has both Cantonese (a variant of Chinese) and Portuguese as official languages.
This volume contains the papers accepted and presented at the 2nd International
Conference on Fuzzy Systems and Data Mining (FSDM 2016), held on 11–14 Decem-
ber 2016 in Macau, China. All papers have been carefully reviewed by programme
committee members and reflect the breadth and depth of the research topics which fall
within the scope of FSDM. From several hundred submissions, 81 of the most promis-
ing and FAIA mainstream-relevant contributions have been selected for inclusion in
this volume; they present original ideas, methods or results of general significance
supported by clear reasoning and compelling evidence.
FSDM 2016 was again a reference conference in the field; its programme included keynote and invited presentations as well as oral and poster contributions. The event provided a forum where more than 100 qualified, high-level researchers and experts from over 20 countries, including 4 keynote speakers, gathered to create an important platform for researchers and engineers worldwide to engage in academic communication.
We would like to thank all the keynote and invited speakers and authors for the effort they have put into preparing their contributions to the conference. We would also like to take this opportunity to express our gratitude to those people, especially the programme committee members and reviewers, who devoted their time to assessing the papers. It is
an honour to continue with the publication of these proceedings in the prestigious series
Frontiers in Artificial Intelligence and Applications (FAIA) from IOS Press. Our par-
ticular thanks also go to J. Breuker, N. Guarino, J.N. Kok, R. López de Mántaras, J. Liu,
R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong, the FAIA series editors, for support-
ing this conference.
Last but not least, I hope that all our participants have enjoyed their stay in Macau
and their time at the Macau University of Science and Technology (M.U.S.T.). We
hope you had a magnificent experience in both places.

Antonio J. Tallón-Ballesteros
University of Seville, Spain

Contents
Preface v
Antonio J. Tallón-Ballesteros

Fuzzy Control, Theory and System

Cumulative Probability Distribution Based Computational Method for High Order Fuzzy Time Series Forecasting 3
Sukhdev S. Gangwar and Sanjay Kumar
Introduction to Fuzzy Dual Mathematical Programming 11
Carlos A.N. Cosenza, Fabio Krykhtine, Walid El Moudani
and Felix A.C. Mora-Camino
Forecasting National Football League Game Outcomes Based on Fuzzy
Candlestick Patterns 22
Yu-Chia Hsu
A Fuzzy Control Based Parallel Filling Valley Equalization Circuit 28
Feng Ran, Ke-Wei Hu, Jing-Wei Zhao and Yuan Ji
Interval-Valued Hesitant Fuzzy Geometric Bonferroni Mean Aggregation
Operator 37
Xiao-Rong He, Ying-Yu Wu, De-Jian Yu, Wei Zhou and Sun Meng
A New Integrating SAW-TOPSIS Based on Interval Type-2 Fuzzy Sets
for Decision Making 45
Lazim Abdullah and C.W. Rabiatul Adawiyah C.W. Kamal
Algorithms for Finding Oscillation Period of Fuzzy Tensors 51
Ling Chen and Lin-Zhang Lu
Toward a Fuzzy Minimum Cost Flow Problem for Damageable Items
Transportation 58
Si-Chao Lu and Xi-Fu Wang
Research on the Application of Data Mining in the Field of Electronic
Commerce 65
Xia Song and Fang Huang
A Fuzzy MEBN Ontology Language Based on OWL2 71
Zhi-Yun Zheng, Zhuo-Yun Liu, Lun Li, Dun Li and Zhen-Fei Wang
State Assessment of Oil-Paper Insulation Based on Fuzzy Rough Sets 81
De-Hua He, Jin-Ding Cai, Song Xie and Qing-Mei Zeng
Finite-Time Stabilization for T-S Fuzzy Networked Systems with State
and Communication Delay 87
He-Jun Yao, Fu-Shun Yuan and Yue Qiao

A Trapezoidal Fuzzy Multiple Attribute Decision Making Based on Rough Sets 94
Zhi-Ying Lv, Ping Huang, Xian-Yong Zhang and Li-Wei Zheng
Fuzzy Rule-Based Stock Ranking Using Price Momentum and Market
Capitalization 102
Ratchata Peachavanish
Adaptive Fuzzy Sliding-Mode Control of Robot and Simulation 108
Huan Niu, Jie Yang and Jie-Ru Chi
Hesitant Bipolar Fuzzy Set and Its Application in Decision Making 115
Ying Han, Qi Luo and Sheng Chen
Chance Constrained Twin Support Vector Machine for Uncertain Pattern
Classification 121
Ben-Zhang Yang, Yi-Bin Xiao, Nan-Jing Huang and Qi-Lin Cao
Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics 131
Eunsuk Yang

Data Mining

Dynamic Itemset Mining Under Multiple Support Thresholds 141
Nourhan Abuzayed and Belgin Ergenç
Deep Learning with Large Scale Dataset for Credit Card Data Analysis 149
Ayahiko Niimi
Probabilistic Frequent Itemset Mining Algorithm over Uncertain Databases
with Sampling 159
Hai-Feng Li, Ning Zhang, Yue-Jin Zhang and Yue Wang
Priority Guaranteed and Energy Efficient Routing in Data Center Networks 167
Hu-Yin Zhang, Jing Wang, Long Qian and Jin-Cai Zhou
Yield Rate Prediction of a Dynamic Random Access Memory Manufacturing
Process Using Artificial Neural Network 173
Chun-Wei Chang and Shin-Yeu Lin
Mining Probabilistic Frequent Itemsets with Exact Methods 179
Hai-Feng Li and Yue Wang
Performance Degradation Analysis Method Using Satellite Telemetry Big Data 186
Feng Zhou, De-Chang Pi, Xu Kang and Hua-Dong Tian
A Decision Tree Model for Meta-Investment Strategy of Stock Based on Sector
Rotating 194
Li-Min He, Shao-Dong Chen, Zhen-Hua Zhang, Yong Hu
and Hong-Yi Jiang
Virtualized Security Defense System for Blurred Boundaries of Next
Generation Computing Era 208
Hyun-A. Park

Implicit Feature Identification in Chinese Reviews Based on Hybrid Rules 220
Yong Wang, Ya-Zhi Tao, Xiao-Yi Wan and Hui-Ying Cao
Characteristics Analysis and Data Mining of Uncertain Influence Based
on Power Law 226
Ke-Ming Tang, Hao Yang, Qin Liu, Chang-Ke Wang and Xin Qiu
Hazardous Chemicals Accident Prediction Based on Accident State Vector
Using Multimodal Data 232
Kang-Wei Liu, Jian-Hua Wan and Zhong-Zhi Han
Regularized Level Set for Inhomogeneity Segmentation 241
Guo-Qi Liu and Hai-Feng Li
Exploring the Non-Trivial Knowledge Implicit in Test Instance to Fully
Represent Unrestricted Bayesian Classifier 248
Mei-Hui Li and Li-Min Wang
The Factor Analysis’s Applicability on Social Indicator Research 254
Ying Xie, Yao-Hua Chen and Ling-Xi Peng
Research on Weapon-Target Allocation Based on Genetic Algorithm 260
Yan-Sheng Zhang, Zhong-Tao Qiao and Jian-Hui Jing
PMDA-Schemed EM Channel Estimator for OFDM Systems 267
Xiao-Fei Li, Di He and Xiao-Hua Chen
Soil Heavy Metal Pollution Research Based on Statistical Analysis and BP
Network 274
Wei-Wei Sun and Xing-Ping Sheng
An Improved Kernel Extreme Learning Machine for Bankruptcy Prediction 282
Ming-Jing Wang, Hui-Ling Chen, Bin-Lei Zhu, Qiang Li, Ke-Jie Wang
and Li-Ming Shen
Novel DBN Structure Learning Method Based on Maximal Information
Coefficient 290
Guo-Liang Li, Li-Ning Xing and Ying-Wu Chen
Improvement of the Histogram for Infrequent Color-Based Illustration Image
Classification 299
Akira Fujisawa, Kazuyuki Matsumoto, Minoru Yoshida and Kenji Kita
Design and Implementation of a Universal QC-LDPC Encoder 306
Qian Yi and Han Jing
Quantum Inspired Bee Colony Optimization Based Multiple Relay Selection
Scheme 312
Feng-Gang Lai, Yu-Tai Li and Zhi-Jie Shang
A Speed up Method for Collaborative Filtering with Autoencoders 321
Wen-Zhe Tang, Yi-Lei Wang, Ying-Jie Wu and Xiao-Dong Wang
Analysis of NGN-Oriented Architecture for Internet of Things 327
Wei-Dong Fang, Wei He, Zhi-Wei Gao, Lian-Hai Shan and Lu-Yang Zhao

Hypergraph Spectral Clustering via Sample Self-Representation 334
Shi-Chao Zhang, Yong-Gang Li, De-Bo Cheng and Zhen-Yun Deng
Safety Risk Early-Warning for Metro Construction Based on Factor Analysis
and BP_Adaboost Network 341
Hong-De Wang, Bai-Le Ma and Yan-Chao Zhang
The Method Study on Tax Inspection Cases-Choice: Improved Support Vector
Machine 347
Jing-Huai She and Jing Zhuo
Development of the System with Component for the Numerical Calculation
and Visualization of Non-Stationary Waves Propagation in Solids 353
Zhanar Akhmetova, Serik Zhuzbayev, Seilkhan Boranbayev
and Bakytbek Sarsenov
Infrared Image Recognition of Bushing Type Cable Terminal Based on Radon
and Fourier-Mellin Transform and BP Neural Network 360
Hai-Qing Niu, Wen-Jian Zheng, Huang Zhang, Jia Xu and Ju-Zhuo Wu
Face Recognition with Single Sample Image per Person Based on Residual
Space 367
Zhi-Bo Guo, Yun-Yang Yan, Yang Wang and Han-Yu Yuan
Cloud Adaptive Parallel Simulated Annealing Genetic Algorithm
in the Application of Personnel Scheduling in National Geographic
Conditions Monitoring 377
Juan Du, Xu Zhou, Shu Tao and Qian Liu
Quality Prediction in Manufacturing Process Using a PCA-BPNN Model 390
Hong Zhou and Kun-Ming Yu
The Study of an Improved Intelligent Student Advising System 397
Xiaosong Li
An Enhanced Identity Authentication Security Access Control Model Based
on 802.1x Protocol 407
Han-Ying Chen and Xiao-Li Liu
Recommending Entities for E-R Model by Ontology Reasoning Techniques 414
Xiao-Xing Xu, Dan-Tong Ouyang, Jie Liu and Yu-Xin Ye
V-Sync: A Velocity-Based Time Synchronization for Multi-Hop Underwater
Mobile Sensor Networks 420
Meng-Na Zhang, Hai-Yan Wang, Jing-Jie Gao and Xiao-Hong Shen
An Electricity Load Forecasting Method Based on Association Rule Analysis
Attribute Reduction in Smart Grid 429
Huan Liu and Ying-Hua Han
The Improved Projection Pursuit Evaluation Model Based on Depso Algorithm 438
Bin Zhu and Wei-Dong Jin
HRV-Based Stress Recognizing by Random Forest 444
Gang Zheng, Yan-Hui Chen and Min Dai

Ricci Flow for Optimization Routing in WSN 452
Ke-Ming Tang, Hao Yang, Xin Qiu and Lv-Qing Wu
Research on the Application-Driven Architecture in Internet of Things 458
Wei-Dong Fang, Wei He, Wei Chen, Lian-Hai Shan and Feng-Ying Ma
A GOP-Level Bitrate Clustering Recognition Algorithm for Wireless Video
Transmission 466
Wen-Juan Shi, Song Li, Yan-Jing Sun, Qi Cao and Hai-Wei Zuo
The Analysis of Cognitive Image and Tourism Experience in Taiwan’s Old
Streets Based on a Hybrid MCDM Approach 476
Chung-Ling Kuo and Chia-Li Lin
A Collaborative Filtering Recommendation Model Based on Fusion
of Correlation-Weighted and Item Optimal-Weighted 487
Shi-Qi Wen, Cheng Wang, Jian-Ying Wang, Guo-Qi Zheng,
Hai-Xiao Chi and Ji-Feng Liu
A Cayley Theorem for Regular Double Stone Algebras 501
Cong-Wen Luo
ARII-eL: An Adaptive, Informal and Interactive eLearning Ontology Network 507
Daniel Burgos
Early Prediction of System Faults 519
You Li and Yu-Ming Lin
QoS Aware Hierarchical Routing Protocol Based on Signal to Interference
plus Noise Ratio and Link Duration for Mobile Ad Hoc Networks 525
Yan-Ling Wu, Ming Li and Guo-Bin Zhang
The Design and Implementation of Meteorological Microblog Public Opinion
Hot Topic Extraction System 535
Fang Ren, Lin Chen and Cheng-Rui Yang
Modeling and Evaluating Intelligent Real-Time Route Planning and Carpooling
System with Performance Evaluation Process Algebra 542
Jie Ding, Rui Wang and Xiao Chen
Multimode Theory Analysis of the Coupled Microstrip Resonator Structure 549
Ying Zhao, Ai-Hua Zhang and Ming-Xiao Wang
A Method for Woodcut Rendering from Images 555
Hong-Qiang Zhang, Shu-Wen Wang, Cong Ma and Bing-Kun Pi
Research on a Non-Rigid 3D Shape Retrieval Method Based on Global
and Partial Description 562
Tian-Wen Yuan, Yi-Nan Lu, Zhen-Kun Shi and Zhe Zhang
Virtual Machine Relocating with Combination of Energy and Performance
Awareness 570
Xiang Li, Ning-Jiang Chen, You-Chang Xu and Rangsarit Pesayanavin
Network Evolution via Preference and Coordination Game 579
En-Ming Dong, Jian-Ping Li and Zheng Xie

Sensor Management Strategy with Probabilistic Sensing Model for Collaborative Target Tracking in Wireless Sensor Network 585
Yong-Jian Yang, Xiao-Guang Fan, Sheng-Da Wang, Zhen-Fu Zhuo,
Jian Ma and Biao Wang
Generalized Hybrid Carrier Modulation System Based M-WFRFT with Partial
FFT Demodulation over Doubly Selective Channels 592
Yong Li, Zhi-Qun Song and Xue-Jun Sha
On the Benefits of Network Coding for Unicast Application in Opportunistic
Traffic Offloading 598
Jia-Ke Jiao, Da-Ru Pan, Ke Lv and Li-Fen Sun
A Geometric Graph Model of Citation Networks with Linearly Growing
Node-Increment 605
Qi Liu, Zheng Xie, En-Ming Dong and Jian-Ping Li
Complex System in Scientific Knowledge 612
Zong-Lin Xie, Zheng Xie, Jian-Ping Li and Xiao-Jun Duan
Two-Wavelength Transport of Intensity Equation for Phase Unwrapping 618
Cheng Zhang, Hong Cheng, Chuan Shen, Fen Zhang, Wen-Xia Bao,
Sui Wei, Chao Han, Jie Fang and Yun Xia
A Study of Filtering Method for Accurate Indoor Positioning System Using
Bluetooth Low Energy Beacons 624
Young Hyun Jin, Wonseob Jang, Bin Li, Soo Jeong Kwon,
Sung Hoon Lim and Andy Kyung-yong Yoon

Subject Index 633
Author Index 637
Fuzzy Control, Theory and System
Fuzzy Systems and Data Mining II 3
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-3

Cumulative Probability Distribution Based Computational Method for High Order Fuzzy Time Series Forecasting
Sukhdev S. GANGWAR and Sanjay KUMAR1
Department of Mathematics, Statistics & Computer Science, G. B. Pant University of
Agriculture & Technology, Pantnagar-263145, Uttarakhand, India

Abstract. Deciding the interval length, computing complicated fuzzy logical relations, and finding an apposite defuzzification process have been important areas of research in fuzzy time series forecasting since its inception. In the present study, a cumulative probability distribution based computational scheme with a discretized universe of discourse is proposed for fuzzy time series forecasting. The cumulative probability distribution decides the length of the intervals using the characteristics of the data distribution, and the proposed computational algorithm minimizes the calculation of complex fuzzy logical relations and the search for a suitable defuzzification method. To verify the enhancement in forecasting accuracy, the developed model is applied to the benchmark problem of forecasting the historical student enrollments of the University of Alabama. The accuracy of the forecasted enrollments of the developed model is also compared with various other methods using different error measures. Coefficients of correlation and determination are used to determine the strength of association between forecasted and actual enrollments.

Keywords. Fuzzy time series, probability distribution, computational method, forecasting

Introduction

Multiple regression based parametric models (autoregression, moving-average, ARMA, ARIMA, etc.) are widely used statistical techniques for forecasting. An important limitation of these parametric forecasting models is that they cannot tackle the uncertainty in time series data that arises from imprecision and vagueness. Song and Chissom [1, 2, 3] integrated the fuzzy set theory of Zadeh [4] with time series forecasting and developed several fuzzy time series (FTS) forecasting models to handle the uncertainty in historical time series data when forecasting the enrollments of the University of Alabama. Chen [5] and Hwang et al. [6] used simple arithmetic operations and the variations of the enrollments between the current and previous year to develop FTS forecasting methods that are more efficient than the ones presented by Song and Chissom [1, 2, 3]. Own and Yu [7] proposed a high order forecasting model to address the limitation of the model developed by Chen [5].

1 Corresponding Author: Sanjay KUMAR, Department of Mathematics, Statistics & Computer Science, G. B. Pant University of Agriculture & Technology, Pantnagar-263145, Uttarakhand, India; E-mail: skruhela@hotmail.com.

Wong et al. [8] utilized the window size of FTS to propose a time variant forecasting model. The performance of this model was tested using the enrollment time series of the University of Alabama and TAIEX data. Chi et al. [9] used the K-means clustering technique to discretize the universe of discourse and proposed an enhanced fuzzy time series forecasting model. Chen and Tanuwijaya [10] presented new methods to handle forecasting problems using high-order fuzzy logical relationships and automatic clustering techniques.
Cheng et al. [11] discretized the universe of discourse (UD) using the minimum entropy principle and used trapezoidal membership functions to enhance accuracy in FTS forecasting. Huarng and Yu [12] used a ratio-based method to identify the length of intervals in fuzzy time series forecasting, which was further enhanced by Yolcu et al. [13] using a single-variable constrained optimization technique. Teoh et al. [14] used the cumulative probability distribution approach (CPDA) with rough set rule induction and proposed a hybrid FTS model. Su et al. [15] used MEPA, CPDA and a rough set algorithm to develop a new model for FTS forecasting.
Fuzzy relational equations and a suitable defuzzification process are the pivotal components in any fuzzy time series forecasting method. To minimize the time spent generating fuzzy relational equations using the complex min-max composition operation, and to eliminate the search for a suitable defuzzification process, Singh [16, 17, 18] proposed various computational methods using difference parameters as the fuzzy relation for FTS forecasting. Joshi and Kumar [19] also presented a computational method using the third order difference as the fuzzy relation. To enhance the performance of the computational FTS forecasting method, Gangwar and Kumar [20] developed a computational algorithm using high order difference parameters and implemented it on a discretized universe of discourse. Intuitionistic fuzzy sets (IFS) were used with CPDA by Gangwar and Kumar [21] to introduce hesitation in FTS forecasting with unequal intervals.
The UD in all these computational methods was partitioned into intervals of equal length. In some cases, the discretization of the universe of discourse into equal length intervals may not give a correct classification of the time series data. The motivation and intention of this study is to present a computational method using high order difference parameters as the fuzzy relation with a discretized UD in which the lengths of the intervals are optimized using CPDA. The proposed algorithm eliminates the time spent constructing relational equations through tedious min-max composition operations and the defuzzification process. The developed FTS forecasting method has been applied to the benchmark problem of forecasting the student enrollment data of the University of Alabama and compared with other recent methods proposed by various researchers.

1. Some Basic Concepts of Fuzzy Time Series

Let $U = \{u_1, u_2, u_3, \ldots, u_n\}$ be a universe of discourse. A fuzzy set $\tilde{A}_i$ of $U$ is defined as follows:

$\tilde{A}_i = \mu_{\tilde{A}_i}(u_1)/u_1 + \mu_{\tilde{A}_i}(u_2)/u_2 + \mu_{\tilde{A}_i}(u_3)/u_3 + \cdots + \mu_{\tilde{A}_i}(u_n)/u_n$

Here $\mu_{\tilde{A}_i}$ is the membership function of the fuzzy set $\tilde{A}_i$ and assigns to each element of $U$ a value in $[0, 1]$; $\mu_{\tilde{A}_i}(u_k)$ $(1 \le k \le n)$ is the grade of membership of $u_k$ in $\tilde{A}_i$. Suppose fuzzy sets $f_i(t)$ $(i = 1, 2, \ldots)$ are defined on the universe of discourse $Y(t)$. If $F(t)$ is the collection of the $f_i(t)$, then $F(t)$ is known as a fuzzy time series on $Y(t)$ [1]. $F(t)$ and $Y(t)$ depend upon $t$ and hence both are functions of time. If only $F(t-1)$ causes $F(t)$, i.e. $F(t-1) \rightarrow F(t)$, then the relationship is denoted by the fuzzy relational equation $F(t) = F(t-1) \circ R(t, t-1)$ and is called the first-order model of $F(t)$ ("$\circ$" is the max-min composition operator). If more than one fuzzy set $F(t-n), F(t-n+1), \ldots, F(t-1)$ causes $F(t)$, then the relationship is called an $n$th order fuzzy time series model [1, 2].

2. Proposed FTS Method and Computational Algorithm

The proposed FTS method uses CPDA to discretize the UD. It uses the ratio formula [20] for determining the number of partitions. The order of the difference parameters used in the forecast is computed as follows:
• For the year 1973 enrollment forecast, the proposed computational method uses the second order difference parameter $D_2 = |E_2 - E_1|$.
• For the year 1974 enrollment forecast, the proposed computational method uses the third order difference parameter $D_3 = |E_3 - E_2| + |E_2 - E_1|$.
• For the year 1975 enrollment forecast, the proposed computational method uses the fourth order difference parameter $D_4 = |E_4 - E_3| + |E_3 - E_2| + |E_2 - E_1|$.
The $i$th order difference parameter is defined as follows:

$D_i = |E_i - E_{i-1}| + \left[ \sum_{c=1}^{i-1} |E_{i-c} - E_{i-(c+1)}| \right] + |E_1 - E_0|, \quad 2 \le i \le N \qquad (1)$

Here, $N$ is the number of observations in each partition.
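To make the pattern above concrete, here is a minimal Python sketch (the function name and list layout are our own, not from the paper) that accumulates the absolute year-to-year differences within one partition; it reproduces the worked examples $D_2$, $D_3$ and $D_4$ above:

def difference_parameter(E, i):
    # E: enrollments of one partition, E[0] = E_1, E[1] = E_2, ...
    # i: 1-based index of the current observation (2 <= i <= N)
    # D_i is taken as the running sum of absolute first differences,
    # i.e. |E_2 - E_1| + |E_3 - E_2| + ... + |E_i - E_{i-1}|
    return sum(abs(E[k] - E[k - 1]) for k in range(1, i))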


The methodology of the proposed computational algorithm based FTS forecasting method is explained in the following steps:
Step 1 Since a normal distribution is an essential constraint for CPDA, we use the Lilliefors test of Dallal and Wilkinson [22] to verify whether the time series data follow a normal distribution or not. If the time series data follow a normal distribution, go to Step 2.
Step 2 The standard deviation ($\sigma$) is the main characteristic of the normal distribution and is used to define the universe of discourse, $U = [E_{min} - \sigma, E_{max} + \sigma]$.
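As a small illustration of Steps 1 and 2 (a sketch under our own assumptions: the statsmodels implementation of the Lilliefors test is used rather than the online calculator mentioned in Section 3, and the sample standard deviation is assumed), using the actual enrollments listed in Table 1:

import numpy as np
from statsmodels.stats.diagnostic import lilliefors

enrollments = np.array([13055, 13563, 13867, 14696, 15460, 15311, 15603, 15861,
                        16807, 16919, 16388, 15433, 15497, 15145, 15163, 15984,
                        16859, 18150, 18970, 19328, 19337, 18876])

# Step 1: Lilliefors test (null hypothesis: the data are normally distributed)
stat, p_value = lilliefors(enrollments, dist='norm')

# Step 2: universe of discourse U = [Emin - sigma, Emax + sigma]
sigma = enrollments.std(ddof=1)
U = (enrollments.min() - sigma, enrollments.max() + sigma)
print(stat, p_value, U)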
Step 3 U is discretized into n intervals. The length of these intervals is determined using CPDA in the following sub-steps:
1 Calculate both the lower bound ($P_{LB}$) and the upper bound ($P_{UB}$) of the cumulative probabilities using the following equations:

$P_{LB}^{1} = 0, \qquad P_{LB}^{i} = \frac{2i - 3}{2n}, \quad 2 \le i \le n \qquad (2)$

$P_{UB}^{i} = 1, \quad i = n \qquad (3)$

2 Calculate the boundaries of each interval using the inverse of the following normal cumulative distribution function (CDF) with parameters mean ($c$) and standard deviation ($\sigma$) at the corresponding probabilities in $P$:

$x = F^{-1}(P \mid c, \sigma) = \{x : F(x \mid c, \sigma) = P\} \qquad (4)$

$P = F(x \mid c, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left\{ \frac{-(t - c)^2}{2\sigma^2} \right\} dt \qquad (5)$
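One plausible reading of this sub-step is sketched below (our own helper; how the two outermost boundaries are fixed is an assumption, here they are pinned to the ends of the universe of discourse):

import numpy as np
from scipy.stats import norm

def cpda_boundaries(data, n_intervals, u_min, u_max):
    # Mean and standard deviation of the (approximately normal) time series
    c, sigma = np.mean(data), np.std(data, ddof=1)
    # Interior cumulative probabilities (2i - 3)/(2n), i = 2, ..., n (Eq. (2))
    probs = [(2 * i - 3) / (2.0 * n_intervals) for i in range(2, n_intervals + 1)]
    # Interior interval boundaries from the inverse normal CDF (Eqs. (4)-(5))
    interior = norm.ppf(probs, loc=c, scale=sigma)
    # Outermost boundaries taken as the ends of the universe of discourse
    return np.concatenate(([u_min], interior, [u_max]))

# e.g. seven unequal intervals on the universe [11280, 21112] used in Section 3:
# boundaries = cpda_boundaries(enrollments, 7, 11280, 21112)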

Step 4 Construct the triangular fuzzy sets $\tilde{A}_i$ in accordance with the intervals constructed in Step 3.
Step 5 Fuzzify the observations of the time series by choosing the maximum membership grade and set up the fuzzy logical relationships.
Step 6 Use the ratio formula [20] for repartitioning the time series into different partitions.
Step 7 Apply the following computational algorithm.
For a fuzzy logical relation $\tilde{A}_i \rightarrow \tilde{A}_j$, $\tilde{A}_i$ and $\tilde{A}_j$ are the fuzzified enrollments of the current and the next year; $E_i$ and $F_j$ are the actual enrollment of the current year and the crisp forecasted enrollment of the next year.
Computational algorithm: The forecasted enrollments of the University of Alabama are computed using the following computational algorithm of linear order complexity. The algorithm uses the difference parameters ($D_i$) of various orders and the lower and upper bounds of the intervals. For a fuzzy logical relation $\tilde{A}_i \rightarrow \tilde{A}_j$, it uses the mid points of the intervals $u_i$ and $u_j$ having supremum membership value in $\tilde{A}_i$ and $\tilde{A}_j$. The algorithm starts forecasting the enrollment for the year 1973 in partition 1, 1981 in partition 2 and 1988 in partition 3, using the second order difference parameter. In the following computational algorithm, $[*\tilde{A}_j]$ is the interval $u_j$ for which the membership in $\tilde{A}_j$ is supremum (i.e. 1); $L[*\tilde{A}_j]$ and $U[*\tilde{A}_j]$ are the lower and upper bounds of the interval $u_j$ respectively; $l[*\tilde{A}_j]$ and $M[*\tilde{A}_j]$ are the length and the mid point of the interval $u_j$ whose membership in $\tilde{A}_j$ is supremum (i.e. 1).

For i = 2, 3, ..., N (number of observations in the partition)
  Obtain the fuzzy logical relation for year i to i+1: Ãi → Ãj
  P = 0 and Q = 0
  Compute Di = |Ei − Ei−1| + [Σc=1..i−1 |Ei−c − Ei−(c+1)|] + |E1 − E0|
  For a = 2, 3, ..., i
    Fia  = M[*Ãi] + Di / (2(a−1))
    FFia = M[*Ãi] − Di / (2(a−1))
    If Fia ≥ L[*Ãj] and Fia ≤ U[*Ãj] then P = P + Fia and Q = Q + 1
    If Fia ≥ M[*Ãj] then P = P + l[*Ãj] / (2(i−1)(2(a−1))²)
      else P = P − l[*Ãj] / (2(i−1)(2(a−1))²)
    If FFia ≥ L[*Ãj] and FFia ≤ U[*Ãj] then P = P + FFia and Q = Q + 1
    If FFia ≥ M[*Ãj] then P = P + l[*Ãj] / (2(i−1)(2(a−1))²)
      else P = P − l[*Ãj] / (2(i−1)(2(a−1))²)
  Next a
  Fj = (P + M[*Ãj]) / (Q + 1)
Next i
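The following Python transcription of the algorithm (our own reading: the flat sequence of If statements is kept as printed, the interval bookkeeping is passed in as plain numbers, and the difference parameter is computed as the running sum of absolute first differences consistent with the worked examples of Section 2) forecasts one year from a fuzzy logical relation Ãi → Ãj:

def forecast_next(E, i, mid_i, mid_j, low_j, up_j, len_j):
    # E: enrollments of the partition (E[0] = E_1, ...); i: 1-based current index, i >= 2
    # mid_i: mid point of the interval with supremum membership in A~i
    # mid_j, low_j, up_j, len_j: mid point, lower/upper bounds and length of the
    #                            interval with supremum membership in A~j
    D = sum(abs(E[k] - E[k - 1]) for k in range(1, i))     # difference parameter D_i
    P, Q = 0.0, 0
    for a in range(2, i + 1):
        step = D / (2.0 * (a - 1))
        corr = len_j / (2.0 * (i - 1) * (2.0 * (a - 1)) ** 2)
        for cand in (mid_i + step, mid_i - step):           # Fia and FFia
            if low_j <= cand <= up_j:
                P += cand
                Q += 1
            if cand >= mid_j:
                P += corr
            else:
                P -= corr
    return (P + mid_j) / (Q + 1)                            # crisp forecast Fj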
We use the root mean square error (RMSE) and the average forecasting error rate (AFER) to compare the forecasting results of the different forecasting methods. Coefficients of correlation and determination are used to determine the strength of association between the actual and forecasted enrollments of the University of Alabama.
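For reference, the two error measures can be computed as follows (standard definitions assumed, since the paper does not spell out its formulas; AFER is usually reported as a percentage):

import numpy as np

def rmse(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.sqrt(np.mean((actual - forecast) ** 2))

def afer(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean(np.abs(actual - forecast) / actual) * 100.0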

3. Experimental Study

In this section, the proposed method is applied to forecast enrollments at the University of Alabama. An online Lilliefors calculator confirms that the time series data obey a normal distribution. $E_{min}$ and $E_{max}$ are observed from the actual enrollments at the University of Alabama (Table 1). The UD is defined as $U = [E_{min} - \sigma, E_{max} + \sigma]$ and is approximately equal to [11280, 21112]. The UD is further discretized into seven unequal intervals. Both $P_{LB}$ and $P_{UB}$ for each interval are computed using Eqs. (2)-(5) given in Section 2. Seven fuzzy sets $\tilde{A}_1, \tilde{A}_2, \tilde{A}_3, \ldots, \tilde{A}_7$ are defined on the UD. The time series data are discretized into three parts using the ratio expression [20]. Finally, the computational algorithm described in Section 2 is applied to each partition to compute the forecasted enrollments of the University of Alabama. The forecasted enrollments are presented in Table 1. Table 2 (a and b) shows the RMSE and AFER of the forecasted enrollments.
Table 1. Actual and forecasted enrollments of the University of Alabama from year 1971 to year 1992.

Year   Actual   Forecasted     Year   Actual   Forecasted
1971   13055    -              1982   15433    15502
1972   13563    -              1983   15497    15332
1973   13867    13993          1984   15145    15332
1974   14696    14392          1985   15163    15332
1975   15460    15209          1986   15984    -
1976   15311    15332          1987   16859    -
1977   15603    15332          1988   18150    18478
1978   15861    15875          1989   18970    19356
1979   16807    -              1990   19328    19356
1980   16919    -              1991   19337    19356
1981   16388    16696          1992   18876    19356

4. Results and Discussions

In order to compare the performance of the aforementioned fuzzy time series forecasting method, it has been implemented to forecast the enrollments of the University of Alabama. The RMSE and AFER in forecasting enrollments by the proposed method are observed to be 240.20 and 1.183 respectively (Table 2a), which is less than that of the methods proposed by Liu [23], Cheng et al. [24], Wong et al. [8], Egrioglu [25], Singh [18], Joshi and Kumar [19], Gangwar and Kumar [21], Chen and Qu [26], and Gangwar and Kumar [20]. The reduced RMSE and AFER confirm that the CPDA and computational algorithm based FTS forecasting method outperforms the methods given in [23, 24, 8, 25, 18, 19, 21, 26, 20]. The coefficient of correlation (R) and coefficient of determination (R²) between actual and forecasted enrollments were observed to be 0.994294 and 0.988622, which confirms a strong association between actual and forecasted enrollments.
Table 2a. Comparison of the proposed method in terms of error measures.

        Proposed   [23]     [8]     [18]    [20]
RMSE    240.2      328.78   297.2   308.7   642.6
AFER    1.183      1.32     1.52    1.53    2.97

Table 2b. Comparison of the proposed method in terms of error measures.

        Proposed   [24]    [25]    [26]    [21]   [19]
RMSE    240.2      478.4   484.6   440.6   251    419
AFER    1.183      2.40    2.21    2.06    1.27   2.07

5. Conclusions

This study proposes a cumulative probability distribution and computational approach based method for high order FTS forecasting to enhance forecasting accuracy. The fusion of the cumulative probability distribution with the computational method gives a hybrid fuzzy time series model. The computational algorithm based FTS forecasting methods reviewed in the literature use intervals of equal length and keep the order of the difference parameters fixed. The major advantages of this FTS forecasting method are: (i) it uses a computational algorithm of linear order complexity together with a partition mechanism for the UD, so forecasting time series data with a large number of observations is not a matter of concern; (ii) it uses CPDA to determine the length of the intervals used in forecasting; (iii) it reduces the intricate computation of fuzzy relational matrices and eliminates the need for a defuzzification method.
Even though the fusion of CPDA with the computational approach in a partitioned environment enhances the accuracy of the forecasted output, the proposed method has the following limitations.
1. It cannot be applied to time series data that do not follow a normal distribution.
2. The time series data are partitioned using the ratio formula of [20], a ratio $\rho$ computed from $E_{max}$ and $E_{min}$. If $0 < \rho \le 1$, the time series is not partitioned; in this case the difference parameters increase heavily, making the computation very complex.
3. If $\rho \ge N/2$ ($N$ = number of observations in the time series data), there will not be enough observations in the partitions for subsequent forecasting.
However, some preprocessing techniques can be explored to make the time series data approximately normally distributed and thus address the limitation regarding non-normally distributed data. There is also scope to explore the proposed method with the well known k-means or any exclusive clustering technique for partitioning the time series data rather than using the ratio formula.

References

[1] Q. Song, B. S. Chissom, Fuzzy time series and its models, Fuzzy Sets and Systems, 54(1993), 269-277.
[2] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part I, Fuzzy Sets and Systems,
54(1993), 1-9
[3] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part II., Fuzzy Sets and Systems,
64(1994), 1-8.
[4] L. A. Zadeh, Fuzzy set, Information and Control, 8(1965), 338-353.
[5] S. M. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets and Systems, 81(1996), 311-
319.
[6] J. R. Hwang, S. M. Chen, C. H. Lee, Handling forecasting problem using fuzzy time series, Fuzzy Set
and System, 100(1998), 217-228.
[7] C. M. Own, P. T. Yu, Forecasting fuzzy time series on a heuristic high-order model, Cybernetics and
Systems: An International Journal, 36(2005), 705-717.
[8] W. K. Wong, E. Bai, A. W. C. Chu, Adaptive time variant models for fuzzy time series forecasting. IEEE
Transaction on Systems, Man and Cybernetics-Part B: Cybernetics, 40(2010), 1531-1542.
[9] K. Chi, F. P. Fu and W. G. Chen, A novel forecasting model of fuzzy time series based on K-means
clustering, IWETCS, IEEE, 2010, 223–225.
[10] S. M. Chen, K. Tanuwijaya, Fuzzy forecasting based on high-order fuzzy logical relationships and
automatic clustering techniques, Expert Systems with Applications, 38(2011), 15425-15437.
[11] C. H. Cheng, R. J. Chang, C. A. Yeh, Entropy-based and trapezoid fuzzification based fuzzy time series
approach for forecasting IT project cost, Technological Forecasting and Social Change, 73(2006), 524-
542.
[12] K. Huarng, T. H. K. Yu, Ratio-Based Lengths Of Intervals To Improve Fuzzy Time Series Forecasting,
IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, 36(2006), 328–40.
[13] U. Yolcu, E. Egrioglu, V. R. Uslu, M. A. Basaran, C. H. Aladag, A new approach for determining the
length of intervals for fuzzy time series, Applied Soft Computing, 9(2009), 647-651.
[14] H. J. Teoh, C. H. Cheng, H. H. Chu, J. S. Chen, Fuzzy Time Series Model Based on Probabilistic
Approach and Rough Set Rule Induction for Empirical Research in Stock Markets, Data & Knowledge
Engineering, 67(2008), 103–17.

[15] C. H. Su, T. L. Chen, C. H. Cheng, Y. C. Chen, Forecasting the Stock Market with Linguistic Rules
Generated from the Minimize Entropy Principle and the Cumulative Probability Distribution
Approaches, Entropy, 12(2010), 2397-417.
[16] S. R. Singh, A robust method of forecasting based on fuzzy time series, Applied Mathematics and
Computation, 188(2007), 472-484.
[17] S. R. Singh, A simple time variant method for fuzzy time series forecasting, Cybernetics and Systems:
An International Journal, 38(2007), 305-321.
[18] S. R. Singh, A computational method of forecasting based on fuzzy time series, Mathematics and
Computers in Simulation, 79(2008), 539-554
[19] B. P. Joshi, S. Kumar, A Computational method for fuzzy time series forecasting based on difference
parameters, International Journal of Modeling, Simulation and Scientific Computing, 4(2013),
1250023-1250035.
[20] S. S. Gangwar, S. Kumar, Partitions based computational method for high-order fuzzy time series
forecasting, Expert Systems with Applications, 39(2012), 12158-12164.
[21] S. S. Gangwar, S. Kumar, Probabilistic and intuitionistic fuzzy sets based method for fuzzy time series
forecasting, Cybernetics and Systems, 45(2014), 349-361.
[22] G. E. Dallal, L. Wilkinson, An Analytic Approximation to the Distribution of Lilliefors’s Test for
Normality, The American Statistician, 40(1986), 294-296.
[23] H. T. Liu, An improved fuzzy time series forecasting method using trapezoidal fuzzy numbers, Fuzzy
Optimization and Decision Making, 6(2007), 63-80.
[24] C. H. Cheng, J. W. Wang, G. W. Cheng, Multi-attribute fuzzy time series method based on fuzzy
clustering, Expert Systems with Applications, 34(2008), 1235-1242.
[25] E. Egrioglu, A new time-invariant fuzzy time series forecasting method based on genetic algorithm,
Advances in Fuzzy Systems, 2012, 2.
[26] G. Chen, H. W. Qu, A new forecasting method of fuzzy time series model, Control and Decision,
28(2013) 105-109.
Fuzzy Systems and Data Mining II 11
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-11

Introduction to Fuzzy Dual Mathematical Programming
Carlos A. N. COSENZA a, Fabio KRYKHTINE a, Walid El MOUDANI b
and Felix A. C. MORA-CAMINO c,1
a Lab Fuzzy, COPPE, Universidade Federal do Rio de Janeiro, Centro de Tecnologia, Ilha do Fundão, CEP 21941-594 Rio de Janeiro, RJ, Brazil
b Doctorate School of Sciences and Technologies, Lebanese University, Tripoli-Al Koubba, Lebanon
c ENAC, Toulouse University, 7 avenue Edouard Belin, 31055 Toulouse, France

Abstract. In this communication the formulation of optimization problems using fuzzy dual parameters and variables is introduced to cope with parametric or implementation uncertainties. It is shown that fuzzy dual programming problems generate finite sets of deterministic optimization problems, allowing the range of the solutions and of the resulting performance to be assessed at an acceptable computational effort.

Keywords. fuzzy dual numbers, fuzzy dual calculus, optimization, mathematical programming

Introduction

In general, optimization problems implicitly assume that their parameters (cost coefficients, limit values for decision variables, boundary levels for constraints) are perfectly known, while very often for real problems this is not exactly the case [1]. Different approaches have been proposed in the literature to cope with this difficulty. A first approach has been to perform numerical post-optimization sensitivity analysis around the nominal optimal solution [2]. When some probabilistic information about the values of the uncertain parameters is available, stochastic optimization techniques [3] may provide the most expected optimal solution. When these parameters are only known to remain within some intervals, robust optimization techniques [4] have been developed to provide robust solutions. The fuzzy formalism has also been considered in this case as an intermediate approach to represent the parameter uncertainties and provide fuzzy solutions [5]. These different approaches result in general in a very large amount of computation, which often makes them practically infeasible.
In this communication, a new formalism based on fuzzy dual numbers is proposed
to diminish the computational burden when dealing with uncertainty in mathematical
programming problems.
The adopted formalism considers fuzzy dual numbers which have been introduced
recently by two of the authors [6] and which can be seen as a simplified version of

1 Corresponding Author: Felix A. C. MORA-CAMINO; ENAC, Toulouse University, 7 avenue Edouard Belin, 31055 Toulouse, France; E-mail: felix.mora@enac.fr

fuzzy numbers adopting some elements of classical dual number calculus [7], [8]. Indeed, the proposed special class of numbers, fuzzy dual numbers, integrates the nilpotent operator $\varepsilon$ of dual number theory while considering symmetrical fuzzy numbers. Uncertain values are then characterized by only three parameters: a mean value, an uncertainty interval and a shape parameter.
In this communication, the elements of fuzzy dual calculus useful to tackle the proposed issue are first introduced: basic operations as well as strong and weak fuzzy dual partial orders and fuzzy dual equality. Then two classes of fuzzy dual mathematical programming problems are considered: those where uncertainty lies only in the parameters of the problem and those for which the implementation of the solution is subject to uncertainty. In both situations, the proposed formalism is developed and used to identify the expected performance of the solutions.

1. Fuzzy Dual Numbers

The set of fuzzy dual numbers is the set $\tilde{\Delta}$ of numbers of the form $u = a + \varepsilon b$ such that $a \in \mathbb{R}$, $b \in \mathbb{R}^{+}$, where $r(u) = a$ is the primal part and $d(u) = b$ is the dual part of the fuzzy dual number.
A crisp fuzzy dual number is one for which $b$ is equal to zero, losing its fuzzy dual attribute. To each fuzzy dual number $a + \varepsilon b$ is attached a symmetrical fuzzy number whose membership function $\mu$ is such that:

$\mu(x) = 0 \ \text{if} \ x \le a - b \ \text{or} \ x \ge a + b, \qquad \mu(x) = \mu(2a - x) \ \ \forall x \in [a - b, a + b] \qquad (1)$

where $\mu$ is an increasing function between $a - b$ and $a$ with $\mu(a) = 1$.

1.1. Operations with Fuzzy Dual Numbers

Different basic operations can be defined on $\tilde{\Delta}$ [9]. First, the fuzzy dual addition $\tilde{+}$ is given by:

$(x_1 + \varepsilon y_1) \ \tilde{+} \ (x_2 + \varepsilon y_2) = (x_1 + x_2) + \varepsilon (y_1 + y_2) \qquad (2)$

where the neutral element of the fuzzy dual addition is $(0 + 0\varepsilon)$, written $\tilde{0}$.
Then the fuzzy dual product, written $\tilde{\times}$, is given by:

$(x_1 + \varepsilon y_1) \ \tilde{\times} \ (x_2 + \varepsilon y_2) = x_1 x_2 + \varepsilon (|x_1| \cdot y_2 + |x_2| \cdot y_1) \qquad (3)$

The fuzzy dual product has been chosen here in a way that preserves the fuzzy interpretation of the dual part of the fuzzy dual numbers, so it differs from the product of dual calculus. The neutral element of the fuzzy dual multiplication is $(1 + 0\varepsilon)$, written $\tilde{1}$.

It is easy to check that internal operations such as the fuzzy dual addition and the fuzzy dual multiplication are commutative and associative. The fuzzy dual multiplication is distributive with respect to the fuzzy dual addition since, according to Eq. (3), the operator $\varepsilon$ satisfies:

$\varepsilon \ \tilde{\times}\ \varepsilon = \tilde{0} \qquad (4)$

Compared with common fuzzy calculus, fuzzy dual calculus appears to be much less demanding in computing resources [10], [11].
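A minimal Python sketch of a fuzzy dual number with the addition and product of Eqs. (2)-(3) (our own class; the absolute values in the dual part of the product follow our reading above, chosen so that the dual part stays non-negative):

class FuzzyDual:
    # a + eps*b with a in R (primal part) and b >= 0 (dual part, the uncertainty)
    def __init__(self, a, b=0.0):
        assert b >= 0.0
        self.a, self.b = float(a), float(b)

    def __add__(self, other):                 # Eq. (2): fuzzy dual addition
        return FuzzyDual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):                 # Eq. (3): fuzzy dual product
        return FuzzyDual(self.a * other.a,
                         abs(self.a) * other.b + abs(other.a) * self.b)

    def __repr__(self):
        return 'FuzzyDual(%g, %g)' % (self.a, self.b)

# (2 + eps*0.5) * (3 + eps*0.2) -> primal 6, dual 2*0.2 + 3*0.5 = 1.9
print(FuzzyDual(2, 0.5) * FuzzyDual(3, 0.2))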

1.2. Fuzzy Dual Vectors

Let $E$ be a Euclidean space of dimension $p$ over $\mathbb{R}$; we define the set of fuzzy dual vectors $\tilde{E}$ as the pairs of vectors taken from the Cartesian product $E \times E^{+}$, where $E^{+}$ is the positive half-space of $E$. Basic operations can be defined over $\tilde{E}$:
Addition:

$(a + \varepsilon b) + (c + \varepsilon d) = (a + c) + \varepsilon (b + d) \quad \forall a, c \in E, \ b, d \in E^{+} \qquad (5)$

Multiplication by a fuzzy dual scalar $\lambda + \varepsilon \mu$:

$(\lambda + \varepsilon \mu)(a + \varepsilon b) = \lambda a + \varepsilon (|\lambda|\, b + \mu\, |a|) \quad \forall \lambda + \varepsilon \mu \in \tilde{\Delta}, \ a + \varepsilon b \in \tilde{E} \qquad (6)$

A pseudo scalar product is defined by:

$u * v = r(u) \cdot r(v) + \varepsilon \big( |r(u)| \cdot d(v) + d(u) \cdot |r(v)| \big) \quad \forall u, v \in \tilde{E} \qquad (7)$

where "$*$" represents the inner product in $\tilde{E}$ and "$\cdot$" represents the inner product in $E$.

2. Fuzzy Dual Inequalities

With the objective of making possible the comparison of fuzzy dual numbers as well as the identification of extremum values between fuzzy dual numbers, a new operator from $\tilde{\Delta}$ to $\mathbb{R}^{+}$, called the fuzzy dual pseudo norm, is introduced.

2.1. Fuzzy dual pseudo norm

Let us introduce the proposed operator:

$\forall\, a + \varepsilon b \in \tilde{\Delta}: \quad \|a + \varepsilon b\|_{D} = |a| + \rho\, b \ \in \mathbb{R}^{+} \qquad (8)$

where $\rho$ is a shape parameter associated with the considered fuzzy dual number, given by:

$\rho = \frac{1}{2b} \int_{x \in \mathbb{R}} \mu(x)\, dx \ \in [0, 1] \qquad (9)$

In the case of fuzzy dual numbers with symmetrical triangular membership functions, $\rho = 1/2$, while for crisp fuzzy dual numbers, $\rho = 0$. In this paper it is supposed that the considered fuzzy dual numbers have the same shape, i.e. a common $\rho$ value.
It is straightforward to establish that the operator defined in Eq. (8), whatever the value of the shape parameter, satisfies the characteristic properties of a norm:

$\forall\, a + \varepsilon b \in \tilde{\Delta}: \quad \|a + \varepsilon b\|_{D} \ge 0 \qquad (10)$

$\forall a \in \mathbb{R},\ b \in \mathbb{R}^{+}: \quad \|a + \varepsilon b\|_{D} = 0 \Rightarrow a = b = 0 \qquad (11)$

$\|(a + \varepsilon b) + (\alpha + \varepsilon \beta)\|_{D} \le \|a + \varepsilon b\|_{D} + \|\alpha + \varepsilon \beta\|_{D} \quad \forall a, \alpha \in \mathbb{R},\ b, \beta \in \mathbb{R}^{+} \qquad (12)$

$\|\lambda (a + \varepsilon b)\|_{D} = |\lambda|\, \|a + \varepsilon b\|_{D} \quad \forall a, \lambda \in \mathbb{R},\ b \in \mathbb{R}^{+} \qquad (13)$

However, since the set of fuzzy dual numbers $\tilde{\Delta}$ is not a vector space, the proposed operator can only be regarded as a pseudo norm.
The fuzzy dual pseudo norm of a fuzzy dual vector $u$ can be introduced as (here $\|\cdot\|$ is the Euclidean norm associated to $E$):

$\|u\|_{D} = \|r(u)\| + \rho\, \|d(u)\| \qquad (14)$
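A direct transcription of Eqs. (8) and (14) (helper names are our own; rho = 0.5 corresponds to the symmetrical triangular case mentioned above):

import numpy as np

def pseudo_norm(a, b, rho=0.5):
    # Eq. (8): ||a + eps*b||_D = |a| + rho*b for a scalar fuzzy dual number
    return abs(a) + rho * b

def pseudo_norm_vec(r, d, rho=0.5):
    # Eq. (14): Euclidean norm of the primal part plus rho times that of the dual part
    return np.linalg.norm(np.asarray(r, float)) + rho * np.linalg.norm(np.asarray(d, float))

print(pseudo_norm(-3.0, 2.0))           # 4.0
print(pseudo_norm_vec([3, 4], [1, 0]))  # 5.5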

2.2. Strong and weak fuzzy dual inequalities

Partial orders between fuzzy dual numbers can be introduced using this pseudo norm. Depending on whether the fuzzy dual numbers overlap or not, strong and weak partial orders can be introduced.
A strong fuzzy dual partial order, written here $\ge_{s}$, is defined over $\tilde{\Delta}$ by:

$\forall\, a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \ \ge_{s}\ a_2 + \varepsilon b_2 \ \Leftrightarrow\ a_1 - \rho b_1 \ge a_2 + \rho b_2 \qquad (15)$

In that case there is no overlap between the membership functions associated with the two fuzzy dual numbers and the first one is definitely larger than the second one.
A weak fuzzy dual partial order, written here $\ge_{w}$, is defined over $\tilde{\Delta}$ by:

$\forall\, a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \ \ge_{w}\ a_2 + \varepsilon b_2 \ \Leftrightarrow\ a_1 + \rho b_1 \ge a_2 + \rho b_2 \ge a_1 - \rho b_1 \ge a_2 - \rho b_2 \qquad (16)$

In that case there is an overlap between the membership functions associated with the two fuzzy dual numbers and the first one appears to be partially larger than the second one.
A fuzzy dual equality, written $\approx$, can be defined between two fuzzy dual numbers by:

$\forall\, a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \approx a_2 + \varepsilon b_2 \ \Leftrightarrow\ a_2 \in [a_1 - \rho b_1,\ a_1 + \rho b_1] \ \text{and} \ a_1 \in [a_2 - \rho b_2,\ a_2 + \rho b_2] \qquad (17\text{-a})$

$\forall\, a_1 + \varepsilon b_1,\ a_2 + \varepsilon b_2 \in \tilde{\Delta}: \quad a_1 + \varepsilon b_1 \approx a_2 + \varepsilon b_2 \ \Leftrightarrow\ a_1 + \rho b_1 \ge a_2 + \rho b_2 \ge a_2 - \rho b_2 \ge a_1 - \rho b_1 \ \ \text{or} \ \ a_2 + \rho b_2 \ge a_1 + \rho b_1 \ge a_1 - \rho b_1 \ge a_2 - \rho b_2 \qquad (17\text{-b})$

In this last case there is a complete overlap of the membership functions associated with the two fuzzy dual numbers.
Then, when considering two fuzzy dual numbers, they will be in one of the above situations (no overlap, partial overlap or full overlap): strong fuzzy dual inequality, weak fuzzy dual inequality or fuzzy dual equality.
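These three situations can be tested directly on the end points a ± ρb, e.g. (a small sketch following the reading of Eqs. (15)-(17) given above; the function names are our own):

def spread(a, b, rho=0.5):
    # interval [a - rho*b, a + rho*b] attached to the fuzzy dual number a + eps*b
    return a - rho * b, a + rho * b

def strong_geq(x, y, rho=0.5):
    # Eq. (15): no overlap, x definitely larger than y
    return spread(*x, rho)[0] >= spread(*y, rho)[1]

def weak_geq(x, y, rho=0.5):
    # Eq. (16): partial overlap, x only partially larger than y
    xl, xu = spread(*x, rho)
    yl, yu = spread(*y, rho)
    return xu >= yu >= xl >= yl

def fuzzy_equal(x, y, rho=0.5):
    # Eq. (17-a): each primal value lies inside the spread of the other number
    xl, xu = spread(*x, rho)
    yl, yu = spread(*y, rho)
    return yl <= x[0] <= yu and xl <= y[0] <= xu

print(strong_geq((10, 2), (5, 2)))   # True: [9, 11] lies entirely above [4, 6]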

2.3. Extremum operators

The max and min operators over two or more fuzzy dual numbers can now be defined. Let $c + \varepsilon \gamma$ be the fuzzy dual maximum of the fuzzy dual numbers $a + \varepsilon \alpha$ and $b + \varepsilon \beta$:

$c + \varepsilon \gamma = \max\{a + \varepsilon \alpha,\ b + \varepsilon \beta\} \qquad (18)$

then:

$c = \max\{a, b\} \qquad (19.a)$

$\gamma = \max\{a + \rho \alpha,\ b + \rho \beta\} - \max\{a, b\} \qquad (19.b)$

Let $d + \varepsilon \delta$ be the fuzzy dual minimum of the fuzzy dual numbers $a + \varepsilon \alpha$ and $b + \varepsilon \beta$:

$d + \varepsilon \delta = \min\{a + \varepsilon \alpha,\ b + \varepsilon \beta\} \qquad (20)$

then:

$d = \min\{a, b\} \qquad (21.a)$

$\delta = \min\{a, b\} - \min\{a - \rho \alpha,\ b - \rho \beta\} \qquad (21.b)$

Observe that here the max and min operators produce new fuzzy dual numbers.

3. Mathematical Programming with Fuzzy Dual Parameters

Here the fuzzy dual formulation of uncertain mathematical programming problems is introduced.

3.1. Discussion

To illustrate the proposed approach, the case of a linear programming problem with real variables where all parameters are uncertain and described by fuzzy dual numbers is considered. The proposed approach can easily be extended to integer or nonlinear mathematical programming problems, or to problems with different types of level constraints.
Let us then define formally the problem $\tilde{L}$ given by:

$\min_{x \in \mathbb{R}^{+n}} \Big| \sum_{i=1}^{n} \tilde{c}_i x_i \Big| \qquad (22)$

under the constraints:

$\sum_{i=1}^{n} \tilde{a}_{ki} x_i \ \tilde{\ge}\ \tilde{b}_k \quad \forall k \in \{1, \ldots, m\} \qquad (23)$

and

$x_i \in \mathbb{R}^{+} \quad \forall i \in \{1, \ldots, n\} \qquad (24)$

where the coefficients $\tilde{a}_{ki}$, $\tilde{b}_k$, $\tilde{c}_i$ are uncertain parameters.

When the problem is a constrained cost minimization problem, the cost parameters $\tilde{c}_i$, although uncertain, remain positive and the absolute value operator can be removed from the expression of Eq. (22). Here the fuzzy dual hypothesis is adopted for the cost coefficients $c_i$, the technical parameters $a_{ki}$ and the constraint levels $b_k$. This opens different perspectives to be considered when dealing with the parameter uncertainty. Three different cases are considered here:
• the nominal case (a standard deterministic linear programming problem), in which the dual parts of the parameters are zero;
• the pessimistic case, where uncertainty adds to the cost and where the constraints are strong ones;
• the optimistic case, where uncertainty subtracts from the cost and the constraints are weak ones.
The nominal case corresponds to a standard mathematical programming problem. The analysis of the pessimistic case is developed here in more detail and can easily be transposed to the study of the optimistic case.

3.2. Minimum Performance Bound

In the pessimistic case, the problem L+ is formulated; it is a fuzzy dual linear programming problem with fuzzy dual constraints and real decision variables, and is written as:

$\min_{x \in \mathbb{R}^{+n}} \Big| \sum_{i=1}^{n} (c_i + \varepsilon d_i)\, x_i \Big| \qquad (25)$

under strong inequality constraints:

$\sum_{i=1}^{n} (a_{ki} + \varepsilon \alpha_{ki})\, x_i \ \ge_{s}\ b_k + \varepsilon \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (26)$

and

$x_i \in \mathbb{R}^{+} \quad \forall i \in \{1, \ldots, n\} \qquad (27)$

where $c_i, d_i, a_{ki}, \alpha_{ki}, b_k, \beta_k$ are given.
This problem corresponds to the minimization of the worst estimate of the total cost with satisfaction of strong level constraints. Here the variables $x_i$ are supposed to take real positive values, but they could also take fully real or integer values. In the case in which the $d_i$ are zero, the uncertainty is relative only to the feasible set. Problem L+ is equivalent to the following problem in $\mathbb{R}^{+n}$:

$\min_{x \in \mathbb{R}^{+n}} \Big| \sum_{i=1}^{n} c_i x_i \Big| + \rho \sum_{i=1}^{n} d_i x_i \qquad (28)$

under the hard constraints:

$\sum_{i=1}^{n} (a_{ki} - \rho\, \alpha_{ki})\, x_i \ \ge\ b_k + \rho\, \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (29)$

and

$x_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (30)$

It appears that the proposed formulation leads to minimizing a combination of the value of the nominal criterion and of its degree of uncertainty. In the case in which the cost coefficients are positive, this problem reduces to a classical linear programming problem over $\mathbb{R}^{+n}$. In the general case, since the quantity $\sum_{i=1}^{n} c_i x_i$ will have a particular sign at the solution, the solution $x^{+}$ of problem L+ will be the one corresponding to the minimum of:

$\min\Big\{ \sum_{i=1}^{n} c_i x_i^{(1)} + \rho \sum_{i=1}^{n} d_i x_i^{(1)},\ \ \rho \sum_{i=1}^{n} d_i x_i^{(2)} - \sum_{i=1}^{n} c_i x_i^{(2)} \Big\} \qquad (31)$

where $x^{(1)}$ is the solution of the problem:

$\min_{x \in \mathbb{R}^{+n}} \Big( \sum_{i=1}^{n} c_i x_i + \rho \sum_{i=1}^{n} d_i x_i \Big) \qquad (32)$

under the hard constraints:

$\sum_{i=1}^{n} (a_{ki} - \rho\, \alpha_{ki})\, x_i \ \ge\ b_k + \rho\, \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (33)$

$\sum_{i=1}^{n} c_i x_i \ge 0 \quad \text{and} \quad x_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (34)$

and where $x^{(2)}$ is the solution of the problem:

$\min_{x \in \mathbb{R}^{+n}} \Big( \rho \sum_{i=1}^{n} d_i x_i - \sum_{i=1}^{n} c_i x_i \Big) \qquad (35)$

under the following constraints:

$\sum_{i=1}^{n} (a_{ki} - \rho\, \alpha_{ki})\, x_i \ \ge\ b_k + \rho\, \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (36)$

$\sum_{i=1}^{n} c_i x_i \le 0 \quad \text{and} \quad x_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (37)$

The fuzzy dual optimal performance of this program is then given by:

$\sum_{i=1}^{n} (c_i + \varepsilon d_i)\, x_i^{+} = \sum_{i=1}^{n} c_i x_i^{+} + \varepsilon \sum_{i=1}^{n} d_i x_i^{+} \qquad (38)$

The problems of Eqs. (32)-(34) and of Eqs. (35)-(37) are classical continuous linear programming problems which can be solved in acceptable time even for large size problems.
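To give a concrete picture of the crisp equivalent (28)-(30) in the common case of positive cost coefficients (so that the sign split of Eqs. (31)-(37) is not needed), the problem can be handed directly to a standard LP solver; a minimal sketch with illustrative data of our own choosing:

import numpy as np
from scipy.optimize import linprog

rho   = 0.5                        # shape parameter (triangular memberships)
c     = np.array([4.0, 3.0])       # nominal cost coefficients c_i
d     = np.array([0.5, 0.2])       # cost uncertainty levels d_i
A     = np.array([[2.0, 1.0],      # nominal technical coefficients a_ki
                  [1.0, 3.0]])
alpha = np.array([[0.1, 0.1],      # uncertainty on the technical coefficients
                  [0.2, 0.1]])
b     = np.array([8.0, 9.0])       # nominal constraint levels b_k
beta  = np.array([0.5, 0.5])       # uncertainty on the constraint levels

# Pessimistic objective (28) with c_i > 0: sum (c_i + rho*d_i) x_i
# Strong constraints (29): sum (a_ki - rho*alpha_ki) x_i >= b_k + rho*beta_k
res = linprog(c + rho * d,
              A_ub=-(A - rho * alpha),   # linprog expects <= constraints
              b_ub=-(b + rho * beta),
              bounds=[(0, None)] * len(c))
print(res.x, res.fun)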

3.3 Performance analysis

It is of interest to consider the complementary (optimistic) problem L- given by:

$\min_{x \in \mathbb{R}^{+n}} \Big| \sum_{i=1}^{n} c_i x_i \Big| - \rho \sum_{i=1}^{n} d_i x_i \qquad (39)$

under the constraints:

$\sum_{i=1}^{n} (a_{ki} + \rho\, \alpha_{ki})\, x_i \ \ge\ b_k - \rho\, \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (40)$

and

$x_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (41)$

and the nominal problem L0 given by:

$\min_{x \in \mathbb{R}^{+n}} \sum_{i=1}^{n} c_i x_i \qquad (42)$

under the nominal constraints:

$\sum_{i=1}^{n} a_{ki} x_i \ \ge\ b_k \quad \forall k \in \{1, \ldots, m\} \qquad (43)$

and

$x_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (44)$

Let $x^{-}$ and $x^{0}$ be the respective solutions of the problems of Eqs. (39)-(41) and of Eqs. (42)-(44). It is instructive to compare in a first step the performances of problems L+, L- and L0, where:

$\sum_{i=1}^{n} c_i x_i^{-} - \rho \sum_{i=1}^{n} d_i x_i^{-} \ \le\ \sum_{i=1}^{n} c_i x_i^{0} \ \le\ \sum_{i=1}^{n} c_i x_i^{+} + \rho \sum_{i=1}^{n} d_i x_i^{+} \qquad (45)$

This allows the dispersion of results between the pessimistic view of problem L+, the optimistic view of problem L- and the neutral view of problem L0 to be displayed.
Then, in a second step, since $x^{+}$ is feasible for problems L- and L0, it is of interest to compare the different performances when adopting the $x^{+}$ solution:

$\sum_{i=1}^{n} c_i x_i^{+} - \rho \sum_{i=1}^{n} d_i x_i^{+} \ \le\ \sum_{i=1}^{n} c_i x_i^{+} \ \le\ \sum_{i=1}^{n} c_i x_i^{+} + \rho \sum_{i=1}^{n} d_i x_i^{+} \qquad (46)$

to produce bounds on the effective performance of the solution.
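Continuing in the same spirit (the data below are illustrative and the helper name is our own), the dispersion of Eqs. (45)-(46) can be read off by evaluating the nominal cost and its uncertainty at any candidate solution:

import numpy as np

def performance(x, c, d, rho=0.5):
    # (optimistic, nominal, pessimistic) evaluations of a candidate solution x,
    # i.e. sum c_i x_i -/+ rho * sum d_i x_i, as compared in Eqs. (45)-(46)
    nominal = float(np.dot(c, x))
    spread = rho * float(np.dot(d, x))
    return nominal - spread, nominal, nominal + spread

print(performance([2.0, 3.0], [4.0, 3.0], [0.5, 0.2]))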

4. Mathematical Programming with Fuzzy Dual Variables

Now we consider fuzzy dual programming problems with fuzzy dual parameters and fuzzy dual decision variables as well. In that case, problem V is formulated as:

$\min_{x_i \in \mathbb{R},\ y_i \in \mathbb{R}^{+}} \Big| \sum_{i=1}^{n} (c_i + \varepsilon d_i)(x_i + \varepsilon y_i) \Big| \qquad (47)$

under the strong constraints:

$\sum_{i=1}^{n} (a_{ki} + \varepsilon \alpha_{ki})(x_i + \varepsilon y_i) \ \ge_{s}\ b_k + \varepsilon \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (48)$

and

$x_i \in \mathbb{R},\ y_i \ge 0 \quad \forall i \in \{1, \ldots, n\} \qquad (49)$

The above problem corresponds to the minimization of the worst estimate of the total cost with satisfaction of strong level constraints when there is some uncertainty not only in the values of the parameters but also in the ability to implement exactly what should be the optimal solution. According to Eq. (3), problem V can be rewritten as:

$\min_{x \in \mathbb{R}^{n},\ y \in \mathbb{R}^{+n}} \Big| \sum_{i=1}^{n} \big( c_i x_i + \varepsilon (|x_i|\, d_i + |c_i|\, y_i) \big) \Big| \qquad (50)$

under the constraints of Eq. (41) and:

$\sum_{i=1}^{n} \big( a_{ki} x_i + \varepsilon (\alpha_{ki} x_i + a_{ki} y_i) \big) \ \ge_{s}\ b_k + \varepsilon \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (51)$

which is equivalent in $\mathbb{R}^{n} \times \mathbb{R}^{+n}$ to the following mathematical programming problem:

$\min_{x \in \mathbb{R}^{n},\ y \in \mathbb{R}^{+n}} C(x, y) = \Big| \sum_{i=1}^{n} c_i x_i \Big| + \rho \sum_{i=1}^{n} (d_i |x_i| + |c_i|\, y_i) \qquad (52)$

under the constraints of Eq. (41) and the hard constraints:

$\sum_{i=1}^{n} \big( a_{ki} x_i - \rho (\alpha_{ki} x_i + a_{ki} y_i) \big) \ \ge\ b_k + \rho\, \beta_k \quad \forall k \in \{1, \ldots, m\} \qquad (53)$

Let $A(x, y)$ be the set defined by:

$A(x, y) = \Big\{ x \in \mathbb{R}^{n},\ y \in \mathbb{R}^{+n} : \ \sum_{i=1}^{n} \big( a_{ki} x_i - \rho (\alpha_{ki} x_i + a_{ki} y_i) \big) \ge b_k + \rho\, \beta_k,\ \ \forall k \in \{1, \ldots, m\} \Big\} \qquad (54)$

then

$\forall x \in \mathbb{R}^{n},\ y \in \mathbb{R}^{+n}: \quad A(x, y) \subseteq A(x, 0) \quad \text{and} \quad C(x, y) \ge C(x, 0) \qquad (55)$

It appears, as expected, that the case of no diversion from the nominal solution is always preferable. In the case in which the diversion from the nominal solution is fixed to $y_i$, $i \in \{1, \ldots, n\}$, problem V has the same solution as problem V' given by:

$\min_{x \in \mathbb{R}^{+n}} \sum_{i=1}^{n} c_i x_i + \rho \sum_{i=1}^{n} d_i x_i \qquad (56)$

under the constraints of Eq. (41) and:

$\sum_{i=1}^{n} (a_{ki} x_i - \rho\, \alpha_{ki} x_i) \ \ge\ b_k + \rho \Big( \beta_k + \sum_{i=1}^{n} a_{ki} y_i \Big) \quad \forall k \in \{1, \ldots, m\} \qquad (57)$

The fuzzy dual optimal performance of the problem of Eq. (47) will then be given by:

$\sum_{i=1}^{n} c_i x_i^{*} + \varepsilon \sum_{i=1}^{n} (x_i^{*} d_i + c_i y_i) \qquad (58)$

where $x^{*}$ is the solution of problem V'.
Here also, other linear constraints involving the other partial order relations over $\tilde{\Delta}$ (weak inequality and fuzzy equality) could be introduced in the formulation of problem V, while the consideration of the integer version of problem V would lead to solving families of classical integer linear programming problems.
The performance of the solution of problem V will potentially be diminished by the reduction of the feasible set defined by Eq. (54).

5. Conclusion

This study has considered mathematical programming problems presenting some uncertainty in the values of their parameters or in the implementation of the values of the decision variables. A special class of fuzzy numbers, fuzzy dual numbers, has been defined in such a way that the interpretation of their dual part as an uncertainty level remains valid through the basic operations on these numbers. A pseudo norm has been introduced, allowing the comparison between fuzzy dual expressions and leading to the definition of hard and weak constraints to characterize fuzzy dual feasible sets. Mathematical programming problems with uncertain parameters and variables have been considered under this formalism. The proposed solution approach leads to solving a finite collection of classical mathematical programming problems corresponding to nominal and extreme cases, allowing the characterization of the expected optimal performance and solution. This results in a rather limited additional computational effort compared with classical approaches. The above approach could easily be extended to cope with fuzzy dual numbers of different shapes present in the same mathematical programming problem.

References

[1] M. Delgado, J. L. Verdegay and M. A. Vila, Imprecise costs in mathematical programming problems,
Control and Cybernetics, 16(1987), 114-121.
[2] T. Gal, H. J. Greenberg (Eds.), Advances in Sensitivity Analysis and Parametric Programming, Series:
International Series in Operations Research & Management Science, Vol. 6, Springer, 1997.

[3] A. Ruszczynski and A. Shapiro. Stochastic Programming. Handbooks in Operations Research and
Management Science, Vol. 10, Elsevier, 2003.
[4] A. Ben-Tal, L. El Ghaoui and A. Nemirovski, Robust Optimization. Princeton Series in Applied
Mathematics, Princeton University Press, 2009.
[5] H. J. Zimmermann, Fuzzy Sets Theory and Mathematical Programming, in A. Jones et al. (eds.), Fuzzy
Sets Theory and Applications, D. Reidel Publishing Company, 99-114, 1986.
[6] C. A. N. Cosenza and F. Mora-Camino, Nombres et ensembles duaux flous et applications, in French,
Technical report, Labfuzzy laboratory, COPPE/UFRJ, Rio de Janeiro, August 2011.
[7] W. Kosinsky, On Fuzzy Number Calculus, International Journal of Applied Mathematics and Computer
Science, 16(2006), 51-57.
[8] H. H. Cheng , Programming with Dual Numbers and its Application in Mechanism Design, Journal of
Engineering with Computers, 10(1994), 212-229.
[9] F. Mora-Camino, O. Lengerke and C. A. N. Cosenza, Fuzzy sets and dual numbers, an integrated
approach, Fuzzy sets and Knowledge Discovery Conference, Chongqing, China, 28-31 May 2012.
[10] H. Nasseri, Fuzzy Numbers: Positive and Nonnegative, International Mathematical Forum, 3(2006),
1777-1780.
[11] E. Pennestrelli and R. Stefanelli, Linear Algebra and Numerical Algorithms using Dual Numbers,
Journal of Multibody Systems Dynamics, 18(2007), 323-344.
22 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-22

Forecasting National Football League Game Outcomes Based on Fuzzy Candlestick Patterns
Yu-Chia HSU1
Department of Sports Information and Communication, National Taiwan University of
Sport, Taichung City, Taiwan.

Abstract. In this paper, a sports outcome prediction approach based on sports
metric candlesticks and fuzzy pattern recognition is proposed. Sports gambling
market data are gathered and processed to form the candlestick chart, which has
been widely used in financial time series analysis. Unlike the traditional
candlestick, which is composed of prices for financial market analysis, the candlestick
for sports metrics is determined by the point spread, the total points scored, and the
gambling shock, which measures the bias between the gambling line and the real total
points scored. The fluctuation behavior of sports outcomes is represented by the
fuzzification of candlesticks for pattern recognition. A decision tree algorithm is
applied to the fuzzified candlesticks to find implicit knowledge rules, and these
rules are used to forecast the sports outcome. National Football League data are
used in our empirical study to verify the effectiveness of the forecasts.

Keywords. fuzzy logic, pattern recognition, sports metric forecasting, sports gambling markets

Introduction

Sports outcome prediction is an important part of betting on sports events, which has
gained a lot of popularity recently. American football, such as National Football
League (NFL) games, uses a complex scoring system, so the resulting scores are
hard to model using standard modeling approaches. There are five ways to score in
American football, giving 2, 3, 6, or 7 points under different touchdown
situations. Other sports, such as soccer, baseball and basketball, are comparatively
simpler, awarding different points in only a few situations. Consequently, standard
modeling approaches, such as Poisson-type regression models, can provide impressive
performance when modeling scores in soccer, but they may perform worse when applied
to American football scores due to their peculiar distribution [1].
Much research on sports forecasting has demonstrated that the win/lose result
of a game may be affected by past scores, offense/defense statistics, player
absence [2], and so on. Even the temperature, wind speed and moisture at the
competition venue may potentially influence player performance. Most research
adopts these influencing factors for quantitative analysis to estimate the points scored
or the probability of victory. Many notable forecasting models have been proposed by
academics and professionals, and their number has grown rapidly since the appearance of the
"Moneyball" phenomenon [3]. Although considering these abundant data as
variables that influence the performance of players is important for coaches and managers
of sports teams, there have been very few studies on modeling the betting
market data to forecast the winner of a game using non-parametric models based on
computational intelligence techniques.

1 Corresponding author: Yu-Chia Hsu, Dep. of Sports Information and Communication, National Taiwan University of Sport, No. 16, Sec. 1, Shuang-Shih Rd., Taichung, Taiwan; E-mail: ychsu@ntupes.edu.tw.

1. Overview of Sports Betting Forecasting and Candlestick Chart Analysis

1.1. Sports Betting Market

Market data in sports betting, such as odds, point spreads and over/unders, offer a type of
predictor and a source of expert advice and expected probability regarding sports
outcomes. Adopting the betting market data published by bookmakers in the prediction
model can provide rather high forecasting accuracy [4]. This is reasonable because
betting companies would not survive with inefficient odds and spreads.
The betting market has many characteristics similar to financial markets [5]. Three
variants of the efficient market hypothesis (EMH): the "weak", "semi-strong", and "strong"
forms, which reflect the relationship between current prices and the rationality and
instantaneousness of information, have also been extended to the betting market, reflecting
whether the line incorporates all relevant information contained in past game outcomes, all
public information, and inside information [6].
Moreover, price fluctuations follow the mechanism known as the
random walk model under some restrictions and conditions, so profitable
forecasting models do not persist for a long time. However, both in financial and
betting markets, profitable forecasting models have existed during periods of market
inefficiency, but they require extensive modeling innovations [7].

1.2. Japanese Candlestick Theory

Candlestick charting originated in the Japanese rice futures market in the 18th
century. It provides a visual aid for looking at data differently, forecasting near-term
equity price movements, and developing insight into market psychology. Today,
Japanese candlestick theory is one of the most widely used technical analysis
techniques based on empirical models for investment decisions. The trend of a
financial time series is assumed to be predictable by recognizing specific
candlestick patterns.
The candlestick is produced from the opening, highest, closing and lowest prices over
a given time interval. Each candlestick includes both a body and a wick that extends
either above or below the body. Figure 1 illustrates the candlestick line. The body is
shown as a box representing the difference between the opening and closing prices, and
the wick is shown as a line representing the range between the highest and lowest prices
between the opening and closing. The body is filled with either black or white color, according
to whether the opening price is above or below the closing price,
respectively. In some particular time intervals, the highest/lowest price coincides with the
top/bottom of the body; hence a candlestick may or may not have a wick.

The advantage of candlestick theory is that it presents rich information in a
visual interface, making it easy for experienced chartists to identify patterns. In the last decade,
this analysis technique has been extended to other fields, such as predicting teenagers'
stress level changes on a micro-blog platform [8], and to sports metrics [9] for forecasting
game outcomes. However, graphic patterns, such as the size of the body and the
relative position of two successive candlesticks, are hard to
represent. Some researchers have proposed utilizing fuzzy logic to solve this
problem [10-12].

Figure 1. The basic candlestick
Figure 2. The sports metric candlestick

2. Candlestick Chart for Sports Metric

Sports metric candlestick charts provide simple graphics of game outcomes relative
to the gambling line, as proposed by Mallios [9]. Similarly to the
candlestick chart used in financial equity price analysis, each sports metric candlestick
includes both a body and a wick that extends either above or below the body.
But the open, high, close and low prices, which constitute the body and wick of a
candlestick chart in finance, are not appropriate for sports. For sports metrics, the
candlestick charts are composed of the winning/losing margin, the total points scored,
and their corresponding gambling lines. Figure 2 illustrates the sports metric candlestick.
The body of the candlestick is determined by the winning/losing margin, denoted D, and
the gambling line on the winning/losing margin, denoted LD, for a certain team. If D >
LD, the body's color is white, and the body's maximum and minimum values are
defined by D and LD. If LD > D, the body's color is black, and the body's maximum
and minimum values are defined by LD and D. The length of the candlestick wick is
determined by the gambling shock of the line on total points scored, denoted GST. GST is
calculated as the difference between the total points scored in the game and the
corresponding line on total points scored. If GST > 0, the wick extends above the body,
and below the body when GST < 0. There is no wick when GST = 0.
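A minimal sketch of how one sports metric candlestick entity could be built from the quantities just defined (D, LD, the total points and its line) is given below; the class and field names are ours, not the paper's.

```python
# Sketch of one sports-metric candlestick entity as described in Section 2.
# Field names (body_top, wick, ...) are illustrative, not from the paper.
from dataclasses import dataclass

@dataclass
class SportsCandlestick:
    body_top: float      # max(D, LD)
    body_bottom: float   # min(D, LD)
    color: str           # "white" if D > LD (team beat the spread), else "black"
    wick: float          # gambling shock GST; above the body if > 0, below if < 0

def make_candlestick(D, LD, total_points, line_total):
    """D: winning/losing margin, LD: line on the margin,
    total_points: actual total scored, line_total: line on the total."""
    gst = total_points - line_total          # gambling shock on total points
    color = "white" if D > LD else "black"
    return SportsCandlestick(max(D, LD), min(D, LD), color, gst)

# Example: team wins by 7 against a 3-point line; 44 total points vs. a 41.5 line.
print(make_candlestick(D=7, LD=3, total_points=44, line_total=41.5))
```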

3. Fuzzy Representation of Candlestick Patterns

3.1. Size of Body and Wick


The lengths of the wicks and the body reflect the price fluctuation during a time
interval, which is considered a critical characteristic for candlestick pattern
recognition. In traditional technical analysis, the size of a candlestick as short, medium or long

is defined variously according to different opinions. In order to describe the characteristics of
candlesticks more appropriately, four fuzzy linguistic variables are
adopted to describe the lengths of the wicks and the body: Very Short, Short, Long, and
Very Long. Figure 3 illustrates the membership functions of these linguistic variables.
Two types of membership function are adopted to define the linguistic variables: a linear
function is used for Very Long and Very Short, and a triangular function is used for Short
and Long. In Figure 3, the label of the x-axis indicates the real length of the body or wick,
and the unit of the x-axis is the normalized scale from 0 to 1. In this study, the input values
evaluated through the membership functions are obtained by normalizing the lengths of
bodies or wicks with min-max normalization to lie between 0 and 1.
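A rough sketch of the four length memberships on the normalized scale is given below; the breakpoints are assumptions, since the paper specifies the shapes only graphically in Figure 3.

```python
# A rough sketch of the length memberships of Section 3.1 on [0, 1];
# the breakpoints are assumed, the paper gives them only graphically.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with support [a, c] and peak at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def length_memberships(x):
    return {
        "VeryShort": np.clip((1/3 - x) / (1/3), 0.0, 1.0),   # linear, decreasing
        "Short":     tri(x, 0.0, 1/3, 2/3),
        "Long":      tri(x, 1/3, 2/3, 1.0),
        "VeryLong":  np.clip((x - 2/3) / (1/3), 0.0, 1.0),   # linear, increasing
    }

# Min-max normalization of raw body lengths against the training data.
raw_lengths = np.array([2.0, 5.0, 9.0, 14.0])                # e.g. point differences
x = (raw_lengths - raw_lengths.min()) / (raw_lengths.max() - raw_lengths.min())
print(length_memberships(x[1]))
```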

Figure 3. The membership function for the linguistic variables

3.2. Relationship between Candlesticks

The size of a candlestick line only reflects the characteristics of the price fluctuation
during a time interval, which is not enough to model valuable candlestick patterns. In
order to capture the characteristics of the subsequent trend, the relationship
between two adjacent candlestick lines should be considered. The positions of the opening
and closing prices relative to the previous candlestick line are used
to model the open style and the close style. Five linguistic variables, Low, Equal Low,
Equal, Equal High, and High, are defined to represent the open and close styles. Figure
4 shows the membership functions of the linguistic variables of the open style and close
style. The unit of the x-axis is the price in the previous time interval and the y-axis gives the
values of the membership function. The parameters of the functions describing
the linguistic variables depend upon the previous candlestick line, which is illustrated
by the previous candlestick line at the bottom of Figure 4.

3.3. Fuzzification of Candlestick Pattern

The candlestick charts are characterized with fuzzy linguistic variables by applying
the maximum membership method. When more than one fuzzy set matches a
single crisp value, the fuzzy set with the maximum membership value is selected.
Table 1 shows an example of a fuzzy candlestick pattern at times t−i to t for forecasting
the next game outcome.
To mine the candlestick pattern rules for forecasting the next game outcome, we
extract the historical data, consisting of the point spread line, the total points line, the actual
box score, and the outcome, at times t, t−1, …, t−i. Then, we translate these data into
candlestick chart entities and symbolize the time series by fuzzification. The fuzzy

candlestick patterns are then recognized by using the random forests algorithm to
obtain the optimal decision tree. Finally, the next game outcomes are predicted by
using the optimal decision tree.
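The mining step can be sketched as follows: each attribute is mapped to the linguistic term with the maximum membership, and the resulting categorical table (in the spirit of Table 1) is fed to a random forest. The toy rows and library choices below are illustrative assumptions, not the authors' code.

```python
# Sketch of the mining step of Section 3.3: fuzzified (linguistic) candlestick
# attributes over a window of games are used as categorical features for a tree
# learner. The rows below are invented for illustration only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def fuzzify(memberships):
    """Pick the linguistic label with the maximum membership value."""
    return max(memberships, key=memberships.get)

print(fuzzify({"Short": 0.7, "Long": 0.3}))        # -> "Short"

rows = [
    # body, upper wick, lower wick, colour, open style, close style, outcome
    ("Short",    "VeryShort", "VeryShort", "Black", "EqualHigh", "Low",   "Win"),
    ("VeryLong", "Long",      "Short",     "White", "EqualLow",  "Equal", "Lose"),
    ("Long",     "Short",     "VeryShort", "White", "High",      "High",  "Win"),
    ("Short",    "Short",     "Long",      "Black", "Equal",     "Low",   "Lose"),
]
cols = ["body", "upper", "lower", "color", "open", "close", "outcome"]
df = pd.DataFrame(rows, columns=cols)

X = pd.get_dummies(df.drop(columns="outcome"))     # one-hot encode linguistic terms
y = df["outcome"]
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(model.predict(X[:1]))                        # predicted next game outcome
```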

Figure 4. The membership function of the linguistic variables of the open style and close style
Table 1. Example of fuzzy candlestick pattern

Time frame | Body length | Upper wick length | Lower wick length | Body color | Open style | Close style | Outcome
t-i        | Short       | VeryShort         | VeryShort         | Black      | EqualHigh  | Low         | Win
t          | VeryLong    | Long              | Short             | White      | EqualLow   | Equal       | Lose

4. Empirical Studies and Analysis

To demonstrate the effectiveness of forecasting game outcomes, we use NFL data
gathered from covers.com for the 2011-2012 season. We choose the
champion of the Super Bowl of that year, the New York Giants, as the team for the empirical
study. The data cover the regular season and post-season games of that year. In total, the
New York Giants played 20 games that year, including 17 regular
season games from week 1 to week 17 and 4 post-season games: the Wild Card,
Divisional, Conference, and Super Bowl. The data for the year are divided into two sets
according to the NFL season. The regular season data are used as the training
set, and the play-off and Super Bowl data are used as the testing set. The rules of the
candlestick patterns are found from the regular season data and used to forecast the
outcome of the Super Bowl.
The effectiveness of the prediction is evaluated based on three performance measures,
precision, recall, and F-measure, which are widely used in data mining. The formulas
are shown in Eqs. (1) – (3).
$$\mathrm{precision} = \frac{TP}{TP + FP} \times 100\% \qquad (1)$$

$$\mathrm{recall} = \frac{TP}{TP + FN} \times 100\% \qquad (2)$$

$$F\text{-}\mathrm{measure} = 2 \times \frac{\mathrm{recall} \times \mathrm{precision}}{\mathrm{recall} + \mathrm{precision}} \qquad (3)$$
where TP, FP, and FN denote true positive, false positive, and false negative.
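For concreteness, here is a tiny worked example of Eqs. (1)-(3) with assumed counts (the values of TP, FP and FN are illustrative only):

```python
# Worked check of Eqs. (1)-(3) on assumed counts.
TP, FP, FN = 3, 1, 1
precision = TP / (TP + FP) * 100          # 75.0 %
recall    = TP / (TP + FN) * 100          # 75.0 %
f_measure = 2 * recall * precision / (recall + precision)
print(precision, recall, f_measure)       # 75.0 75.0 75.0
```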

The empirical results of the forecasting are presented in Table 2. The results
reveal that the precision, recall, and F-measure of the outcome prediction for wins are
extremely high. This may occur due to the small sample size, which is an
innate limitation of sports outcome forecasting. Most NFL teams play only about
20 games in one season, so it is reasonable that only 17 samples are used for training
and the remaining 4 samples for testing. In fact, the New York Giants won all 4 post-season
games, including the Super Bowl.
Table 2. The results of prediction

Time frame of the input data | Number of input variables | Outcome prediction | Precision | Recall | F-measure
t                            | 7                         | Win                | 100%      | 100%   | 1
                             |                           | Lose               | 0%        | 0%     | 0
t-1, t                       | 14                        | Win                | 100%      | 75%    | 0.857
                             |                           | Lose               | 0%        | 0%     | 0

5. Conclusion

In this paper, we proposed a computational intelligence based sports forecasting model
to predict the champion of the NFL Super Bowl. This model combines the advantages of
candlestick chart analysis for financial time series and pattern recognition techniques by
applying fuzzy sets and the random forests algorithm. Unlike most sports forecasting
models, which focus on the athletes' performance, we adopt betting market data
in order to take into account the psychology and behavior of betting market makers and sports fans.
The original betting market data are transformed into candlestick charts and
characterized by fuzzification, and then classified to find the implicit patterns for
forecasting. Empirical results show that this idea is feasible and obtains acceptable
prediction accuracy.

References

[1] R. D. Baker, I. G. McHale, Forecasting exact scores in National Football League games, International
Journal of Forecasting, 29 (2013), 122-130.
[2] W. H. Dare, S. A. Dennis, R. J. Paul, Player absence and betting lines in the NBA, Finance Research
Letters, 13 (2015), 130-136.
[3] M. Lewis, Moneyball: The Art of Winning an Unfair Game, W. W. Norton & Company, New York, 2003.
[4] D. Paton, L. V. Williams, Forecasting outcomes in spread betting markets: can bettors use ‘quarbs’ to
beat the book, Journal of Forecasting, 24 (2005), 139-154.
[5] S. D. Levitt, Why are gambling markets organised so differently from financial markets?, The Economic
Journal, 114 (2004), 223–246.
[6] L. V. Williams, Information efficiency in betting markets: A survey, Bulletin of Economic Research, 51
(1999), 1-39.
[7] W. S. Mallios, Forecasting in Financial and Sports Gambling Markets. Wiley, New York, 2011.
[8] Y. Li, Z. Feng, L. Feng, Using candlestick charts to predict adolescent stress trend on micro-blog,
Procedia Computer Science, 63 (2015), 221-228.
[9] W. Mallios, Sports Metric Forecasting, Xlibris Corporation, 2014.
[10] C.-H. L. Lee, A. Liu, W.-S. Chen, Pattern discovery of fuzzy time series for financial prediction, IEEE
Transactions on Knowledge and data Engineering, 18 (2006), 613-625.
[11] Q. Lan, D. Zhang, L. Xiong, Reversal pattern discovery in financial time series based on fuzzy
candlestick lines, Systems Engineering Procedia, 2 (2011), 182-190.
[12] P. Roy, S. Sharma, M. K. Kowar, Fuzzy candlestick approach to trade S&P CNX NIFTY 50 index
using engulfing patterns, International Journal of Hybrid Information Technology, 5 (2012), 57-66.
28 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-28

A Fuzzy Control Based Parallel Filling Valley Equalization Circuit
Feng RAN a, Ke-Wei HU b, Jing-Wei ZHAO b and Yuan JI a,1
a Department of Microelectronics Center, Shanghai University, Shanghai, China
b School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, China

Abstract. Aiming at the problems of high cost and slow equalization speed in
traditional circuits, a parallel filling valley equalization circuit based on fuzzy
control is proposed in this paper. A fuzzy controller suitable for the circuit is
designed. The average voltage, voltage range and balancing electric quantity of
the battery pack are modeled by a fuzzy model. Fuzzy reasoning and
defuzzification are performed to optimize the circuit control logic, which can
adapt to the nonlinearity of the battery pack and the uncertainty of the battery
parameters. The simulation and experiment results show that, in the process of
charging and discharging, the fuzzy control based parallel filling valley
equalization circuit has the advantage of fast and efficient equalization, which can
improve the use efficiency of the battery pack.

Keywords. Fuzzy control, battery equalization, filling valley balancing, energy utilization, lithium battery

Introduction

With continuing environmental pollution and the depletion of oil, the energy system
structure of vehicles has become a hot issue of global concern and research [1].
In recent years, people have been committed to the development of safe, efficient and clean
transport. The electric vehicle represents the development direction of the new
generation of environmentally friendly vehicles. As the power source of electric
vehicles, the power battery directly affects the use of electric vehicles [2]. The lithium
battery is one of the best choices for the power source of electric vehicles because of its
advantages, such as high voltage, low self-discharge rate, high efficiency and
environmental friendliness [3]. However, during production, long standing and repeated
charge and discharge cycles, the gap between the charge amounts of the cells increases, so
that the dispersion of the cells within the battery pack increases, the performance degradation
of individual cells intensifies, and eventually the whole battery pack fails [4]. Therefore,
battery equalization technology is indispensable to ensure the safety of the battery and
extend the service life of the battery pack [5]. Battery equalization can be roughly divided
into active and passive equalization [6-7]. Active balancing does not consume the battery
energy in the process and has become a hot research topic today [8]. In the active balance
scheme, the highest energy cell of the battery pack transfers energy to the lowest cell through
a converter. Super capacitor equalization, inductance balancing, and converter equalization
are the most common ways to achieve parallel filling valley equalization [9]. However,
active equalization still has problems that need to be solved urgently, such as high cost,
complex control circuits and slow equalization speed.

1 Corresponding author: Yuan JI, Department of Microelectronics Center, Shanghai University, Shanghai, China; E-mail: jiyuan@shu.edu.cn.
At this stage, research on battery equalization technology mainly covers two
aspects. On the one hand there is the equalization strategy [10], i.e. how to build a
common evaluation system for the battery pack and then obtain the basis for the
equalization control strategy. On the other hand there is the design of the equalization
circuit topology [11], mainly research on the hardware implementation. For these two
aspects, researchers have put forward many different equalization solutions. Tian et al. [12]
proposed an energy tap charging and discharging equalization control strategy but did
not give a specific implementation scheme. Wu et al. [13] proposed that SOC-based
equalization of the battery can effectively eliminate the inconsistency of the
battery, but because the SOC estimation accuracy is not guaranteed, it is only suitable for
off-line equalization. Fu et al. [14] proposed a control strategy based on the
battery voltage as the equalization criterion, with the goal of achieving relative
consistency of the SOC of the single cells. It is widely used because of its clear goal and
simple control, but its ability to deal with nonlinear problems needs to be improved.
Generally, the lithium battery shows nonlinear characteristics. In order to make
the battery pack maintain good system stability and fast balancing speed in different
environments with uncertain parameters, this paper proposes a parallel filling valley
equalization scheme based on fuzzy control. A balancing fuzzy controller is used to
optimize the balancing strategy. Simulation results show that the balancing speed and the
efficiency of the proposed parallel filling valley equalization scheme are improved,
compared with the traditional fly-back filling valley control strategy. Thirteen
general E-bike lithium batteries (rated 48V) connected in series were used as the object of the
charging and discharging experiments. The experimental results show that
the voltage difference between the lithium battery cells converges to less than 10mV with
the fuzzy control based parallel filling valley equalization strategy, even when large voltage
differences exist initially between the cells.

1. Design of Filling Valley Equalization Fuzzy Controller

As the battery working characteristic is a highly nonlinear curve, it is difficult to
determine all related parameters with a precise mathematical model. By using the fuzzy
control method, the system can make reasonable decisions under uncertain or imprecise
conditions. In this paper, the fuzzy control technique is used to adjust the equalization
current and time. The fuzzy logic system is of the Sugeno type. The Sugeno method is
computationally efficient and works well with optimization and adaptive techniques,
which makes it very attractive for control problems, particularly for dynamic nonlinear
systems. The inference from the input to the output is realized by a set of
inference rules established in advance. A typical fuzzy control system is composed of the rule
base, data base, inference engine, fuzzification unit and defuzzification unit. Figure 1
shows the structure of the filling valley equalization fuzzy controller, a typical two-input,
one-output fuzzy control system.

Figure 1. Fuzzy logic controller of filling valley balancing

The controller is designed using fuzzy control theory. It controls the
equalization electricity quantity of the battery cell by controlling the equalization
current and time. The rule base collects the control rules used to describe the
battery equalization control algorithm. The database stores the data
that have been acquired.
The fuzzy controller has two inputs, the average voltage (AV) and the voltage
difference (VD) of the battery pack. The output is the equalization electricity quantity
(QBAL). As inputs of the fuzzy controller, the values of AV and VD are transformed into
the fuzzy languages μ1(x) and μ2(x) by the fuzzification process. Then the inference
engine generates the language control logic μ0(z) according to the pre-established rule
base and the input fuzzy languages. At last, the language control logic is transformed
into the control output signal QBAL by the defuzzification process. The relationship
among the equalization electricity quantity QBAL, the equalization current IBAL and the
equalization time TBAL can be expressed as:

$$Q_{BAL} = I_{BAL} \cdot T_{BAL} \qquad (1)$$

Figure 2. Membership functions for average voltage electric quantity

Figure 2 and Figure 3 show the input/output membership functions of the filling valley
equalization fuzzy controller. The equalization current and the equalization time
are determined from the measured average voltage AV and the voltage range VD by the
fuzzy controller. The triangular function is chosen as the membership function of the
average voltage (AV) and voltage difference (VD) because it is easy to calculate,
compared with other membership functions. The average voltage (AV) is divided into 5
fuzzy subsets: average very large (AVL), average large (AL), average medium (AM),
average small (AS), and average very small (AVS), covering the domain [2.7V, 4.2V].
The input variable voltage difference VD is also divided into 5 fuzzy subsets: difference
very large (DVL), difference large (DL), difference medium (DM), difference small
(DS), and difference very small (DVS), which cover the domain [0mV, 20mV].
The system treats the input as 20mV when the voltage range is greater than
20mV. The output variable equalization electricity QBAL is divided into the subsets VL
(very large), L (large), M (medium), S (small), and VS (very small). Figures 2 and 3 show
the membership functions of the fuzzy control system with AV, VD and QBAL on the
horizontal coordinates. For example, when AV is 3.4V, it has one hundred percent
membership in S and zero percent membership in M and VS. Figure 4 shows the control surface
of the fuzzy controller, from which the relationship among AV, VD and
the balancing capacity can be seen. The rule base is described in Table 1.

Table 1. Control rules of the fuzzy logic controller

Output | DVS | DS  | DM  | DL  | DVL
AVS    | OVS | OVS | OS  | OM  | OL
AS     | OVS | OS  | OM  | OL  | OVL
AM     | OVS | OM  | OVL | OVL | OVL
AL     | OVS | OS  | OM  | OL  | OVL
AVL    | OVS | OVS | OS  | OM  | OL

Figure 3. Membership functions for voltage difference and balancing electric quantity

Figure 4. Surface of the fuzzy logic controller output VS input

Table 1 shows that the fuzzy control system has a total of 25 rules, written separately as
R1, R2, …, R25. The fuzzy rule expressions can be given by

$$\begin{cases} R_1: (\mathrm{AVS} \text{ and } \mathrm{DVS}) \to \mathrm{OVS} \\ R_2: (\mathrm{AVS} \text{ and } \mathrm{DS}) \to \mathrm{OVS} \\ R_3: (\mathrm{AVS} \text{ and } \mathrm{DM}) \to \mathrm{OS} \\ \qquad \vdots \\ R_{25}: (\mathrm{AVL} \text{ and } \mathrm{DVL}) \to \mathrm{OL} \end{cases} \qquad (2)$$

In the theory of fuzzy control, there are many kinds of operations, so there are many
possible choices in practical applications. According to the requirements of the design,
the operation rules of the filling valley equalization fuzzy controller are as follows: the fuzzy
connective "and" uses the "min" operation, taking the minimum value, and the fuzzy
connective "or" uses the "max" operation, taking the maximum value. The implication
relation uses "min", the output synthesis uses "max", and the centroid method is used in the
output defuzzification process. All of the above rules can be expressed as:

$$R = R_1 \cup R_2 \cup \cdots \cup R_{25} = \bigcup_{i=1}^{25} R_i \qquad (3)$$

Given the exact values of AV and VD, the fuzzy quantity of the output QBAL can
be given by

$$\mu_o(Q_{BAL}) = (AV \times VD) \circ R \qquad (4)$$

According to the fuzzy control logic operation rules



$$\mu_o(Q_{BAL}) = (AV \times VD) \circ \bigcup_{i=1}^{25} R_i = \bigcup_{i=1}^{25} (AV \times VD) \circ (A_i \text{ and } B_i \to C_i) \qquad (5)$$

Because the "min" method is used in the calculation of the implication relation

$$\mu_o(Q_{BAL}) = \bigcup_{i=1}^{25} \left( \left[ AV \circ (A_i \to C_i) \right] \wedge \left[ VD \circ (B_i \to C_i) \right] \right) \qquad (6)$$

Finally, the output fuzzy variable is converted into a precise value by the
defuzzification module:

$$Q_{BAL} = \frac{\int Q_{BAL}\, \mu_o(Q_{BAL})\, \mathrm{d}Q_{BAL}}{\int \mu_o(Q_{BAL})\, \mathrm{d}Q_{BAL}} \qquad (7)$$
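The inference chain of Eqs. (4)-(7) can be sketched as follows, with triangular memberships, "min" for the implication, "max" for aggregation and a discrete centroid for defuzzification. The breakpoints follow Figures 2-3 and Table 1, but the exact membership shapes in the paper are only given graphically, so they are assumptions here.

```python
# Compact sketch of Eqs. (4)-(7): fuzzify AV and VD, fire the 25 rules of
# Table 1 with min, aggregate with max, defuzzify with a discrete centroid.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with peak at b; a degenerate side acts as a shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

AV_SETS = {"AVS": (2.7, 2.7, 3.4), "AS": (2.7, 3.4, 3.7), "AM": (3.4, 3.7, 4.1),
           "AL": (3.7, 4.1, 4.2), "AVL": (4.1, 4.2, 4.2)}          # volts
VD_SETS = {"DVS": (0, 0, 5), "DS": (0, 5, 10), "DM": (5, 10, 15),
           "DL": (10, 15, 20), "DVL": (15, 20, 20)}                # millivolts
Q_SETS = {"OVS": (0, 0, 2.5), "OS": (0, 2.5, 5), "OM": (2.5, 5, 7.5),
          "OL": (5, 7.5, 10), "OVL": (7.5, 10, 10)}                # ampere-seconds

RULES = {  # Table 1: (AV label, VD label) -> output label
    "AVS": {"DVS": "OVS", "DS": "OVS", "DM": "OS",  "DL": "OM",  "DVL": "OL"},
    "AS":  {"DVS": "OVS", "DS": "OS",  "DM": "OM",  "DL": "OL",  "DVL": "OVL"},
    "AM":  {"DVS": "OVS", "DS": "OM",  "DM": "OVL", "DL": "OVL", "DVL": "OVL"},
    "AL":  {"DVS": "OVS", "DS": "OS",  "DM": "OM",  "DL": "OL",  "DVL": "OVL"},
    "AVL": {"DVS": "OVS", "DS": "OVS", "DM": "OS",  "DL": "OM",  "DVL": "OL"},
}

def q_bal(av, vd, grid=np.linspace(0.0, 10.0, 101)):
    vd = min(vd, 20.0)                        # inputs above 20 mV are treated as 20 mV
    aggregated = np.zeros_like(grid)
    for a_lbl, row in RULES.items():
        for d_lbl, out_lbl in row.items():
            strength = min(tri(av, *AV_SETS[a_lbl]), tri(vd, *VD_SETS[d_lbl]))
            out_mf = np.array([tri(q, *Q_SETS[out_lbl]) for q in grid])
            aggregated = np.maximum(aggregated, np.minimum(strength, out_mf))
    return float((grid * aggregated).sum() / aggregated.sum())     # centroid, Eq. (7)

print(q_bal(av=3.55, vd=12.0))                # equalization electricity quantity in As
```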

The maximum equalization current Ieq_max and equalization current IBAL can be
given by

$$I_{eq\_max} = \frac{(U_{dc} - U_M)^2 (U_0 + U_{dio})\, L_P\, T\, \eta}{2\left[ (U_{dc} - U_M)^2 L_P L_S K + (U_0 + U_{dio})(L_P + L_x) \right]} \qquad (8)$$

$$I_{BAL} = \min\left( \frac{Q_{BAL}}{T_{BAL\_MIN}},\; I_{eq\_max} \right) \qquad (9)$$

Where TBAL_MIN is the minimum equilibrium time, here TBAL_MIN = 0.8s.


Then the PWM duty cycle σ and the equalization time TBAL are calculated as:


$$\sigma = \sqrt{\frac{2\, I_{BAL}\, (U_{min} + U_{dio})\, (L_p + L_x)^2}{(U_{dc} - U_M)^2\, L_p\, \eta\, T}} \qquad (10)$$

$$T_{BAL} = \min\left\{ \frac{Q_{BEC}}{I_{BEC}},\; T_{BAL\_MAX} \right\} \qquad (11)$$

where TBAL_MAX is the maximum equalization time, here TBAL_MAX = 10s.
TBAL_MAX is set to limit the length of the equalization time, in case the
charging and discharging voltage of the battery pack changes too much within the time period.
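A short sketch of the limiting logic of Eqs. (9) and (11) is given below; the numbers are illustrative, and reading QBEC/IBEC as the commanded quantity divided by the commanded current is our assumption.

```python
# Sketch of the limiting logic of Eqs. (9) and (11); all numbers are illustrative.
T_BAL_MIN, T_BAL_MAX = 0.8, 10.0      # s
I_EQ_MAX = 2.0                        # A, from Eq. (8) for the present operating point

def balance_command(q_bal):
    """q_bal: equalization electricity quantity from the fuzzy controller (As)."""
    i_bal = min(q_bal / T_BAL_MIN, I_EQ_MAX)          # Eq. (9)
    t_bal = min(q_bal / i_bal, T_BAL_MAX)             # Eq. (11), quantity over current
    return i_bal, t_bal

print(balance_command(5.0))           # e.g. QBAL = 5 As -> (2.0 A, 2.5 s)
```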

2. Simulation and Experiment Result

2.1. The simulation of the filling valley equalization fuzzy controller

The simulation of battery charging and discharging is carried out in MATLAB, and the
ordinary fly-back control is compared with the fuzzy control according to the
simulation results.

Figure 5. Fuzzy logic controller VS Normal fly-back controller

Figure 5 shows the lithium battery charging and discharging process under
MATLAB simulation. The battery pack has 13 cells with different initial
voltages, corresponding to the 13 curves of different colors in the figure. According to
the algorithm and the differences between the charge and discharge processes, the figure is
divided into Fig.5A, Fig.5B, Fig.5C and Fig.5D. The horizontal axis shows the simulation time,
and the vertical axis shows the cell voltages of the battery pack. The
fuzzy control is used in Fig.5B and Fig.5D, while the ordinary fly-back control is used in
Fig.5A and Fig.5C. By comparison, in the charging process, as shown in Figure 5A,
the battery pack needs 190min to reach the energy balance, while the fuzzy control based
controller achieves energy balance in only 120min, as shown in Figure 5B. In the static
discharge process, as shown in Figure 5C, the battery pack needs 232min to reach the
energy balance, while in Figure 5D the equilibrium state is reached in only 150min.
The simulation results show that the fuzzy control based parallel filling valley
equalization strategy has a faster equalization speed compared with the normal fly-back
control.

2.2. Experimental results of charge and discharge equalization

The battery charge and discharge experiments are carried out in this paper under the
filling valley equalization fuzzy control. The initial voltage values of
the cells vary from 2.9V to 3.4V.

The charging experiment follows constant current charging first
and then constant voltage charging. The full charging process is shown in Figure 6. In
order to see the equalization time clearly, Figure 7 shows the equalizing charge diagram of
the first 50min. It can be seen from the figure that within only 30min the battery pack
moves from the initial state of disequilibrium into equilibrium, which is a remarkable
improvement compared with general equalization technology. As shown in Figure 8, the
equalizing discharging process can balance the power of batteries with different initial
voltages and equalize the batteries.

Figure 6. Full diagram of equalizing charge

Figure 7. Equalizing charge diagram of first 50min

Figure 8. Full diagram of equalizing discharge

Table 2. Comparison before and after charging and discharging

Battery Pack Parameters        | Balanced discharge | Balanced charge
Voltage range before balancing | 535.2070mV         | 117.0849mV
Voltage range after balancing  | 8.9998mV           | 9.7893mV
Time to reach balance          | About 30min        | About 98min
Charge and discharge time      | 237min             | 160min

From Table 2, the experimental results show that the fuzzy control based parallel
filling valley equalization circuit can clearly reduce the voltage difference and has
good performance in solving the nonlinear problem and in equalization speed. However,

the fuzzy rule base and data base in the practical fuzzy control process need adequate
accuracy to be reliable, and better rules and inference processes will be sought in
later research.

3. Conclusion

The proposed fuzzy control based parallel filling valley equalization circuit can
reach equalization quickly. It has a good ability to handle the nonlinear problem,
compared with traditional circuits. With the development of electric vehicles, people
require high quality cell equalization. Lossless equalization can achieve lossless energy
transfer between different cells to avoid wasting energy. Filling valley
equalization is one of the schemes for lossless equalization. But how to improve the
energy flow efficiency and handle the imbalance in multi-string-parallel battery packs
should be addressed in future research on lossless equalization. In addition, the
equalization circuit is supposed to be as succinct as possible. How to reduce the size of
the chip and enhance its applicability deserves attention in further study.

References

[1] E. Kim, K. G. Shin, J. Lee. Real-time battery thermal management for electric vehicles. Cyber-Physical
Systems (ICCPS). Berlin: IEEE, (2014):72-83.
[2] C. L. Wey, P. C. Jui. A unitized charging and discharging smart battery management system. Connected
Vehicles and Expo (ICCVE). Las Vegas: IEEE, (2012):903-909.
[3] B. B. Qiu, H. P. Liu, J. L. Yang, et al. An active balance charging system of lithium iron phosphate
power battery group, Advanced Technology of Electrical Engineering and Energy, 2014.
[4] J. Cao, N. Schofield, A. Emadi. Battery Balancing Methods: A Comprehensive Review. Vehicle Power
and Propulsion Conference (VPPC). Harbin: IEEE, (2008):1-6
[5] B. T. Kuhn, G. E. Pitel, P. T. Krein, et al. Electrical properties and equalization of lithium-ion cells in
automotive applications. Vehicle Power and Propulsion Conference (VPPC): IEEE, 2005
[6] B. Lindemark. Individual cell voltage equalizers (ICE) for reliable battery performance.
Telecommunications Energy Conference,: INTELEC, (1991):196-201
[7] A. Baughman, M. Ferdowsi. Analysis of the Double-Tiered Three-Battery Switched Capacitor Battery
Balancing System. Vehicle Power and Propulsion Conference (VPPC). Harbin: IEEE, (2006):1-6
[8] W. G. Ji, X. Lu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with
synchronous phase-shift control. Annual IEEE Applied Power Electronics Conference and Exposition
(APEC). Long Beach: IEEE, 331(2013):1152-1157
[9] M. Daowd, N. Omar, DBP Van, et al. Passive and Active Battery Balancing comparison based on
MATLAB Simulation. IEEE Vehicle Power and Propulsion Conference (VPPC). Chicago, IL: IEEE,
(2011):1-7
[10] H. R. Liu, S. H. Zhang, et al. Lithium-ion battery charge and discharge equalizer and balancing
strategy.Transactions of China Electrotechnical Society, 16(2015):186-192.
[11] W. G. Ji, X. Liu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with
synchronous phase-shift control. In 2013 28th Annual IEEE applied Power Electronics Conference and
Exposition (APEC). Long Beach. CA, USA, (2013): 1152-1157
[12] R. Tian, D. T. Qin, M. H. Hu, et al. Research on battery equalization balance strategy. Journal of
Chongqing University (Nature Science Edition), (2005):1-4
[13] Y. Y. Wu, H. Liang. Research on electric vehicle battery equalization method. Automotive Engineering,
(2004): 384-385.
[14] J. J. Fu, B. J. Wu, H. J. Wu, et al. Dynamic bidirectional equalization system to a vehicle hang-ion
battery weave. China Measurement Technology, (2005): 10-11.
Fuzzy Systems and Data Mining II 37
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-37

Interval-Valued Hesitant Fuzzy Geometric Bonferroni Mean Aggregation Operator
Xiao-Rong HE a,1, Ying-Yu WU a, De-Jian YU b, Wei ZHOU c and Sun MENG c
a School of Economics and Management, Southeast University, Nanjing, China
b School of Information, Zhejiang University of Finance and Economics, Hangzhou, China
c Yunnan University of Finance and Economics (YNFE), Kunming, China

Abstract. The hesitant fuzzy set (HFS) is one of the most commonly used techniques for
expressing a decision maker's subjective evaluation information. The interval-valued
hesitant fuzzy set (IVHFS) is an extension of the HFS and can reflect our intuition
more objectively. In this paper we focus on IVHF information aggregation
methods based on the Bonferroni mean (BM). We propose the IVHF geometric BM
operator (IVHFGBM) and the weighted IVHFGBM operator. Some numerical
examples for the operators are designed to show their effectiveness. The
desirable properties of the weighted IVHFGBM operator are also discussed in detail.
These operators can be applied in many areas, especially in decision making
problems.

Keywords. Bonferroni mean, IVHFS, aggregation operator

Introduction

There are various methods available for decision making. One of the common features
of decision making methods is the information aggregation technique [1-7]. Using
information aggregation operators in decision making, we can obtain the comprehensive
performance values of alternatives, which are used to compare the alternatives. The
alternative with the biggest comprehensive performance value is the best option. The
Bonferroni mean (BM) [8-10] is a widely used technique in the information aggregation
and decision making area. At present, it has been extended to the interval-valued
uncertainty environment, the intuitionistic fuzzy (IF) environment, the interval-valued
intuitionistic fuzzy (IVIF) environment, the fuzzy environment, the uncertain linguistic fuzzy
environment and the hesitant fuzzy environment.
However, we found that the BM operator cannot be used to aggregate interval-
valued hesitant fuzzy information [11], which is the research focus of this paper. In the
rest of this paper, we first review the basic concept of the interval-valued hesitant fuzzy
set (IVHFS) and then extend the BM to the interval-valued hesitant fuzzy environment.
Numerical examples are presented to better understand these interval-valued
hesitant fuzzy information aggregation methods based on BM operators.

1
Corresponding Author: Xiao-Rong HE, School of Economics and Management, Southeast University,
Nanjing, China; E-mail: shelley526@126.com.

1. Preliminaries

In this section, a brief review of the interval-valued hesitant fuzzy set (IVHFS) is
presented.
Definition 1 [11]. Let X be a reference set. An IVHFS on X can be represented
in the following mathematical form:

$$E = \{ \langle x, f_E(x) \rangle \mid x \in X \} \qquad (1)$$

where $f_E(x)$ denotes all possible interval-valued membership degrees of the
element x to the set E.
The IVHFS has strong practical value in situations where the membership
degree is difficult to determine. For example, a patient who often suffers from regular abdominal
pain goes to the hospital to consult three doctors independently. After
understanding his/her illness, the first doctor thinks the possibility that the patient has a
stomachache is [0.6, 0.7]. The second doctor believes the possibility that the patient has a
stomachache is [0.1, 0.2] and that he/she is likely to suffer from other diseases. The opinion of the third
doctor is similar to the first doctor's, and he believes the possibility that the patient has a
stomachache is [0.7, 0.8]. In this case, the possibility that the patient has a stomachache
can be represented by an interval-valued hesitant fuzzy element (IVHFE)
{[0.6, 0.7], [0.1, 0.2], [0.7, 0.8]}. Obviously, other kinds of extended fuzzy set
theory cannot deal with this case effectively. Furthermore, the IVHFE is the basic element
of the IVHFS.
For any IVHFEs, Chen et al. [11] defined the following operations and gave the comparison
rules.
Definition 2. Suppose that $h = \bigcup_{\gamma \in h} \{[\gamma^{L}, \gamma^{U}]\}$, $h_1 = \bigcup_{\gamma_1 \in h_1} \{[\gamma_1^{L}, \gamma_1^{U}]\}$ and $h_2 = \bigcup_{\gamma_2 \in h_2} \{[\gamma_2^{L}, \gamma_2^{U}]\}$ are three IVHFEs, and let $\lambda$ be a real number bigger than 0. Then the operations are defined as follows.

1) $h^{\lambda} = \bigcup_{\gamma \in h} \left\{ \left[ (\gamma^{L})^{\lambda}, (\gamma^{U})^{\lambda} \right] \right\}$

2) $\lambda h = \bigcup_{\gamma \in h} \left\{ \left[ 1 - (1 - \gamma^{L})^{\lambda}, \; 1 - (1 - \gamma^{U})^{\lambda} \right] \right\}$

3) $h_1 \oplus h_2 = \bigcup_{\gamma_1 \in h_1, \gamma_2 \in h_2} \left\{ \left[ \gamma_1^{L} + \gamma_2^{L} - \gamma_1^{L}\gamma_2^{L}, \; \gamma_1^{U} + \gamma_2^{U} - \gamma_1^{U}\gamma_2^{U} \right] \right\}$

4) $h_1 \otimes h_2 = \bigcup_{\gamma_1 \in h_1, \gamma_2 \in h_2} \left\{ \left[ \gamma_1^{L}\gamma_2^{L}, \; \gamma_1^{U}\gamma_2^{U} \right] \right\}$

Definition 3. For an IVHFE $h = \bigcup_{\gamma \in h} \{[\gamma^{L}, \gamma^{U}]\}$, let $\#h$ be the number of elements in $h$. Then

$$S(h) = \frac{1}{\#h} \sum_{\gamma \in h} \gamma = \frac{1}{\#h} \sum_{\gamma \in h} \frac{\gamma^{L} + \gamma^{U}}{2} \qquad (2)$$

is defined as the score function of the IVHFE $h$. For two IVHFEs $h_1$ and $h_2$, if
$S(h_1) > S(h_2)$, then $h_1 \succ h_2$; if $S(h_1) = S(h_2)$, then $h_1 \sim h_2$.
Example 1. Suppose that $h_1$ = {[0.5, 0.6], [0.6, 0.7]}, $h_2$ = {[0.7, 0.8], [0.4, 0.6], [0.7, 0.9]} and $h_3$ = {[0.5, 0.6]} are three IVHFEs. According to the score function and comparison rules defined in Definition 3, we have

$$S(h_1) = \frac{1}{\#h_1} \sum_{\gamma_1 \in h_1} \gamma_1 = \frac{1}{2}\left( \frac{0.5 + 0.6}{2} + \frac{0.6 + 0.7}{2} \right) = 0.6$$

$$S(h_2) = \frac{1}{\#h_2} \sum_{\gamma_2 \in h_2} \gamma_2 = \frac{1}{3}\left( \frac{0.7 + 0.8}{2} + \frac{0.4 + 0.6}{2} + \frac{0.7 + 0.9}{2} \right) = 0.68$$

$$S(h_3) = \frac{1}{\#h_3} \sum_{\gamma_3 \in h_3} \gamma_3 = \frac{1}{1}\left( \frac{0.5 + 0.6}{2} \right) = 0.55$$

Since $S(h_2) > S(h_1) > S(h_3)$, then $h_2 \succ h_1 \succ h_3$.
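As a quick illustration of Definitions 2 and 3, the short sketch below stores an IVHFE as a list of [γL, γU] pairs and implements the four operations and the score function; the helper names are ours, and the final line recomputes the scores of Example 1.

```python
# A small sketch of Definitions 2-3: an IVHFE is a list of [gamma_L, gamma_U]
# intervals; the operations and the score function follow the formulas in the text.
def power(h, lam):            # h^lambda
    return [[gl**lam, gu**lam] for gl, gu in h]

def scale(h, lam):            # lambda * h
    return [[1 - (1 - gl)**lam, 1 - (1 - gu)**lam] for gl, gu in h]

def oplus(h1, h2):            # h1 (+) h2
    return [[g1l + g2l - g1l*g2l, g1u + g2u - g1u*g2u]
            for g1l, g1u in h1 for g2l, g2u in h2]

def otimes(h1, h2):           # h1 (x) h2
    return [[g1l*g2l, g1u*g2u] for g1l, g1u in h1 for g2l, g2u in h2]

def score(h):                 # Eq. (2)
    return sum((gl + gu) / 2 for gl, gu in h) / len(h)

# Reproducing Example 1:
h1 = [[0.5, 0.6], [0.6, 0.7]]
h2 = [[0.7, 0.8], [0.4, 0.6], [0.7, 0.9]]
h3 = [[0.5, 0.6]]
print(score(h1), score(h2), score(h3))   # 0.6, ~0.68, 0.55
```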

2. Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operators

After the concepts of IVHFS and IVHFE were proposed, aggregation operators for
aggregating IVHFEs were put forward correspondingly, such as the IVHFWA operator,
IVHFWG operator, IVHFOWA operator, IVHFOWG operator, GIVHFWA operator,
GIVHFWG operator, induced IVHFWA operator, induced IVHFWG operator, and so
on [12-13]. It should be noted that the above IVHF information aggregation operators
cannot be used to fuse correlated arguments. On the other hand, the geometric mean
(GM) is a common aggregation operator and has been widely used in the information
fusion field. Based on the GM, the geometric BM (GBM) operator has been proposed
and investigated by some researchers. However, it seems that researchers have not yet
investigated the GBM for aggregating IVHFEs, which is the concern
of the following studies.
Definition 4. Let $h_j = \bigcup_{\gamma_j \in h_j} \{[\gamma_j^{L}, \gamma_j^{U}]\}$ $(j = 1, 2, \ldots, n)$ be a group of IVHFEs. If

$$\mathrm{IVHFGBM}(h_1, h_2, \ldots, h_n) = \frac{1}{p+q} \left( \bigotimes_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( p h_i \oplus q h_j \right)^{\frac{1}{n(n-1)}} \right) \qquad (3)$$

Then IVHFGBM is called the interval-valued hesitant fuzzy geometric BM


operator (IVHFGBM).
Theorem 1. Let $p, q > 0$, and let $h_j = \bigcup_{\gamma_j \in h_j} \{[\gamma_j^{L}, \gamma_j^{U}]\}$ $(j = 1, 2, \ldots, n)$ be a group of
IVHFEs. After using the IVHFGBM operator, the aggregated IVHFE is obtained as follows.

$$\mathrm{IVHFGBM}(h_1, h_2, \ldots, h_n) = \bigcup_{\gamma_i \in h_i, \gamma_j \in h_j} \left\{ \left[ 1 - \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - \gamma_i^{L})^{p} (1 - \gamma_j^{L})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}}, \;\; 1 - \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - \gamma_i^{U})^{p} (1 - \gamma_j^{U})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}} \right] \right\} \qquad (4)$$
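The closed form of Eq. (4) can be evaluated directly by enumerating one interval from each input IVHFE and applying the expression bound-wise. The sketch below is one such reading (function and variable names are ours); the element ordering and rounding may differ slightly from the values printed in Example 2 below.

```python
# Direct implementation of Eq. (4): enumerate every combination of interval
# memberships drawn from the input IVHFEs and apply the closed form bound-wise.
from itertools import product

def ivhfgbm(hs, p, q):
    n = len(hs)
    exp_prod = 1.0 / (n * (n - 1))
    result = []
    for gammas in product(*hs):                      # one interval from each IVHFE
        bounds = []
        for k in (0, 1):                             # lower bound, then upper bound
            prod_term = 1.0
            for i in range(n):
                for j in range(n):
                    if i != j:
                        term = 1 - (1 - gammas[i][k])**p * (1 - gammas[j][k])**q
                        prod_term *= term**exp_prod
            bounds.append(1 - (1 - prod_term)**(1 / (p + q)))
        result.append(bounds)
    return result

h1 = [[0.49, 0.63], [0.58, 0.78], [0.37, 0.66], [0.68, 0.87]]
h2 = [[0.69, 0.81]]
h3 = [[0.57, 0.69], [0.63, 0.77]]
agg = ivhfgbm([h1, h2, h3], p=1, q=10)
print(len(agg), sum((l + u) / 2 for l, u in agg) / len(agg))   # 8 elements and their score
```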

Example 2. Suppose there are three IVHFEs, $h_1$ = {[0.49, 0.63], [0.58, 0.78], [0.37, 0.66], [0.68, 0.87]}, $h_2$ = {[0.69, 0.81]} and $h_3$ = {[0.57, 0.69], [0.63, 0.77]}. Based
on the IVHFGBM operator, the aggregated IVHFE for $h_1$, $h_2$ and $h_3$ can be obtained.
Since there are two parameters p and q in the IVHFGBM, the values of p and q may
change the aggregated results to a certain extent. For example:
(1) When p = 1, q = 10, then

IVHFGBM($h_1$, $h_2$, $h_3$) = {[0.5516, 0.6758], [0.5627, 0.6850], [0.5902, 0.7153], [0.6232, 0.7801], [0.4622, 0.6925], [0.4640, 0.7098], [0.6020, 0.7190], [0.6506, 0.7868]}

and the score of the aggregated IVHFE is 0.6419.
(2) When p = 3, q = 7, then

IVHFGBM($h_1$, $h_2$, $h_3$) = {[0.5534, 0.6777], [0.5655, 0.6889], [0.5958, 0.7277], [0.62326, 0.7823], [0.4678, 0.6949], [0.4711, 0.7126], [0.6163, 0.7337], [0.6568, 0.7945]}

and the score of the aggregated IVHFE is 0.6476.
As can be seen from Definition 4, the IVHFGBM is symmetric with respect to the
parameters p and q, the same as the IVHFBM operator. In order to illustrate this
phenomenon, Figure 1 is provided below.

Figure 1. Scores for IVHFEs obtained by the IVHFGBM operator (p∊ (0, 10), q∊ (0, 10))

Figure 2 shows the changing trend of the scores for the aggregated IVHFEs based
on the IVHFGBM operator when one of the two parameters is fixed.

Figure 2. Scores trends p=0.01, 1 and 10 (q∊ (0, 10))

Definition 5. Let $h_j = \bigcup_{\gamma_j \in h_j} \{[\gamma_j^{L}, \gamma_j^{U}]\}$ $(j = 1, 2, \ldots, n)$ be a group of IVHFEs, and let
$w = (w_1, w_2, \ldots, w_n)^{T}$ be the weight vector of $h_j$ $(j = 1, 2, \ldots, n)$, satisfying $w_i > 0$
$(i = 1, 2, \ldots, n)$ and $\sum_{i=1}^{n} w_i = 1$. If

$$\mathrm{IVHFWGBM}(h_1, h_2, \ldots, h_n) = \frac{1}{p+q} \left( \bigotimes_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( p h_i^{w_i} \oplus q h_j^{w_j} \right)^{\frac{1}{n(n-1)}} \right) \qquad (5)$$

then IVHFWGBM is called the interval-valued hesitant fuzzy weighted geometric BM operator.
Theorem 2. Let $h_j = \bigcup_{\gamma_j \in h_j} \{[\gamma_j^{L}, \gamma_j^{U}]\}$ $(j = 1, 2, \ldots, n)$ be a group of IVHFEs, and let
$w = (w_1, w_2, \ldots, w_n)^{T}$ be the weight vector of $h_j$ $(j = 1, 2, \ldots, n)$, satisfying $w_i > 0$
$(i = 1, 2, \ldots, n)$ and $\sum_{i=1}^{n} w_i = 1$. Then the IVHFWBM and IVHFWGBM operators can be
transformed as follows:

$$\mathrm{IVHFWBM}(h_1, h_2, \ldots, h_n) = \bigcup_{\gamma_i \in h_i, \gamma_j \in h_j} \left\{ \left[ \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - (1 - \gamma_i^{L})^{w_i})^{p} (1 - (1 - \gamma_j^{L})^{w_j})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}}, \;\; \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - (1 - \gamma_i^{U})^{w_i})^{p} (1 - (1 - \gamma_j^{U})^{w_j})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}} \right] \right\} \qquad (6)$$

$$\mathrm{IVHFWGBM}(h_1, h_2, \ldots, h_n) = \bigcup_{\gamma_i \in h_i, \gamma_j \in h_j} \left\{ \left[ 1 - \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - (\gamma_i^{L})^{w_i})^{p} (1 - (\gamma_j^{L})^{w_j})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}}, \;\; 1 - \left( 1 - \prod_{\substack{i, j = 1 \\ i \neq j}}^{n} \left( 1 - (1 - (\gamma_i^{U})^{w_i})^{p} (1 - (\gamma_j^{U})^{w_j})^{q} \right)^{\frac{1}{n(n-1)}} \right)^{\frac{1}{p+q}} \right] \right\} \qquad (7)$$
Example 3. Suppose there are three IVHFEs, $h_1$ = {[0.31, 0.45], [0.46, 0.71]}, $h_2$ = {[0.34, 0.47]} and $h_3$ = {[0.23, 0.35], [0.46, 0.58], [0.65, 0.73]}, and the weight vector of the
three IVHFEs is $(0.3, 0.4, 0.3)^{T}$. Based on the IVHFWGBM operator, the aggregated
IVHFE can be obtained when the values of p and q are assigned specific numbers.
For example, when p = 0.1, q = 10, then

IVHFWGBM($h_1$, $h_2$, $h_3$) = {[0.6512, 0.7377], [0.6829, 0.7648], [0.6831, 0.7650], [0.6528, 0.7390], [0.6861, 0.7672], [0.6864, 0.7673]}

and the score of the aggregated IVHFE is 0.7153.
Example 4. Suppose there are four IVHFEs, $h_1$ = {[0.2, 0.4], [0.2, 0.7]}, $h_2$ = {[0.5, 0.6], [0.3, 0.6]}, $h_3$ = {[0.3, 0.5]} and $h_4$ = {[0.5, 0.6], [0.3, 0.6]}, and the weight vector of the
four IVHFEs is $(0.2, 0.3, 0.3, 0.2)^{T}$. Based on the IVHFWGBM
operator, the aggregated IVHFE can be obtained, for example:
(1) when p = 0.001, q = 10, the score is 0.7766;
(2) when p = q = 5, the score is 0.7848;
(3) when p = 10, q = 0.001, the score is 0.7765.
When the parameters p and q change from 0 to 10 simultaneously, the scores are
shown in Figure 3 in detail.

Figure 3. Scores obtained by the IVHFWGBM operator



Example 5. Suppose there are four IVHFEs, $h_1$ = {[0.5, 0.8], [0.5, 0.6], [0.4, 0.7]},
$h_2$ = {[0.3, 0.4], [0.6, 0.7]}, $h_3$ = {[0.4, 0.6]} and $h_4$ = {[0.3, 0.5], [0.4, 0.4]}, and the weight vector of
the four IVHFEs is $(0.2, 0.3, 0.3, 0.2)^{T}$. Based on the IVHFWGBM
operator, the scores are shown in Figure 4 when the parameters p and q change from 0
to 10 simultaneously.

Figure 4. Scores obtained by the IVHFWGBM operator

3. Conclusions

In this paper, we have extended the traditional BM and proposed the IVHFGBM and
IVHFWGBM operators to aggregate IVHFEs. Some numerical examples for these
operators are also presented to show their practicality and effectiveness. In future
research, we intend to consider the extensions of some other BMs and study their
relationships, and to pay attention to the application of the proposed operators to real
application areas such as sustainable development evaluation, science and technology
project review, group decision making [14-16] and so on.

References

[1] J. J. Peng, J. Q. Wang, J. Wang, et al. Simplified neutrosophic sets and their applications in multi-criteria
group decision-making problems. International Journal of Systems Science, 47(2016), 2342-2358.
[2] D. Yu, D. F. Li and J. M. Merigó, Dual hesitant fuzzy group decision making method and its application
to supplier selection. International Journal of Machine Learning and Cybernetics, In press. DOI:
10.1007/s13042-015-0400-3
[3] H. Zhao, Z. Xu and S. Liu, Dual hesitant fuzzy information aggregation with Einstein t-conorm and t-
norm. Journal of Systems Science and Systems Engineering, In press.DOI:10.1007/s11518-015-5289-6.
[4] X. F. Wang, J. Q. Wang and W. E. Yang. Group decision making approach based on interval-valued
intuitionistic linguistic geometric aggregation operators. International Journal of Intelligent
Information and Database Systems, 7(2013), 516-534.
[5] M. Xia, Z. Xu and N. Chen. Induced aggregation under confidence levels. International Journal of
Uncertainty, Fuzziness and Knowledge-Based Systems, 19(2011), 201-227.

[6] G. Wei. Interval valued hesitant fuzzy uncertain linguistic aggregation operators in multiple attribute
decision making. International Journal of Machine Learning and Cybernetics, In press. DOI:
10.1007/s13042-015-0433-7
[7] H. Liu, Z. Xu and H. Liao. The multiplicative consistency index of hesitant fuzzy preference relation.
IEEE Transactions on Fuzzy Systems, 24(2016), 82-93.
[8] C. Bonferroni, Sulle medie multiple di potenze, Bolletino Matematica Italiana, 5 (1950), 267-270.
[9] M. M. Xia, Z. S. Xu, and B. Zhu. Geometric Bonferroni means with their application in multi-criteria
decision making. Knowledge-Based Systems, 40 (2013), 88-100.
[10] W. Zhou and J. M. He. Intuitionistic fuzzy geometric Bonferroni means and their application in multi-
criteria decision making. International Journal of Intelligent Systems, 27(2012), 995-1019.
[11] N. Chen, Z. S. Xu, and M. M. Xia. Interval-valued hesitant preference relations and their applications
to group decision making. Knowledge-Based Systems, 37(2013), 528-540.
[12] R. M. Rodríguez, B. Bedregal, H. Bustince, et al. A position and perspective analysis of hesitant fuzzy
sets on information fusion in decision making. Towards high quality progress. Information Fusion,
29(2016), 89-97.
[13] R. Pérez-Fernández, P. Alonso, H. Bustince, et al. Applications of finite interval-valued hesitant fuzzy
preference relations in group decision making. Information Sciences, 326(2016), 89-101.
[14] D. Yu. Group decision making under interval-valued multiplicative intuitionistic fuzzy environment
based on Archimedean t-conorm and t-norm. International Journal of Intelligent Systems, 30(2015),
590-616.
[15] D. Yu, W. Zhang and G. Huang. Dual hesitant fuzzy aggregation operators. Technological and
Economic Development of Economy, 22(2016), 194-209.
[16] W. Zhou and Z. S. Xu. Generalized asymmetric linguistic term set and its application to qualitative
decision making involving risk appetites. European Journal of Operational Research, 254(2016), 610-
621.
Fuzzy Systems and Data Mining II 45
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-45

A New Integrating SAW-TOPSIS Based on Interval Type-2 Fuzzy Sets for Decision Making
Lazim ABDULLAH1 and CW Rabiatul Adawiyah CW KAMAL
School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu,
Malaysia

Abstract. Most of the integrated methods of multi-attribute decision making
(MADM) use type-1 fuzzy sets to represent uncertainties. Recent theory has
suggested that interval type-2 fuzzy sets (IT2 FS) could be used to enhance the
representation of uncertainties in decision making problems. Differently from the
typical integrated MADM methods which directly use type-1 fuzzy sets, this
paper proposes an integrated simple additive weighting - technique for order
preference similar to ideal solution (SAW-TOPSIS) based on IT2 FS to enhance
judgment. The SAW with IT2 FS is used to determine the weight for each
criterion, while the TOPSIS method with IT2 FS is used to obtain the final ranking of
the alternatives. A numerical example is used to illustrate the proposed method. The
numerical results show that the proposed integrated method is feasible for solving
MADM problems under complicated fuzzy environments. In essence, the
integrating SAW-TOPSIS is equipped with IT2 FS, in contrast to type-1 fuzzy sets,
for solving MADM problems. The proposed method would have great impact
and significance for practical implementation. Finally, this paper provides
some recommendations for future research directions.

Keywords. Interval type-2 fuzzy set, Simple additive weighting, Multi-criteria decision making, TOPSIS, preference order

Introduction

Decision making based on multi-criteria evaluation has been used with great success
for many applications. Most of these applications are characterized by high levels of
uncertainties and vague information. Fuzzy set theory has provided a useful way to
deal with vagueness and uncertainties in solving multi-criteria decision making
(MCDM) problems. During the last two decades, MCDM methods integrated with
fuzzy sets have been one of the fastest growing research areas. Abdullah [1] presents a
brief review of the categories in the integration of fuzzy sets and MCDM. In general,
MCDM can be categorized into multi-attribute decision making (MADM) and multi-
objective decision making (MODM). Naturally, an MADM problem involves multiple
attributes. The attributes of MADM represent the different dimensions from which the
alternatives can be viewed by decision makers. There are many fuzzy MADM methods
that have been discussed in the literature, and the fuzzy technique for order preference
similar to ideal solution (FTOPSIS) is one of them. The preference or
decision derived from FTOPSIS is made by observing the degree of closeness to the ideal
solution. In addition to this method, fuzzy simple additive weighting (FSAW) is another type
of fuzzy MADM method. It is an extension of the SAW method, employing
trapezoidal fuzzy numbers to represent imprecision in judgements.

1 Corresponding Author: Lazim ABDULLAH, School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu; E-mail: lazim_m@umt.edu.my.
Lately, the integration of MADM method has received considerable attention in
literature. Integrated method is simply defined as two or more methods that are
concurrently employed to solve decision making problems. For example, the TOPSIS
is integrated with fuzzy analytic hierarchy process (FAHP) model to propose a new
integrated model for selecting plastic recycling method [2]. Rezaie et al., [3] present an
integrating model based on FAHP and VIKOR method for evaluating cement firms.
Wang et al., [4] develop an integrating OWA–fuzzy TOPSIS to tackle fuzzy MADM
problems. Kharat et al., [5] applied an integrated fuzzy AHP–TOPSIS to municipal
solid waste landfill site selection problem. Pamučar and Ćirović [6] applied the new
integrated fuzzy DEMATEL–MABAC method in making investment decisions.
Tavana et al., [7] proposed an integrated fuzzy ANP-COPRAS-Grey method to
determine the selection of social media platform.
Most of these integrating methods employed type-1 fuzzy sets to represent
uncertainties in decision making. However, the type-1 fuzzy sets have some extent of
limitation in dealing with uncertainties. Recent theories suggest that interval type-2
fuzzy sets (IT2 FSs) are more flexible than the interval type-1 fuzzy sets in representing
uncertainties. Therefore, in contrast to these methods, this paper introduces linguistic
terms based on IT2 FS for proposing a new integrating MADM method. The IT2 FS is
incorporated within the framework of FSAW and FTOPSIS to develop a new
integrating fuzzy MADM method. Specifically, Interval Type-2 Fuzzy Simple
Additive Weighting (IT2 FSAW) method is integrated with Interval Type-2 Technique
for Order Preference Similar to Ideal Solution (IT2 FTOPSIS) method for solving
MADM problems. In the proposed method, the judgements made by decision makers
over the relative importance of alternatives are determined using IT2 FSAW procedure
and the final preference is obtained using IT2 FTOPSIS. The ranking method of IT2
FTOPSIS approach preserves the characteristics of fuzzy numbers where the linguistic
terms can easily be converted to fuzzy numbers.

1. Proposed Method

This paper integrates the IT2 FSAW with IT2 FTOPSIS to establish a new MADM
method. In this proposed method, the IT2 FSAW is used to find weights of the criteria,
whereas IT2 FTOPSIS is used to establish preference of alternatives. The definitions
of IT2 FS [8], upper and lower memberships of IT2 FS [9], and ranking values of the
trapezoidal IT2 FS [10] are used in the proposed method. The detailed procedure of the
proposed method is described as follows.
Step 1: Construct the decision matrix $Y_p$ of the p-th decision maker and construct the average decision matrix $\bar{Y}$, respectively:

$$
Y_p = \left(f_{ij}^{\,p}\right)_{m\times n} =
\begin{pmatrix}
f_{11}^{\,p} & f_{12}^{\,p} & \cdots & f_{1n}^{\,p} \\
f_{21}^{\,p} & f_{22}^{\,p} & \cdots & f_{2n}^{\,p} \\
\vdots & \vdots & \ddots & \vdots \\
f_{m1}^{\,p} & f_{m2}^{\,p} & \cdots & f_{mn}^{\,p}
\end{pmatrix},
\qquad
\bar{Y} = \left(\bar{f}_{ij}\right)_{m\times n},
\tag{1}
$$

where $\bar{f}_{ij} = \dfrac{f_{ij}^{\,1}\oplus f_{ij}^{\,2}\oplus\cdots\oplus f_{ij}^{\,k}}{k}$, the rows $f_1, f_2, \ldots, f_m$ represent the criteria and the columns $x_1, x_2, \ldots, x_n$ represent the alternatives.

Step 2: Construct the aggregated fuzzy weight $\bar{W}$ from the weighting matrix $W_p$ of the attributes provided by the p-th decision maker.

Let $w_i^{\,p} = (a_i, b_i, c_i, d_i)$, $i = 1, 2, \ldots, m$, be the linguistic weight given to the subjective criteria $C_1, C_2, \ldots, C_h$ and the objective criteria $C_{h+1}, C_{h+2}, \ldots, C_n$ by decision maker $D_t$:

$$
W_p = \left(w_i^{\,p}\right)_{1\times m} = \left[\, w_1^{\,p} \;\; w_2^{\,p} \;\; \cdots \;\; w_m^{\,p} \,\right], \tag{2}
$$
$$
\bar{W} = \left(\bar{w}_i\right)_{1\times m}, \tag{3}
$$

where $\bar{w}_i = \dfrac{w_i^{1}\oplus w_i^{2}\oplus\cdots\oplus w_i^{k}}{k}$, and $\bar{w}_i$ is an interval type-2 fuzzy set.

To defuzzify the fuzzy attribute weights, the signed distance is employed [11]. The defuzzification of $\bar{W}$ is represented as:

$$
d(\bar{W}_j) = \frac{1}{4}\left(w_j^{1} + w_j^{2} + w_j^{3} + w_j^{4}\right), \quad j = 1, 2, \ldots, n. \tag{4}
$$

The crisp value for criterion $j$ is given by:

$$
W_j = \frac{d(\bar{W}_j)}{\sum_{j=1}^{n} d(\bar{W}_j)}, \quad j = 1, 2, \ldots, n, \tag{5}
$$

where $\sum_{j=1}^{n} W_j = 1$. Therefore, the weight vector $W = [W_1, W_2, \ldots, W_n]$ is constructed.
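As a minimal sketch of the weight defuzzification in Eqs. (4)-(5), assuming each aggregated weight is summarized by four reference points $(w_j^1, w_j^2, w_j^3, w_j^4)$ of its trapezoidal membership, the crisp normalized weights can be computed as follows (Python; the example weight values are hypothetical, for illustration only):

```python
def signed_distance(w):
    """Signed-distance defuzzification of a trapezoidal weight w = (w1, w2, w3, w4), Eq. (4)."""
    return sum(w) / 4.0

def crisp_weights(fuzzy_weights):
    """Normalized crisp criterion weights W_j, Eq. (5); the results sum to one."""
    d = [signed_distance(w) for w in fuzzy_weights]
    total = sum(d)
    return [dj / total for dj in d]

# hypothetical trapezoidal weights for three criteria (not taken from the paper)
W = crisp_weights([(0.7, 0.9, 0.9, 1.0), (0.5, 0.7, 0.7, 0.9), (0.3, 0.5, 0.5, 0.7)])
print(W)  # approximately [0.42, 0.34, 0.24]
```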

Step 3: Create the weighted decision matrix $Y_w$:

$$
Y_w = \left(\bar{v}_{ij}\right)_{m\times n} =
\begin{pmatrix}
\bar{v}_{11} & \bar{v}_{12} & \cdots & \bar{v}_{1n} \\
\bar{v}_{21} & \bar{v}_{22} & \cdots & \bar{v}_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\bar{v}_{m1} & \bar{v}_{m2} & \cdots & \bar{v}_{mn}
\end{pmatrix}, \tag{6}
$$

where $\bar{v}_{ij} = W \otimes \bar{f}_{ij}$, $1 \le i \le m$, and $1 \le j \le n$.

Step 4: Calculate the ranking value $\mathrm{Rank}(\bar{v}_{ij})$ of the IT2 FS $\bar{v}_{ij}$ using Eq. (7), and create the ranking weighted decision matrix $Y_w^{*}$:

$$
Y_w^{*} = \left(\mathrm{Rank}(\bar{v}_{ij})\right)_{m\times n}. \tag{7}
$$

Step 5: Calculate the positive-ideal solution $x^{+} = \left(v_1^{+}, v_2^{+}, \ldots, v_m^{+}\right)$ and the negative-ideal solution $x^{-} = \left(v_1^{-}, v_2^{-}, \ldots, v_m^{-}\right)$, where

$$
v_i^{+} =
\begin{cases}
\max\limits_{1\le j\le n}\left\{\mathrm{Rank}(\bar{v}_{ij})\right\}, & \text{if } f_i \in F_1, \\[4pt]
\min\limits_{1\le j\le n}\left\{\mathrm{Rank}(\bar{v}_{ij})\right\}, & \text{if } f_i \in F_2,
\end{cases} \tag{8}
$$

and

$$
v_i^{-} =
\begin{cases}
\min\limits_{1\le j\le n}\left\{\mathrm{Rank}(\bar{v}_{ij})\right\}, & \text{if } f_i \in F_1, \\[4pt]
\max\limits_{1\le j\le n}\left\{\mathrm{Rank}(\bar{v}_{ij})\right\}, & \text{if } f_i \in F_2,
\end{cases} \tag{9}
$$

where $F_1$ denotes the set of benefit attributes and $F_2$ denotes the set of cost attributes.

Step 6: Find the distance $d^{+}(x_j)$ between each alternative $x_j$ and the positive-ideal solution $x^{+}$, using Eq. (10):

$$
d^{+}(x_j) = \sqrt{\sum_{i=1}^{m}\left(\mathrm{Rank}(\bar{v}_{ij}) - v_i^{+}\right)^{2}}, \tag{10}
$$

where $1 \le j \le n$. Similarly, find the distance $d^{-}(x_j)$ between each alternative $x_j$ and the negative-ideal solution $x^{-}$, using the following equation:

$$
d^{-}(x_j) = \sqrt{\sum_{i=1}^{m}\left(\mathrm{Rank}(\bar{v}_{ij}) - v_i^{-}\right)^{2}}. \tag{11}
$$

Step 7: Calculate the degree of closeness $C(x_j)$ of $x_j$ with respect to the positive-ideal solution $x^{+}$, using the following equation:

$$
C(x_j) = \frac{d^{-}(x_j)}{d^{+}(x_j) + d^{-}(x_j)}, \tag{12}
$$

where $1 \le j \le n$.

Step 8: Arrange the values of $C(x_j)$ in descending order; a larger value of $C(x_j)$ indicates a higher preference for the alternative $x_j$.
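As a compact sketch of Steps 5-8, assuming the ranking values $\mathrm{Rank}(\bar{v}_{ij})$ have already been obtained as a crisp matrix (criteria in rows, alternatives in columns), the ideal solutions, distances and closeness degrees can be computed as follows (Python; the numbers shown are hypothetical):

```python
import numpy as np

def topsis_preference(R, benefit):
    """Steps 5-8 on a crisp ranking matrix R (criteria x alternatives).
    benefit[i] is True for benefit criteria (F1) and False for cost criteria (F2)."""
    R = np.asarray(R, dtype=float)
    benefit = np.asarray(benefit)
    v_pos = np.where(benefit, R.max(axis=1), R.min(axis=1))   # Eq. (8)
    v_neg = np.where(benefit, R.min(axis=1), R.max(axis=1))   # Eq. (9)
    d_pos = np.sqrt(((R - v_pos[:, None]) ** 2).sum(axis=0))  # Eq. (10)
    d_neg = np.sqrt(((R - v_neg[:, None]) ** 2).sum(axis=0))  # Eq. (11)
    closeness = d_neg / (d_pos + d_neg)                       # Eq. (12)
    return closeness, np.argsort(-closeness)                  # larger closeness = higher preference

# hypothetical ranking values for 3 criteria and 3 alternatives (not the paper's data)
C, order = topsis_preference([[0.6, 0.4, 0.7],
                              [0.5, 0.8, 0.6],
                              [0.3, 0.2, 0.4]],
                             benefit=[True, True, False])
print(C, order)
```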

2. Numerical Example

For the purpose of illustration and to show the feasibility of the proposed method, an
example is presented. This example is retrieved from Chou et al. [5].
Researchers intend to identify the facility location alternatives to build a new plant.
The team has identified three alternatives which are alternative 1 ( A1 ) , alternative 2
( A2 ) , and alternative 3 ( A3 ) . To determine the best alternative site, a committee of
four decision makers is created; decision maker 1 ( D1 ) , decision maker 2 ( D2 ) ,
decision maker 3 ( D3 ) and decision maker 4 ( D4 ) . Three selection criteria are
deliberated: transportation availability (C1 ) , availability of skilled workers (C2 ) and
climatic conditions (C3 ) . Table 1 shows the linguistic terms used to rate criteria with
respect to alternatives and also the weights for criteria.
Table 1. Linguistic terms and IT2 FS
Linguistic Terms Interval Type-2 Fuzzy Sets
Very Poor (VP) ((0,0,0,0.1;1,1),(0,0,0,0.05;0.9,0.9))
Poor (P) ((0.0,0.1,0.1,0.3;1,1),(0.05,0.1,0.1,0.2;0.9,0.9))
Medium Poor (MP) ((0.1,0.3,0.3,0.5;1,1),(0.2,0.3,0.3,0.4;0.9,0.9))
Fair (F) ((0.3,0.5,0.5,0.7;1,1),(0.4,0.5,0.5,0.6;0.9,0.9))
Medium Good (MG) ((0.5,0.7,0.7,0.9;1,1),(0.6,0.7,0.7,0.8;0.9,0.9))
Good (G) ((0.7,0.9,0.9,1;1,1),(0.8,0.9,0.9,0.95;0.9,0.9))
Very Good (VG) ((0.9,1,1,1;1,1),(0.95,1,1,1;0.9,0.9))

Based on the ratings given by decision makers , the example is solved using the
proposed method. The final degree of closeness and preference are shown in Table 2.
Table 2. Degree of closeness and preference
Degree of closeness Preference order
C ( A1 ) 0.4112 3
C ( A2 ) 0.4605 2
C( A3 ) 0.4778 1

It can be seen that the preference order of the alternatives is $A_3 \succ A_2 \succ A_1$. The proposed method therefore identifies $A_3$ as the best alternative. This preference is slightly inconsistent with the result obtained using the FSAW, where the preference order is $A_2 \succ A_3 \succ A_1$.

3. Conclusions

This paper proposed a novel method, which integrates IT2 FSAW and IT2 FTOPSIS to solve MADM problems. Decision makers used interval type-2 linguistic variables to assess the importance of the criteria. The ranking weighted decision matrix obtained from IT2 FSAW was then used as an input to IT2 FTOPSIS, where the ideal solutions could be computed. Finally, the preference of the alternatives was obtained as a result of implementing the integrated method. To illustrate the feasibility of the proposed method, a numerical example that was formerly solved using the FSAW method was considered. The results showed that $A_3$ is the most preferred alternative. A detailed comparative analysis between the results obtained using the integrated method and other decision making methods is left for future research. Future research may also include a sensitivity analysis in which the uncertainty of the final preference of the integrated model can be investigated.

Acknowledgments

This work is part of the research grant project FRGS 59389. We acknowledge the financial support provided by the Malaysian Ministry of Education and Universiti Malaysia Terengganu.

References

[1] L. Abdullah, Fuzzy Multi Criteria Decision Making and its Application: A Brief Review of Category.
Procedia-Social and Behavioral Sciences, 97 (2013), 131-136.
[2] S. Vinodh, M. Prasanna, N. Hari Prakash, Integrated fuzzy AHP-TOPSIS for selecting the best plastic
recycling method: A case study. Applied Mathematical Modelling, 39 (2014),4662-4672.
[3] K. Rezaie, S. S. Ramiyani, S Nazari-Shirkouhi, A. Badizadeh, Evaluating performance of Iranian
cement firms using an integrated fuzzy AHP-VIKOR method. Applied Mathematical Modelling, 38
(2014), 5033-5046.
[4] T. Wang, J. Liu, J. Li, C. Niu, An integrating OWA–TOPSIS framework in intuitionistic fuzzy settings
for multiple attribute decision making, Computers & Industrial Engineering, 98(2016), 185-194.
[5] M. G. Kharat, S. J. Kamble, R. D. Raut, S. S. Kamble, S. M. Dhume, Modeling landfill site selection using an integrated fuzzy MCDM approach. Earth Systems and Environment, 2(2016), 53.
[6] D. Pamučar, G. Ćirović, The selection of transport and handling resources in logistics centers using
Multi-Attributive Border Approximation area Comparison (MABAC), Expert Systems with
Applications, 42(2015), 3016-3028.
[7] M. Tavana, E. Momeni, N. Rezaeiniya, S. M. Mirhedayatian, H. Rezaeiniya, A novel hybrid social media
platform selection model using fuzzy ANP and COPRAS-G, Expert Systems with Applications,
40(2013), 5694-5702.
[8] Y. C. Chang, S. M. Chen, A new fuzzy interpolative reasoning method based on interval type-2 fuzzy sets. IEEE International Conference on Systems, Man and Cybernetics, (2008), 82-87.
[9] J. M. Mendel, R. I., John, F. Liu, Interval Type-2 Fuzzy Logic Systems Made Simple. IEEE
Transactions of Fuzzy Systems, 14 (2006), 808-821.
[10] L. Lee, S. Chen, Fuzzy Multiple Attributes Group Decision-Making Based on the Extension of TOPSIS Method and Interval Type-2 Fuzzy Sets. Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, (2008), 3260-3265.
[11] J. S.Yao, K. Wu, Ranking fuzzy number based on decomposition principle and signed distance. Fuzzy
Sets and Systems, 116(2000), 275-288.
Fuzzy Systems and Data Mining II 51
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-51

Algorithms for Finding Oscillation Period of Fuzzy Tensors
Ling CHEN a,b,1 and Lin-Zhang LU a,c
a School of Mathematical Sciences, Guizhou Normal University, GuiYang, Guizhou, P.R. China 550001
b School of Science, Shandong Jianzhu University, JiNan, Shandong, P.R. China 250101
c School of Mathematical Sciences, Xiamen University, Xiamen, Fujian, P.R. China 361005

Abstract. In this paper, we focus on describing the oscillation period and index of a fuzzy tensor. The definition of the induced third-order fuzzy tensor is proposed. Using this notion, firstly, the oscillation period and index of a fuzzy tensor are obtained on the basis of the Power Method with the max-min operation. Secondly, we rely on the Minimal Strong Component to find the oscillation period of a fuzzy tensor. This is a more practical, graph-theoretic method when the number of nonzero elements is less than half of the total number of fuzzy tensor elements. Furthermore, numerical results demonstrate that the two algorithms, the Power Method and the Minimal Strong Component method, are effective and promising for solving the period and index of a fuzzy tensor.

Keywords. Fuzzy tensors, oscillation period, minimal strong component

Introduction

In fuzzy mathematics, the study of fuzzy matrices is complex but important, since it has a wide range of applications, especially in fuzzy control and fuzzy decision making. The object of fuzzy control is the fuzzy system; an important aspect of a fuzzy control system is whether it can reach a stable state in finite time, and its stability can be studied using the periodicity of fuzzy matrices. In order to study multi-objective fuzzy decision making and dynamic multiple-objective fuzzy control, it is necessary to investigate higher-order forms of the fuzzy matrix.
Periodicity is one of the most important characteristics of fuzzy matrices. Thomason [1] first studied the powers of a fuzzy matrix with a convergence period or oscillation period. Fan and Liu [2] concluded that the period of a fuzzy matrix is equal to the least common multiple of the periods of its cut matrices. Li [3] discussed the periodicity of fuzzy matrices in the general case. Liu and Ji [4] described the periodicity of square fuzzy matrices; furthermore, they refined the conclusion of [3] on the upper bound of the convergence index of fuzzy matrix powers and obtained the greatest possible periodicity value of any square fuzzy matrix, thereby solving the problem of estimating the period of a general fuzzy matrix.
1 Corresponding Author: Ling CHEN, School of Mathematical Sciences, Guizhou Normal University, GuiYang, Guizhou, P.R. China; E-mail: chenling_100@163.com.

As shown in the literature [5,6,7], many practical problems can nowadays be modeled as tensor problems. For example, a vector is a first-order tensor, a matrix is a second-order tensor, and a tensor of order three or higher is called a higher-order tensor. The authors of [8] explain a fast Hankel tensor-vector product and its application to exponential data fitting. The authors of [9] considered infinite- and finite-dimensional Hilbert tensors and studied their periodicity. Therefore, generalizing the tensor to the fuzzy tensor is practical and meaningful.
In this paper, we deal with the oscillation period and index of a fuzzy tensor under the max-min operation. We find the oscillation period and index of a fuzzy tensor by the Power Method and by the Minimal Strong Component in Section 1 and Section 2, respectively. Our numerical examples show the feasibility of the two proposed algorithms. Finally, Section 3 gives the conclusions.

1. A power method for finding the oscillation period and index of a fuzzy tensor

In this section, we first describe some concepts and results about fuzzy matrices from the literature [1,2,3,4,10,11,12,13], which will be used throughout the section. We then give the definition of the fuzzy tensor and analyze the periodicity and index of a fuzzy tensor.

Let $A = (a_{ij})$ and $B = (b_{ij})$ be $n \times n$ fuzzy matrices. We have the following product definition: $A \times B = C = (c_{ij}) = \left(\bigvee_{k=1}^{n}(a_{ik} \wedge b_{kj})\right)$, where $a_{ij} \wedge b_{ij} = \min\{a_{ij}, b_{ij}\}$, $a_{ij} \vee b_{ij} = \max\{a_{ij}, b_{ij}\}$, and $A^{k+1} = A^{k} \times A$, $k = 1, 2, \cdots$. We write $A = B$ if $a_{ij} = b_{ij}$ for all $i, j \in \{1, 2, \cdots, n\}$.

Consider a finite number of fuzzy matrices $A_1, A_2, \cdots, A_n$ with each $A_i \in F^{n\times n}$, where $F^{n\times n}$ denotes the set of all $n \times n$ fuzzy matrices. We write $F = \{A_1, A_2, \cdots, A_n\}$.

Let $Z^{+} = \{x \mid x \text{ is a positive integer}\}$ and let $[n]$ be the least common multiple of $1, 2, \cdots, n$.

Referring to the relevant literature [1,2,3,4,11], for convenience in application, we propose an equivalent definition of the period of oscillation and the index of a fuzzy matrix.

Definition 1. Let $A$ be an $n \times n$ fuzzy matrix. If there exist $s, t \in Z^{+}$ such that $A^{s+t} = A^{s}$, then we call $d = \min\{t \mid A^{s+t} = A^{s}\}$ the period of oscillation of $A$, and $k = \min\{s \mid A^{s+d} = A^{s}\}$ the index of $A$.

Remark 1. The possible range of the period of a fuzzy matrix is from 1 to $[n]$, that is, $1 \le d \le [n]$ and $d \mid [n]$. If $d = 1$, we say $A$ is convergent.
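As a minimal sketch of Definition 1, the following Python function computes max-min powers of a fuzzy matrix until a repetition appears and returns the oscillation period and index (the small example matrix is hypothetical):

```python
import numpy as np

def maxmin_product(A, B):
    """Max-min product of fuzzy matrices: c_ij = max_k min(a_ik, b_kj)."""
    return np.max(np.minimum(A[:, :, None], B[None, :, :]), axis=1)

def matrix_period_index(A, max_iter=500):
    """Period d and index k of a fuzzy matrix under max-min powers (Definition 1)."""
    powers = [np.asarray(A, dtype=float)]          # powers[s] holds A^(s+1)
    for _ in range(max_iter):
        nxt = maxmin_product(powers[-1], powers[0])
        for s, P in enumerate(powers):
            if np.array_equal(P, nxt):             # first repetition marks the cycle entry point
                return len(powers) - s, s + 1      # (period d, index k)
        powers.append(nxt)
    raise RuntimeError("no repetition found; increase max_iter")

# hypothetical 3x3 fuzzy matrix (illustration only)
A = np.array([[0.0, 0.5, 0.0],
              [0.0, 0.0, 0.5],
              [0.5, 0.0, 0.0]])
print(matrix_period_index(A))   # a 3-cycle of equal weights gives period 3, index 1
```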

Similar to the definition of a tensor, and in view of the characteristics of the fuzzy matrix, we present the definition of the fuzzy tensor as follows.

Definition 2. An order $m$ dimension $n$ fuzzy tensor $A = (a_{i_1\cdots i_m})$ consists of $n^{m}$ entries $0 \le a_{i_1\cdots i_m} \le 1$, where $i_j = 1, \cdots, n$ for $j = 1, \cdots, m$.

For our purposes, throughout this paper we always assume that the indices $i_1, \cdots, i_m$ have the same dimension.
From the above definition of fuzzy tensor, clearly, a fuzzy tensor is higher order
generalization of a fuzzy matrix, and is also a tensor extension of characteristic function.
Next, we discuss the third-order clustering of a fuzzy tensor by using the slice-by-slice method. For a fuzzy tensor, we obtain two-dimensional sections by fixing all indices except for two indices. Each slice is a fuzzy matrix. Fixing all but three indices, we will define the induced third-order fuzzy tensor.

Figure 1. Slices of a third-order fuzzy tensor: (a) horizontal slices, (b) lateral slices, (c) frontal slices.

Definition 3. Let $A = (a_{i_1\cdots i_m})$ be an order $m$ dimension $n$ fuzzy tensor. Multiple third-order fuzzy tensors $A_{i_j i_k i_h}$ of $A$ are constructed by fixing all but three indices. We call $A_{i_j i_k i_h}$ the induced third-order fuzzy tensor of $A$, where $i_j, i_k, i_h \in \{i_1, \cdots, i_m\}$.

By the third-order clustering theory, the study of the period and index of a higher-order fuzzy tensor is converted into the study of third-order fuzzy tensors. A third-order fuzzy tensor has horizontal, lateral and frontal slices, and each direction contains a set of fuzzy matrices. From an order $m$ dimension $n$ fuzzy tensor we obtain $C_m^3 n^{m-3}$ induced third-order fuzzy tensors and $3C_m^3 n^{m-3}$ sets of fuzzy matrix sequences. Figure 1 shows the horizontal, lateral and frontal slices of the third-order fuzzy tensor $A_{i_j i_k i_h}$, denoted by $A_{i_j::}$, $A_{:i_k:}$ and $A_{::i_h}$, respectively.
On the whole, it is far more intuitive and simpler to investigate higher order fuzzy
tensor with the help of geometric significance of third-order fuzzy tensor. Furthermore,
it is convenient to apply them to various fields.
Now, we introduce the period of the induced third-order fuzzy tensor and of the given fuzzy tensor. The following result follows immediately from Definition 1.

Theorem 1. Let $F = \{A_1, A_2, \cdots, A_n\}$. Then the oscillation period of $F$ is the least common multiple (l.c.m.) of the periods of $A_1, A_2, \cdots, A_n$, and the index of $F$ is the largest of the indices of $A_1, A_2, \cdots, A_n$. That is, if $d_1, \cdots, d_n, d_F$ and $k_1, \cdots, k_n, k_F$ are the oscillation periods and indices of $A_1, A_2, \cdots, A_n, F$, respectively, then
$d_F = \mathrm{l.c.m.}[d_1, \cdots, d_n]$, $k_F = \max\{k_1, \cdots, k_n\}$.

Proof. Rebuild a fuzzy matrix: for $F = \{A_1, A_2, \cdots, A_n\}$, regard each $A_i$ ($i = 1, 2, \cdots, n$) as a block, so that $F$ corresponds to the block diagonal matrix $F = \mathrm{diag}(A_1, A_2, \cdots, A_n)$. By Definition 1, $d_F = \mathrm{l.c.m.}[d_1, \cdots, d_n]$ and $k_F = \max\{k_1, \cdots, k_n\}$.

From the geometric significance of the third-order fuzzy tensor, we easily state the main conclusion as follows.

Theorem 2. Let $A_{i_j i_k i_h}$ be the induced third-order fuzzy tensor of an order $m$ dimension $n$ fuzzy tensor $A$. Suppose $d, d_{i_j}, d_{i_k}, d_{i_h}$ and $k, k_{i_j}, k_{i_k}, k_{i_h}$ are the oscillation periods and indices of $A_{i_j i_k i_h}$, $A_{i_j::}$, $A_{:i_k:}$ and $A_{::i_h}$, respectively. Then
$d = \mathrm{l.c.m.}[d_{i_j}, d_{i_k}, d_{i_h}]$, $k = \max\{k_{i_j}, k_{i_k}, k_{i_h}\}$.

Table 1. Numerical data for Example 1: each entry $A(:,:,i_3,i_4)$ is a $4\times 4$ fuzzy matrix (rows separated by semicolons).

i3 = 1:
A(:,:,1,1) = [0.3 0.1 0.8 0.9; 0.1 0.9 0.2 0.8; 0.5 0.1 0.2 0.7; 0.1 0.3 0.4 0.6]
A(:,:,1,2) = [0.1 0.2 0.1 0.4; 0.8 0.1 0.1 0.4; 0.9 0.2 0.4 0.3; 0.3 0.6 0.8 0.4]
A(:,:,1,3) = [0.7 0.9 0.8 0.9; 0.3 0.1 0.1 0.6; 0.4 0.6 0.6 0.6; 0.1 0.7 0.8 0.4]
A(:,:,1,4) = [0.3 0.9 0.5 0.5; 0.6 0.5 0.4 0.8; 0.9 0.9 0.9 0.8; 0.5 0.4 0.2 0.7]

i3 = 2:
A(:,:,2,1) = [0.8 0.9 0.9 0.1; 0.3 0.9 0.8 0.6; 0.7 0.1 0.7 0.2; 0.2 0.2 0.9 0.1]
A(:,:,2,2) = [0.3 0.8 0.4 0.5; 0.5 0.7 0.4 0.8; 0.9 0.6 0.9 0.4; 0.5 0.7 0.6 0.1]
A(:,:,2,3) = [0.6 0.8 0.8 0.4; 0.3 0.4 0.9 0.3; 0.3 0.9 0.8 0.8; 0.7 0.9 0.7 0.2]
A(:,:,2,4) = [0.3 0.6 0.8 0.7; 0.3 0.9 0.3 0.7; 0.7 0.3 0.6 0.3; 0.5 0.7 0.4 0.9]

i3 = 3:
A(:,:,3,1) = [0.5 0.9 0.9 0.6; 0.9 0.5 0.3 0.1; 0.4 0.8 0.3 0.8; 0.4 0.2 0.8 0.2]
A(:,:,3,2) = [0.5 0.2 0.7 0.6; 0.6 0.9 0.4 0.7; 0.8 0.7 0.6 0.3; 0.7 0.8 0.3 0.7]
A(:,:,3,3) = [0.5 0.8 0.1 0.3; 0.4 0.5 0.2 0.4; 0.5 0.4 0.7 0.8; 0.9 0.4 0.3 0.8]
A(:,:,3,4) = [0.5 0.3 0.4 0.7; 0.9 0.8 0.6 0.1; 0.8 0.3 0.3 0.2; 0.3 0.7 0.4 0.8]

i3 = 4:
A(:,:,4,1) = [0.2 0.8 0.8 0.7; 0.9 0.9 0.7 0.3; 0.6 0.9 0.7 0.1; 0.2 0.5 0.7 0.8]
A(:,:,4,2) = [0.9 0.6 0.6 0.9; 0.9 0.5 0.7 0.3; 0.3 0.7 0.5 0.5; 0.8 0.2 0.7 0.9]
A(:,:,4,3) = [0.5 0.3 0.2 0.6; 0.9 0.5 0.8 0.3; 0.1 0.7 0.4 0.5; 0.7 0.5 0.2 0.7]
A(:,:,4,4) = [0.7 0.1 0.2 0.8; 0.3 0.2 0.5 0.2; 0.5 0.5 0.1 0.4; 0.6 0.1 0.1 0.2]

Proof. The theorem can be proved using the block fuzzy matrix argument of Theorem 1.
Clearly, based on Definition 3 and Theorem 2, we have the following result.

Theorem 3. Let $A = (a_{i_1\cdots i_m})$ be an order $m$ dimension $n$ fuzzy tensor, and let the $A_{i_j i_k i_h}$ be its induced third-order fuzzy tensors. Then the oscillation period $D$ of the fuzzy tensor $A$ is the least common multiple of the oscillation periods of all the induced third-order fuzzy tensors, and the index $K$ of $A$ is the maximum of the indices of all the induced third-order fuzzy tensors.

Proof. By using block theory, the proof can be done.


Based on the algorithm in [3] for finding the oscillation period and index of a fuzzy matrix, we give here a power method for the oscillation period and index of a fuzzy tensor. From the above discussion, the following algorithm can be given naturally.
Algorithm 1 (A power method for finding the oscillation period and index of a fuzzy tensor).
Input: An order $m$ dimension $n$ fuzzy tensor $A = (a_{i_1\cdots i_m})$.
Output: The oscillation period and index of the fuzzy tensor, $D$ and $K$.
Step 1. Choose $i_j, i_k, i_h \in \{i_1, \cdots, i_m\}$ and let $A_{i_j i_k i_h} = (a_{i_j i_k i_h})$.
Step 2. Using Definition 1 and Theorem 1, compute $d_{i_j}, d_{i_k}, d_{i_h}$ and $k_{i_j}, k_{i_k}, k_{i_h}$.
Step 3. By Theorem 2, compute $d$ and $k$.
Step 4. Repeat Steps 1-3 until $i_j, i_k, i_h$ run through all of $i_1, \cdots, i_m$.
Step 5. Based on Theorem 3, calculate $D$ and $K$ from all the $d$ and $k$ obtained above.
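The following Python sketch mirrors Algorithm 1 for a fuzzy tensor stored as a NumPy array, combining slice periods with the least common multiple and slice indices with the maximum, as in Theorems 1-3. It repeats the same max-min power routine shown after Definition 1 so that it is self-contained, and the small example tensor at the end is hypothetical:

```python
import numpy as np
from math import lcm
from itertools import combinations, product

def maxmin_product(A, B):
    # max-min product: c_ij = max_k min(a_ik, b_kj)
    return np.max(np.minimum(A[:, :, None], B[None, :, :]), axis=1)

def matrix_period_index(A, max_iter=500):
    # period d and index k of a fuzzy matrix under max-min powers (Definition 1)
    powers = [np.asarray(A, dtype=float)]
    for _ in range(max_iter):
        nxt = maxmin_product(powers[-1], powers[0])
        for s, P in enumerate(powers):
            if np.array_equal(P, nxt):
                return len(powers) - s, s + 1
        powers.append(nxt)
    raise RuntimeError("no repetition found")

def tensor_period_index(A):
    """Sketch of Algorithm 1 for an order-m, dimension-n fuzzy tensor A (NumPy array)."""
    m, n = A.ndim, A.shape[0]
    D, K = 1, 1
    for axes in combinations(range(m), 3):                  # choose i_j, i_k, i_h (Step 1)
        fixed_axes = [ax for ax in range(m) if ax not in axes]
        for fixed_vals in product(range(n), repeat=m - 3):  # every induced 3-order tensor
            idx = [slice(None)] * m
            for ax, v in zip(fixed_axes, fixed_vals):
                idx[ax] = v
            T = A[tuple(idx)]                               # induced third-order tensor, shape (n, n, n)
            d, k = 1, 1
            for direction in range(3):                      # horizontal / lateral / frontal slices
                d_dir, k_dir = 1, 1
                for s in range(n):                          # Theorem 1: lcm of periods, max of indices
                    dm, km = matrix_period_index(np.take(T, s, axis=direction))
                    d_dir, k_dir = lcm(d_dir, dm), max(k_dir, km)
                d, k = lcm(d, d_dir), max(k, k_dir)         # Theorem 2 (Steps 2-3)
            D, K = lcm(D, d), max(K, k)                     # Theorem 3 (Steps 4-5)
    return D, K

# hypothetical 3-order, dimension-3 fuzzy tensor (illustration only)
A = np.random.default_rng(0).uniform(0, 1, size=(3, 3, 3)).round(1)
print(tensor_period_index(A))
```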
Next, to demonstrate that Algorithm 1 works for fuzzy tensor, we test the following
example whose codes are written in R language.

Example 1. Let $A$ be a 4-order fuzzy tensor with dimension four, defined by Table 1. For $m = 4$, we have the induced 3-order fuzzy tensors $A_{i_1 i_2 i_3}$, $A_{i_1 i_2 i_4}$, $A_{i_1 i_3 i_4}$ and $A_{i_2 i_3 i_4}$. For $A_{i_1 i_2 i_3}$, if $i_4 = 1$, the induced 3-order fuzzy tensor $A_{i_1 i_2 i_3 1}$ contains the data $A_{i_1 i_2 i_3 1} = (A(:,:,1,1), A(:,:,2,1), A(:,:,3,1), A(:,:,4,1))$, and we obtain three sets of fuzzy matrices $F_1^1, F_2^1, F_3^1$ by fixing one index in turn ($i_1$, $i_2$, $i_3$), where $F_i^1 = \{A_1, A_2, A_3, A_4\}$, $i = 1, 2, 3$.
Consider all the fuzzy matrices $F_i^1$ ($i = 1, 2, 3$) by Definition 1 and Theorem 1: $d_{F_1^1} = [1,1,1,1] = 1$, $k_{F_1^1} = \max\{4,4,4,4\} = 4$; $d_{F_2^1} = [2,1,1,1] = 2$, $k_{F_2^1} = \max\{4,3,2,2\} = 4$; $d_{F_3^1} = [2,1,2,1] = 2$, $k_{F_3^1} = \max\{3,4,5,3\} = 5$. So the oscillation period $d^1$ and index $k^1$ of the fuzzy tensor $A_{i_1 i_2 i_3 1}$ are: $d^1 = [d_{F_1^1}, d_{F_2^1}, d_{F_3^1}] = [1,2,2] = 2$, $k^1 = \max\{k_{F_1^1}, k_{F_2^1}, k_{F_3^1}\} = \max\{4,4,5\} = 5$.
If $i_4 = 2, 3, 4$ we have: $d^2 = [2,1,1] = 2$, $k^2 = \max\{3,6,5\} = 6$; $d^3 = [2,1,2] = 2$, $k^3 = \max\{3,4,4\} = 4$; $d^4 = [2,3,2] = 6$, $k^4 = \max\{5,5,4\} = 5$.
Hence, the oscillation period $d_1$ and index $k_1$ of the fuzzy tensor $A_{i_1 i_2 i_3}$ are: $d_1 = [d^1, d^2, d^3, d^4] = [2,2,2,6] = 6$, $k_1 = \max\{k^1, k^2, k^3, k^4\} = \max\{5,6,4,5\} = 6$.
By a similar analysis for $A_{i_1 i_2 i_4}$, $A_{i_1 i_3 i_4}$, $A_{i_2 i_3 i_4}$ we obtain $d_2 = [2,2,2,2] = 2$, $k_2 = \max\{4,4,6,5\} = 6$; $d_3 = [2,2,2,6] = 6$, $k_3 = \max\{6,5,6,4\} = 6$; $d_4 = [2,2,2,2] = 2$, $k_4 = \max\{5,5,5,5\} = 5$. Based on Theorem 3 we get the oscillation period $D$ and index $K$ of the fuzzy tensor $A$: $D = [d_1, d_2, d_3, d_4] = [6,2,6,2] = 6$, $K = \max\{k_1, k_2, k_3, k_4\} = \max\{6,6,6,5\} = 6$.
This example verifies the feasibility and correctness of Algorithm 1.

2. Using minimal strong component for period of fuzzy tensor

In this section, using graph theory tools, we give a method to find the oscillation period of a fuzzy tensor. When $m$ and $n$ are not large and the number of nonzero elements is less than half of the total number of fuzzy tensor elements, the minimal strong component method is simpler than the Power Method for finding the oscillation period and requires far less calculation. The following definition is taken from [11].
Let $\Phi_A$ denote the set of all nonzero elements of the fuzzy matrix $A$. For any $\lambda \in \Phi_A$, we call $A_\lambda = ((a_\lambda)_{ij})$ the cut matrix of $A$, where $(a_\lambda)_{ij} = 1$ if $a_{ij} \ge \lambda$ and $(a_\lambda)_{ij} = 0$ otherwise.
We follow [14,15,4] to show the period of Boolean matrix by strong components
and express the period of fuzzy matrix with minimal strong component. Furthermore, we
shall find the period of fuzzy tensor based on the minimal strong component.

Definition 4. (See [4], Definition 4.2) We call $S$ a strong component of a fuzzy matrix $A$ if there is a $\lambda \in \Phi_A$ such that $S$ is a strong component of the cut matrix $A_\lambda$.

If $D$ is the digraph of the fuzzy matrix $A$ and $S$ is a strong component, we let $d(S)$ be the period of $S$, and let $\Omega$ denote the set of minimal strong components of the fuzzy matrix $A$.

Theorem 4. (See [4], Theorem 4.9) If $A$ is an $n \times n$ fuzzy matrix and $\Omega = \{s_1, s_2, \cdots, s_w\}$, then $d(A) = [d(s_i)]$, where $s_i \in \Omega$.
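As a small illustration of the graph-theoretic ingredients (cut matrices, strong components, and the period of a strongly connected component), the following Python helpers may be used; the period of a component is computed as the greatest common divisor of cycle-length differences obtained from BFS levels. Selecting the minimal strong components $\Omega$ as defined in [4] and combining their periods with the least common multiple (Theorem 4) then yields $d(A)$. The example matrix is hypothetical:

```python
from math import gcd
from collections import deque

def cut_matrix(A, lam):
    """Boolean cut matrix A_lambda: entry 1 where a_ij >= lambda."""
    n = len(A)
    return [[1 if A[i][j] >= lam else 0 for j in range(n)] for i in range(n)]

def strong_components(B):
    """Strong components (containing at least one cycle) of the digraph of Boolean matrix B,
    found via transitive closure; adequate for the small matrices used here."""
    n = len(B)
    reach = [[bool(B[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    comps, assigned = [], [False] * n
    for i in range(n):
        if not assigned[i]:
            comp = [j for j in range(n) if reach[i][j] and reach[j][i]]
            for j in comp:
                assigned[j] = True
            if any(B[u][v] for u in comp for v in comp):   # keep only components with a cycle
                comps.append(comp)
    return comps

def component_period(B, comp):
    """Period of a strongly connected component: gcd of |level[u] + 1 - level[v]| over its edges."""
    level = {comp[0]: 0}
    queue = deque([comp[0]])
    inside = set(comp)
    while queue:
        u = queue.popleft()
        for v in inside:
            if B[u][v] and v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    g = 0
    for u in comp:
        for v in comp:
            if B[u][v]:
                g = gcd(g, abs(level[u] + 1 - level[v]))
    return g

# hypothetical 5x5 fuzzy matrix (illustration only)
A = [[0.5, 0.3, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.3, 0.0, 0.0],
     [0.4, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.3],
     [0.0, 0.0, 0.0, 0.4, 0.0]]
for lam in sorted({x for row in A for x in row if x > 0}, reverse=True):
    B = cut_matrix(A, lam)
    print(lam, [(c, component_period(B, c)) for c in strong_components(B)])
```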

According to the above discussion, we can develop the following algorithm for the
oscillation period of fuzzy tensor by minimal strong component.

Algorithm 2 (Minimal strong component method for finding the oscillation period of a fuzzy tensor).
Input: An order $m$ dimension $n$ fuzzy tensor $A = (a_{i_1\cdots i_m})$.
Step 1. Establish the induced 3-order fuzzy tensor $A_{i_j i_k i_h}$ by Definition 3.

Table 2. Numerical data for Example 2


A(:, :, 1) A(:, :, 2) A(:, :, 3)
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
0.5 0.3 0.4 0 0 0 0 0.8 0 0 0 0.1 0.8 0 0.5 0
0 0
⎜ 0 0.3 0 0 0 ⎟ ⎜ 0.5 0 0 ⎟ ⎜ 0 0 0 0.4 0 0.5 ⎟
⎜ 0 ⎟ ⎜ 0 0.3 0 ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ 0.4 0 0 0.3 0 0 ⎟ ⎜ 0 0.7 0 0 0.5 0 ⎟ ⎜ 0.5 0 0 0 0.4 0 ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎜ 0 0 0 0.3 0 ⎟ ⎜ 0.3 0 0.2 ⎟ ⎜ 0 0.5 0 0 0 0.4 ⎟
⎜ 0 ⎟ ⎜ 0.3 0 0 ⎟ ⎜ ⎟
⎜ ⎟ ⎜ ⎟ ⎜ ⎟
⎝ 0 0 0 0.4 0 0 ⎠ ⎝ 0 0 0 0.5 0 0 ⎠ ⎝ 0 0 0.4 0 0 0 ⎠
0 0 0 0 0.3 0 0 0 0.2 0 0 0 0 0 0 0 0 0

Figure 2. Digraphs of A(:, :, 1): (a) digraph of D0.5, (b) digraph of D0.4, (c) digraph of D0.3.

Step 2. Create $F_{i_j}$, $F_{i_k}$, $F_{i_h}$ from $A_{i_j::}$, $A_{:i_k:}$ and $A_{::i_h}$.
Step 3. Compute the periods of the fuzzy matrices in $F_{i_j}$, $F_{i_k}$, $F_{i_h}$ by the minimal strong component.
Step 4. According to Theorem 1, obtain the periods of $F_{i_j}$, $F_{i_k}$, $F_{i_h}$.
Step 5. Calculate the period of the induced 3-order fuzzy tensor $A_{i_j i_k i_h}$ by Theorem 2.
Step 6. Figure out the oscillation period of the fuzzy tensor $A$ by Theorem 3.
To illustrate that Algorithm 2 works for fuzzy tensors, we test the following example.

Example 2. Let $A$ be a third-order, six-dimensional fuzzy tensor $A = (A(:,:,1), A(:,:,2), A(:,:,3))$ defined by Table 2.
For $A(:,:,1)$ we have $\lambda_1 = 0$, $\lambda_2 = 0.3$, $\lambda_3 = 0.4$, $\lambda_4 = 0.5$, and the corresponding digraphs can be represented as shown in Figure 2.
In D0.5 there is only one strong component S1 = {a1 }. In D0.4 there is one
strong component S2 = {a1 , a3 }. In D0.3 there are two strong components S3 =
{a1 , a2 , a3 }, S4 = {a4 , a5 }.
Notice that S4 is a strong component which has no common vertices with S1 , S2 , S3 .
Hence, we say that S4 is a newly appeared strong component. Moreover, we obtain that
the set of minimal strong components of the fuzzy matrix $A(:,:,1)$ is $\Omega = \{S_1, S_4\}$. Then $d(A(:,:,1)) = [d(S_1), d(S_4)] = [1, 2] = 2$.
Considering $A(:,:,2)$ and $A(:,:,3)$, we have $d(A(:,:,2)) = [2, 3] = 6$ and $d(A(:,:,3)) = [1, 2, 2] = 2$. Then $d(A) = [d(A(:,:,1)), d(A(:,:,2)), d(A(:,:,3))] = [2, 6, 2] = 6$.
This example illustrates a great advantage of Algorithm 2: the oscillation period can be found using only the directed graph of a sparse fuzzy matrix, without troublesome calculations.

3. Conclusions

In this paper, we proposed the fuzzy tensor, which is a new class of nonnegative tensor and a higher-order form of the fuzzy matrix. We gave the definition of the induced third-order fuzzy tensor, which has the advantage of an intuitive geometric significance. Based on these concepts, we investigated the oscillation period and index of a fuzzy tensor with the help of the Power Method and the Minimal Strong Component, respectively. Our numerical results showed that the two methods are feasible and favourable. It is therefore worthwhile to study further properties of fuzzy tensors, and in future work we will continue to examine all aspects of the fuzzy tensor.

Acknowledgements

The work of the first author was supported by Innovation Foundation of Guizhou Nor-
mal University for Graduate Students(201529, 201528), and the Shandong province Col-
lege’s Outstanding Young Teachers Domestic Visiting Scholar Program(2013). The work
of the second author was supported by the National Science Foundation of China (Grant
Nos.11261012).

References

[1] M.G.Thomason, Convergence of powers of a fuzzy matrix, Journal of Mathematical Analysis and Appli-
cations, 57(1977), 476–480.
[2] Z.T.Fan, D.F.Liu, On the oscillating power sequence of a fuzzy matrix, Fuzzy Sets and Systems, 93(1998),
75–85.
[3] J.X.Li, Periodicity of powers of fuzzy matrices, Fuzzy Sets and Systems, 48(1992), 365–369.
[4] W.B.Liu, Z.J.Ji, The periodicity of square fuzzy matrices based on minimal strong components, Fuzzy
Sets and Systems, 126(2002), 233–240.
[5] L.Q.Qi, Eigenvalues of a real supersymmetric tensor, Journal of Symbolic Computation, 40(2005), 1302–
1324.
[6] L.H.Lim, Singular values and eigenvalues of tensors: A variational approach, Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 1(2005), 129–132.
[7] T.G.Kolda , B.W.Bader, Tensor decomposition and applications, SIAM Review, 51(2009), 455–500.
[8] W.Y.Ding, L.Q.Qi, Y.M.Wei, Fast Hankel tensor-vector product and its application to exponential data fitting, Numerical Linear Algebra with Applications, 22(2015), 814–832.
[9] Y.Song, L.Q.Qi, Infinite and finite dimensional Hilbert tensors, Linear Algebra and its Applications,
451(2014), 1–14.
[10] C.Z.Luo, Introduction to fuzzy sets (Vol.1), Beijing Normal University Press,Beijing,(In Chinese), 1989.
[11] Z.T.Fan, D.F.Liu, On the power sequence a fuzzy matrix-Convergent power sequence, Journal of Com-
putational and Applied Mathematics, 4(1997), 147–165.
[12] L.A.Zadeh, Fuzzy sets, Information and Control, 8(1965), 338–353.
[13] S.G.Guu, Y.Y.Lur, C.T.Pang, On infinite products of fuzzy matrices, SIAM Journal on Matrix Analysis
and Applications, 22(2001), 1190–1203.
[14] B.De.Schutter, B.DE.Moor, On the sequence of consecutive powers of a matrix in a Boolean algebra,
SIAM Journal on Matrix Analysis and Applications, 21(1999), 328–354.
[15] K.H.Kim, Boolean matrix theory and application, Marcel Dekker, New York, 1982.
58 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-58

Toward a Fuzzy Minimum Cost Flow Problem for Damageable Items Transportation
Si-Chao LU1 and Xi-Fu WANG
School of Traffic and Transportation, Beijing Jiaotong University, Beijing, China

Abstract. In this paper, we have proposed a mathematical formulation of the fuzzy minimum cost flow problem for damageable items transportation. For the imprecise model, the capacity, cost, and percentage of unit damage of each route have been considered as triangular fuzzy numbers. The problem has been solved by using the k-preference integration method, the area compensation method, and the signed distance method. Finally, to show the validity of the proposed model, a numerical example is provided and solved with Wolfram Mathematica 9.

Keywords. minimum cost flow, k-preference integration, area compensation, signed distance

Introduction

As a classic combinatorial problem, the minimum cost flow problem has a wide range
of applications and ramifications. In the logistics industry, it is common for decision
makers to generate a plan to optimally transport damageable items from multiple
sources to multiple destinations through transshipment stations. Furthermore, impreciseness in defining parameters, such as the cost per unit on a route, is another commonly encountered problem in realistic environments. Therefore, this paper is devoted to solving this problem.
With respect to the fuzzy minimum cost flow problem [1], there are many fruitful results. In the fuzzy minimum cost flow problem proposed by Shih and Lee [2], the cost parameter and capacity constraints are taken as fuzzy numbers. In addition, they proposed a fuzzy multiple objective minimum cost flow problem and used minimization of the total passing time as the second objective in an example. Ding proposed an uncertain minimum cost flow problem to deal with uncertain capacities [3].
However, few studies address the adaptation of this problem to damageable items transportation. A closely related problem is the multi-objective, multi-item intuitionistic fuzzy solid transportation problem for damageable items proposed by Chakraborty et al. [4]. To defuzzify the imprecise parameters, we use the k-preference integration method, the area compensation method, and the signed distance method, respectively. Computations to solve the problem are done using Wolfram Mathematica 9.

1
Corresponding Author. Si-Chao LU, School of Traffic and Transportation, Beijing Jiaotong
University, Beijing, China; E-mail: lusichao@163.com.

The remainder of the paper is organized as follows: the next section offers a brief introduction to fuzzy numbers and three defuzzification methods. The mathematical model of the fuzzy minimum cost flow problem for damageable items is proposed in Section 2. A simulated problem instance is given and solved in Section 3. Finally, the paper is concluded in Section 4.

1. Fuzzy Preliminaries

1.1. Fuzzy Numbers

Definition 2.1. If $X$ is a universe of discourse, then a fuzzy number $\tilde{A}$ in $X$ is defined as:

$$\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) \mid x \in X\} \tag{1}$$

where $\mu_{\tilde{A}}: X \to [0,1]$ is a mapping called the membership function of $x \in X$ in $\tilde{A}$ [5].

Definition 2.2. A triangular fuzzy number (TFN) $\tilde{A}$ can be defined as $\tilde{A} = (a_1, a_2, a_3)$, which is shown in Figure 1. The membership function of $\tilde{A}$ is determined in Eq. (2) [5]:

$$
\mu_{\tilde{A}}(x) =
\begin{cases}
0, & x \le a_1, \\[2pt]
\dfrac{x - a_1}{a_2 - a_1}, & a_1 \le x \le a_2, \\[6pt]
\dfrac{a_3 - x}{a_3 - a_2}, & a_2 \le x \le a_3, \\[6pt]
0, & a_3 \le x.
\end{cases} \tag{2}
$$

Figure 1. A triangular fuzzy number.

1.2. K-Preference Integration Representation Method

The k-preference integration method was introduced by Chen and Hsieh [6]. Based on this method, the k-preference integration representation of a general TFN $\tilde{A} = (a_1, a_2, a_3)$ is defined as:

$$
P_k(\tilde{A}) = \frac{\displaystyle\int_0^1 h\left[k L^{-1}(h) + (1-k) R^{-1}(h)\right] dh}{\displaystyle\int_0^1 h\, dh} = \frac{1}{3}\left[k a_1 + 2 a_2 + (1-k) a_3\right] \tag{3}
$$
From Eq. (3), it can clearly be seen that the k-preference integration is fairly
flexible compared with other defuzzification methods, because the value of k is
determined by the decision maker. It has been used to handle the fuzzy cold storage
problem [7] and the constrained knapsack problem in fuzzy environment [8].
If k=0.5 then the result generated by k-preference integration method will be the
same as that obtained by graded mean integration (GMI) method which was introduced
by Chen and Heieh [9].

1.3. Area Compensation Method

Based on the area compensation method [10], the TFN $\tilde{A} = (a_1, a_2, a_3)$ can be defuzzified as:

$$
\Phi(\tilde{A}) = \frac{\displaystyle\int_{a_1}^{a_2} x\,\mu_{\tilde{A}}(x)\,dx + \int_{a_2}^{a_3} x\,\mu_{\tilde{A}}(x)\,dx}{\displaystyle\int_{a_1}^{a_2} \mu_{\tilde{A}}(x)\,dx + \int_{a_2}^{a_3} \mu_{\tilde{A}}(x)\,dx} = \frac{(a_3 - a_1)(a_1 + a_2 + a_3)/6}{(a_3 - a_1)/2} = \frac{a_1 + a_2 + a_3}{3} \tag{4}
$$

1.4. Signed Distance Method

The left and the right $\alpha$-cuts of the TFN $\tilde{A} = (a_1, a_2, a_3)$ are $L^{-1}(\alpha) = a_1 + (a_2 - a_1)\alpha$ and $R^{-1}(\alpha) = a_3 - (a_3 - a_2)\alpha$ [11]. Based on the ranking system for fuzzy numbers proposed by Yao and Wu [12], the signed distance of $\tilde{A}$ is defined as follows:

$$
d(\tilde{A}, 0) = \frac{1}{2}\int_0^1 \left[L^{-1}(\alpha) + R^{-1}(\alpha)\right] d\alpha = \frac{1}{2}\int_0^1 \left[a_1 + (a_2 - a_1)\alpha + a_3 - (a_3 - a_2)\alpha\right] d\alpha = \frac{1}{4}(a_1 + 2a_2 + a_3) \tag{5}
$$
Shekarian et al. [11] combined this method with existing economic production
quantity models to find optimal production quantities.
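A minimal sketch of the three defuzzification formulas, Eqs. (3)-(5), for a triangular fuzzy number is given below (Python); the example fuzzy cost is hypothetical:

```python
def k_preference(a, k):
    """k-preference integration of a TFN a = (a1, a2, a3), Eq. (3); k = 0.5 reproduces the GMI value."""
    a1, a2, a3 = a
    return (k * a1 + 2 * a2 + (1 - k) * a3) / 3.0

def area_compensation(a):
    """Area-compensation defuzzification of a TFN, Eq. (4)."""
    return sum(a) / 3.0

def signed_distance(a):
    """Signed distance of a TFN from zero, Eq. (5)."""
    a1, a2, a3 = a
    return (a1 + 2 * a2 + a3) / 4.0

cost = (4.0, 5.0, 7.0)   # hypothetical fuzzy cost of one route (not taken from the paper)
print(k_preference(cost, 0.5), area_compensation(cost), signed_distance(cost))
# -> approximately 5.17, 5.33, 5.25
```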

2. Mathematical Formulation

The fuzzy minimum cost flow problem for damageable items transportation blends the
fuzzy set theory and the minimum cost flow problem. The objective of the proposed
problem is to minimize the total cost of sending the available supply through
transshipment nodes to satisfy the demand. It is also necessary to introduce constraints
that guarantee the feasibility of flows.
Let $G = (N, A)$ be a directed network with node set $N = \{1, 2, 3, \ldots, n\}$ and arc set $A$. Each arc $a_{ij} \in A$ stands for a route and has a positive upper-bound capacity $\tilde{u}_{ij}$ and a positive cost $\tilde{c}_{ij}$. Both $\tilde{u}_{ij} = (u_{ij}^{l}, u_{ij}, u_{ij}^{r})$ and $\tilde{c}_{ij} = (c_{ij}^{l}, c_{ij}, c_{ij}^{r})$ are taken as triangular fuzzy numbers, because some vehicles may provide a small degree of leeway in capacity [5] and the transportation cost of each route tends to vary. Each node $i \in N$ has a value $b_i$, which represents the nature of node $i$: if node $i$ is a supply node then $b_i > 0$, if node $i$ is a demand node then $b_i < 0$, and if node $i$ is a transshipment node then $b_i = 0$. We use a TFN $\tilde{\alpha}_{ij} = (\alpha_{ij}^{l}, \alpha_{ij}, \alpha_{ij}^{r})$ to denote the percentage of units damaged on route $a_{ij}$ due to physical vibration caused by bad road conditions or improper driving behaviors, etc. $x_{ij}$ is the decision variable, which denotes the flow quantity through route $a_{ij}$.
Based on the above descriptions, the mathematical formulation can be developed
as follows.
$$\min Z = \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde{c}_{ij}\, x_{ij} \tag{6}$$
s.t.
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n} \left(1 - \tilde{\alpha}_{ji}\right) x_{ji} = b_i, \quad \forall i \in \{i \mid b_i \ge 0\} \tag{7}$$
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n} \left(1 - \tilde{\alpha}_{ji}\right) x_{ji} \le b_i, \quad \forall i \in \{i \mid b_i < 0\} \tag{8}$$
$$0 \le x_{ij} \le \tilde{u}_{ij}, \quad \forall i, \forall j \tag{9}$$
$$\sum_{i=1}^{n} b_i - \sum_{i=1}^{n}\sum_{j=1}^{n} \tilde{\alpha}_{ij}\, x_{ij} \ge 0 \tag{10}$$
Here, (6) is the cost-minimization objective function. Constraints (7) and (8) represent the net flow of node $i$ under two different situations, respectively. In addition, constraint (8) implies that demand nodes can be satisfied with excess items. Constraint (9) ensures that the total amount of transported damageable items is less than or equal to the capacity of route $a_{ij}$. Constraint (10) guarantees that the

total amount of items provided by the supply nodes is no less than the amount of
damaged items plus the total amount of items required by the demand nodes.
Based on the k-preference integration method, Eq. (6)-Eq. (10) can be redefined as follows, where $k_c$, $k_\alpha$, $k_u$ can be chosen differently according to the decision maker's preference:

$$\min Z = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{k_c c_{ij}^{l} + 2 c_{ij} + (1 - k_c) c_{ij}^{r}}{3}\, x_{ij} \tag{11}$$
s.t.
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left\{1 - \frac{k_\alpha \alpha_{ji}^{l} + 2\alpha_{ji} + (1 - k_\alpha)\alpha_{ji}^{r}}{3}\right\} x_{ji} = b_i, \quad \forall i \in \{i \mid b_i \ge 0\} \tag{12}$$
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left\{1 - \frac{k_\alpha \alpha_{ji}^{l} + 2\alpha_{ji} + (1 - k_\alpha)\alpha_{ji}^{r}}{3}\right\} x_{ji} \le b_i, \quad \forall i \in \{i \mid b_i < 0\} \tag{13}$$
$$0 \le x_{ij} \le \frac{k_u u_{ij}^{l} + 2 u_{ij} + (1 - k_u) u_{ij}^{r}}{3}, \quad \forall i, \forall j \tag{14}$$
$$\sum_{i=1}^{n} b_i - \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{k_\alpha \alpha_{ij}^{l} + 2\alpha_{ij} + (1 - k_\alpha)\alpha_{ij}^{r}}{3}\, x_{ij} \ge 0 \tag{15}$$
Applying the area compensation method, Eq. (6)-Eq. (10) can be written in the following form:

$$\min Z = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{c_{ij}^{l} + c_{ij} + c_{ij}^{r}}{3}\, x_{ij} \tag{16}$$
s.t.
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left[1 - \frac{\alpha_{ji}^{l} + \alpha_{ji} + \alpha_{ji}^{r}}{3}\right] x_{ji} = b_i, \quad \forall i \in \{i \mid b_i \ge 0\} \tag{17}$$
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left[1 - \frac{\alpha_{ji}^{l} + \alpha_{ji} + \alpha_{ji}^{r}}{3}\right] x_{ji} \le b_i, \quad \forall i \in \{i \mid b_i < 0\} \tag{18}$$
$$0 \le x_{ij} \le \frac{u_{ij}^{l} + u_{ij} + u_{ij}^{r}}{3}, \quad \forall i, \forall j \tag{19}$$
$$\sum_{i=1}^{n} b_i - \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\alpha_{ij}^{l} + \alpha_{ij} + \alpha_{ij}^{r}}{3}\, x_{ij} \ge 0 \tag{20}$$
Similarly, with the help of the signed distance method, Eq. (6)-Eq. (10) can be expressed as:

$$\min Z = \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{c_{ij}^{l} + 2 c_{ij} + c_{ij}^{r}}{4}\, x_{ij} \tag{21}$$
s.t.
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left[1 - \frac{\alpha_{ji}^{l} + 2\alpha_{ji} + \alpha_{ji}^{r}}{4}\right] x_{ji} = b_i, \quad \forall i \in \{i \mid b_i \ge 0\} \tag{22}$$
$$\sum_{j=1}^{n} x_{ij} - \sum_{j=1}^{n}\left[1 - \frac{\alpha_{ji}^{l} + 2\alpha_{ji} + \alpha_{ji}^{r}}{4}\right] x_{ji} \le b_i, \quad \forall i \in \{i \mid b_i < 0\} \tag{23}$$
$$0 \le x_{ij} \le \frac{u_{ij}^{l} + 2 u_{ij} + u_{ij}^{r}}{4}, \quad \forall i, \forall j \tag{24}$$
$$\sum_{i=1}^{n} b_i - \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\alpha_{ij}^{l} + 2\alpha_{ij} + \alpha_{ij}^{r}}{4}\, x_{ij} \ge 0 \tag{25}$$
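Once the parameters are defuzzified, any of the three crisp models above is an ordinary linear program. The sketch below assembles and solves such a model with SciPy's linprog; the five-node network, costs, capacities and damage fractions are hypothetical illustration values, not the data of Figure 2:

```python
import numpy as np
from scipy.optimize import linprog

# hypothetical, already-defuzzified network (illustration only)
nodes = ["A", "B", "C", "D", "E"]
b = {"A": 60, "B": 40, "C": 0, "D": -30, "E": -60}           # supply > 0, demand < 0
arcs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "E"), ("D", "E"), ("E", "D")]
cost = {("A", "B"): 5, ("A", "C"): 4, ("A", "D"): 6, ("B", "C"): 3,
        ("C", "E"): 2, ("D", "E"): 2, ("E", "D"): 1}
cap = {("A", "B"): 10, ("A", "C"): 80, ("A", "D"): 50, ("B", "C"): 50,
       ("C", "E"): 100, ("D", "E"): 40, ("E", "D"): 40}
alpha = {a: 0.02 for a in arcs}                              # defuzzified damage fraction per route

c = np.array([cost[a] for a in arcs], dtype=float)
A_eq, b_eq, A_ub, b_ub = [], [], [], []
for i in nodes:
    row = np.zeros(len(arcs))
    for t, (u, v) in enumerate(arcs):
        if u == i:
            row[t] += 1.0                                    # outgoing flow
        if v == i:
            row[t] -= 1.0 - alpha[(u, v)]                    # incoming flow net of damage
    if b[i] >= 0:
        A_eq.append(row); b_eq.append(b[i])                  # Eq. (7) for supply/transshipment nodes
    else:
        A_ub.append(row); b_ub.append(b[i])                  # Eq. (8) for demand nodes

A_ub.append(np.array([alpha[a] for a in arcs]))              # Eq. (10): damaged flow <= net supply
b_ub.append(sum(b.values()))

bounds = [(0, cap[a]) for a in arcs]                         # Eq. (9)
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=np.array(A_eq), b_eq=np.array(b_eq), bounds=bounds, method="highs")
print(res.fun, dict(zip(arcs, np.round(res.x, 2))))
```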

3. Numerical Experiment

The case in this section is adapted from an example in [1], which copes with the crisp
model of the minimum cost flow problem. Assume 60 units and 40 units of damageable
items are supplied by node A and node B, whereas no less than 30 units and 60 units of
damageable items are required by node D and node E respectively. Node C is a
transshipment node. Capacities and costs of the routes cannot be determined precisely
in advance. If the route aij has no specified capacity, then uij can be regarded as a large
number and hence be ignored in the mathematical model. Critical parameters of this
problem instance are shown in Figure 2.
Given that this problem is small-scale and hence can be solved by exact algorithms,
we use the Wolfram Mathematica 9 to generate optimal solutions. The imprecise
parameters are defuzzified using three methods, which are the k-preference integration
method, the area compensation method, and the signed distance method. To simplify
the problem, we let k=kc =kα =ku . Mathematical formulations and results by using the

GMI method and the area compensation method are shown in Figure 3 and Figure 4.
Computational results are shown in Table 1.

Figure 2. Network representation of a fuzzy minimum cost flow problem for damageable items
transportation.

Figure 3. Mathematical formulation and results by using the GMI method with Mathematica.

Figure 4. Mathematical formulation and results by using the area compensation method with Mathematica.

From Table 1, it can be clearly seen that the GMI method, the area compensation method, and the signed distance method generate similar results. Furthermore, the total cost decreases as k increases, which supports the correctness of the defuzzification by the k-preference integration method.

Table 1. Solutions obtained using the k-preference integration method, the signed distance method, and the area compensation method

Variable   k=0      k=0.2    k=0.5 (GMI)   k=0.8    k=1      Signed Distance   Area Compensation
xAB        0.00     0.00     0.00          0.00     0.00     0.00              0.00
xAC        40.50    40.11    39.52         38.93    38.53    39.28             39.03
xAD        19.50    19.89    20.48         21.07    21.47    20.72             20.97
xBC        40.00    40.00    40.00         40.00    40.00    40.00             40.00
xCE        80.50    80.11    79.52         78.93    78.53    79.28             79.03
xDE        0.00     0.00     0.00          0.00     0.00     0.00              0.00
xED        11.22    10.92    10.48         10.04    9.75     10.21             9.94
Z          573.17   569.32   563.29        557.28   553.20   564.35            563.96

4. Conclusion

In this paper, we have presented a minimum cost flow problem for damageable items transportation in an imprecise environment. After defuzzifying the fuzzy parameters with the k-preference integration method, the area compensation method, and the signed distance method, the optimal flow can be obtained with Wolfram Mathematica.
There are three major avenues for future work. First, more defuzzification methods, such as the credibility measure method [8] or the use of a violation tolerance level [13], could be applied and the results compared in a further step. Second, more objective functions could be added and more item properties could be considered. Finally, given that Das et al. successfully solved a multi-objective solid transportation problem with type-2 fuzzy variables [14], some parameters in this model could also be taken as type-2 fuzzy numbers to better describe the problem and then defuzzified to generate optimal solutions.

References

[1] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research (Ninth Edition), McGraw-Hill,
New York, 2010.
[2] H. S. Shih and E. S. Lee, Fuzzy multi-level minimum cost flow problems, Fuzzy Sets & Systems,
107(1999), 159-176.
[3] S. Ding, Uncertain minimum cost flow problem, Soft Computing, 18 (2014), 2201-2207.
[4] D. Chakraborty, D. K. Jana, T. K. Roy, Expected value of intuitionistic fuzzy number and its application
to solve multi-objective multi-item solid transportation problem for damageable items in intuitionistic
fuzzy environment, Journal of Intelligent & Fuzzy Systems, 30 (2016), 1109-1122.
[5] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, Fourth Edition, Kluwer Academic Publishers, Norwell, 2001.
[6] S. H. Chen and C. H. Hsieh, A new method of representing generalized fuzzy number, Tamsui Oxford
Journal of Management Sciences, 13-14 (1998), 133-143.
[7] S. Lu and X. Wang, Modeling the Fuzzy Cold Storage Problem and Its Solution by a Discrete Firefly
Algorithm, Journal of Intelligent and Fuzzy Systems, 31(2016), 2431-2440.
[8] C. Changdar, G. S. Mahapatra, and R.K. Pal, An improved genetic algorithm based approach to solve
constrained knapsack problem in fuzzy environment, Expert Systems with Applications 42 (2015),
2276-2286.

[9] S. H. Chen and C. C. Wang, Representation, ranking, distance, and similarity of fuzzy numbers with step
form membership function using k-preference integration method, Joint 9th. IFSA World Congress and
20th NAFIPS International Conference, 2 (2001). IEEE, 801-806.
[10] S. K. De and I. Beg, Triangular dense fuzzy sets and new defuzzification methods, Journal of
Intelligent and Fuzzy Systems, 31(1) (2016), 469-477.
[11] E. Shekarian, C. H. Glock, S.M.P. Amiri, K. Schwindl, Optimal manufacturing lot size for a single-
stage production system with rework in a fuzzy environment, Journal of Intelligent and Fuzzy Systems
27 (2014), 3067-3080.
[12] J. S. Yao, K. Wu, Ranking fuzzy numbers based on decomposition principle and signed distance, Fuzzy
Sets and Systems, 116 (2000), 275-288.
[13] J. Brito, F. J. Martinez, J. A. Moreno, J. L. Verdegay, Fuzzy optimization for distribution of frozen
food with imprecise times, Fuzzy Optimization and Decision Making, 11 (2012), 337-349.
[14] A. Das, U. K. Bera, M. Maiti. Defuzzification of trapezoidal type-2 fuzzy variables and its application
to solid transportation problem, Journal of Intelligent and Fuzzy Systems, 30 (2016), 2431-2445.
Fuzzy Systems and Data Mining II 65
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-65

Research on the Application of Data Mining in the Field of Electronic Commerce
Xia SONG 1 and Fang HUANG
Shandong Agricultural Engineering Institute, Jinan, Shandong, China

Abstract. E-Commerce is a business mode based on the internet and information technology. Data mining techniques are widely used in E-Commerce for digging out patterns and retrieving information from large-scale noisy datasets. The booming of E-Commerce enables businesses to collect large amounts of data which can be analyzed to enhance revenues. The abundant data collected online is the foundation of big data analysis. How to employ data mining models in strategizing and making business decisions is an important topic of recent years. This paper discusses data mining and its application in E-Commerce. Data mining applications in E-Commerce are built on data mining technology within the E-Commerce system to strengthen the ability to analyze business information: they uncover the intrinsic relationships in the data, extract useful information, provide the expected information to business managers, and thus ensure the effective operation of E-Commerce. Data mining techniques can be used for automated data analysis, pattern identification, information retrieval, and business strategizing, as well as for providing personalized services.

Keywords. E-Commerce, big data, data mining, case

Introduction

Electronic commerce is a new commerce mode in the field of business. It refers to the use of digital electronic technology to carry out business activities, with the Internet as the main vehicle and information technology as the core. Electronic commerce brings new opportunities and challenges to businesses and individuals, promotes the networking of the traditional business model, changes the business activities of enterprises and the consumption patterns of individuals, and makes business activities digital and intelligent.
The development of electronic commerce has prompted enterprises to collect large amounts of data internally, and there is an urgent need to turn these data into useful information and knowledge so that enterprises can create more potential profit. The massive data obtained from the Internet give data mining a rich data base. Using data mining technology can effectively help enterprises analyze data in a highly automated way, perform inductive reasoning, discover hidden regularities, and extract effective information, guiding enterprises to adjust their marketing strategies and make the right business decisions while at the same time providing

1
Corresponding Author: Xia SONG, Shandong Agricultural Engineering Institute, Jinan Shandong,
China; E-mail: 643549139@qq.com

dynamic, personalized, and efficient services for customers, improving the core competitiveness of enterprises.

1. E-commerce and Data Mining

E-Commerce is a business mode based on internet and information technology. It shifts


the traditional business mode and individual’s consumption patterns as more trades and
deals are carried out online. The booming of E-Commerce enables businesses to collect
large amount of data which could be analyzed for enhancing revenues. The abundant
data collected online is the foundation of big data analysis. Data mining techniques
could be used for automated data analysis, pattern identification, information retrieving,
business strategizing as well as providing personalized services. E-Commerce business
could develop online business system which uses data mining techniques to analyze
online business data, identifying correlations within the data and making predictions of
the market.

1.1. E-commerce

Electronic commerce refers to individuals or enterprises using the Internet as the carrier to exchange business data and conduct business activities in digital electronic form [1]. E-Commerce attracts users by its low cost, convenience, high reliability, and freedom from time and space constraints. There are many E-Commerce activities in China nowadays, including online advertising, electronic business document exchange, online shopping and online payment, as well as the B2B, B2C and C2C business modes.
With the rapid development of network technology and database technology, electronic commerce shows ever stronger vitality and the amount of online transactions rises year by year; however, the development of electronic commerce has also brought many new problems to traditional enterprises. As enterprises expand their electronic commerce, e-commerce platforms and a large number of shopping websites have emerged, carrying all kinds of business information, and these "big data" contain huge commercial value. However, faced with a huge amount of structurally diverse information of different types, how should enterprises organize and utilize it to obtain the information that is valuable to them or associated with their own demands? The application of data mining technology in electronic business has become an inevitable choice. Data mining technology extracts potential, unknown and useful data from noisy, disordered data, and provides logical reasoning and visual interpretation, helping business decision makers grasp market dynamics in a timely manner and make reasonable decisions in real time.
Data mining applications in electronic commerce are built on data mining technology within the E-Commerce system to strengthen the ability to analyze business information: they uncover the intrinsic relationships in the data, extract useful information, provide the expected information to business managers, and ensure the effective operation of E-Commerce. Many large electronic business enterprises (such as Taobao and Jingdong Mall) provide a variety of data mining tools for managers to use in order to increase sales, and these tools are also very helpful for customer relationship management.

1.2. Data Mining

Data mining (DM), also known as knowledge discovery in databases (KDD), is the process of extracting implicit, previously unknown, and potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy and random data [2]. Data mining is a cross-discipline; it gathers knowledge from multiple fields, including database technology, artificial intelligence, machine learning, data visualization, pattern recognition, and parallel computing.
Data mining is also a new business information processing technology: according to an enterprise's established business objectives, it extracts, converts, analyzes and otherwise models the large volume of business data in the enterprise database in order to extract the key data that support business decisions; it is an advanced and effective method for revealing hidden, unknown regularities or validating known ones.
In electronic commerce data mining, Web mining is the use of data mining technology to automatically discover and extract interesting and useful patterns and information from WWW resources (Web documents) and behaviors (Web services) [3]. Web data come in three types: HTML-marked Web data, the link-structure data of Web documents, and user access data. According to the corresponding data type, Web mining can be divided into three categories: Web content mining, which selects knowledge from Web documents or their descriptions; Web structure mining, which derives knowledge from the organizational structure and links of the Web, with the purpose of finding authoritative web pages through clustering and analysis of Web links and page structures; and Web usage mining, which discovers the access patterns of users and potential customers, and other such information, by mining the access logs stored on the Web.

2. Data Mining Methodologies in E-Commerce

2.1. Correlation Analysis

Correlation analysis digs out hidden correlations within the dataset. For example, it can analyze the correlation of different items in one online purchase: if the customer buys an item A, the model can predict the probability that the customer buys item B based on the correlation of A and B. The Apriori algorithm is the most commonly used method for correlation analysis [4].
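As a toy sketch of the idea (a simplified pairwise support/confidence count rather than the full Apriori procedure), assuming hypothetical purchase baskets:

```python
from itertools import combinations
from collections import Counter

# hypothetical purchase baskets (illustration only)
baskets = [{"bread", "butter", "milk"}, {"bread", "butter"}, {"milk", "nappies", "beer"},
           {"bread", "milk"}, {"nappies", "beer"}, {"bread", "butter", "milk", "beer"}]

n = len(baskets)
item_count = Counter(i for b in baskets for i in b)
pair_count = Counter(frozenset(p) for b in baskets for p in combinations(b, 2))

for pair, cnt in pair_count.items():
    a, b_ = tuple(pair)
    support = cnt / n
    confidence = cnt / item_count[a]          # confidence of the rule a -> b_
    if support >= 0.3 and confidence >= 0.6:
        print(f"{a} -> {b_}: support={support:.2f}, confidence={confidence:.2f}")
```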

2.2. Cluster Analysis

Cluster analysis is a technique that clustering objects into different groups. It could be
used to cluster customers with similar interests or items with common characteristics.
The most widely used clustering algorithms include: hierarchical clustering, centroid-
based clustering, distribution-based clustering and density-based clustering [5].
Cluster analysis is commonly used in E-Commerce for sub-dividing client groups.
The algorithm could cluster clients into different subgroups by analyzing the
similarities of their consumption patterns. The business owner could then make
different strategies and provide personalized services for different target groups.
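A minimal sketch of such customer segmentation with k-means (scikit-learn), using hypothetical customer features:

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical customer features: [orders per month, average basket value] (illustration only)
X = np.array([[1, 20], [2, 25], [8, 200], [9, 180], [15, 60], [14, 55]])

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(model.labels_)           # cluster (segment) assigned to each customer
print(model.cluster_centers_)  # centroid describing each segment
```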

2.3. Data Categorization

Data categorization is the process of classifying items by analyzing certain of their properties [6]. It solves for the optimal categorization rules based on training data and uses these rules to categorize data outside the training set. The most popular categorization algorithms include genetic algorithms, Bayesian classification and neural networks.
The goal of data categorization is to assign an item to a specific class. It can be used for analyzing existing data as well as for making predictions. The algorithm builds a classification model from existing training data and uses the model to predict the likely reactions of customers with different characteristics, so that the business can provide personalized services for different categories of customers.
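As an illustrative sketch only, using a simple nearest-centroid rule rather than the genetic, Bayesian or neural approaches named above, the following code learns per-class feature means from labelled training customers and classifies a new one; all feature values and labels are invented.

def train_centroids(samples):
    # samples: list of (feature_vector, label); returns {label: mean feature vector}.
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(features, centroids):
    # Assign the label of the closest class centroid (squared Euclidean distance).
    return min(centroids, key=lambda lbl: sum((f - c) ** 2
                                              for f, c in zip(features, centroids[lbl])))

training = [([1, 15], "occasional"), ([2, 20], "occasional"),
            ([7, 120], "loyal"), ([6, 150], "loyal")]        # invented (visits, basket value)
model = train_centroids(training)
print(classify([5, 100], model))                             # -> "loyal"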

2.4. Serial Pattern Analysis

Like correlation analysis, serial pattern analysis identifies correlations between different items, but it focuses on time-series data and makes predictions based on time-series models. For example, it may discover that within a certain time period the purchase pattern of buying A, then B, then C occurs with high frequency [7]. Such high-frequency bundles can be dug out by analyzing the purchasing data.
Serial pattern analysis enables the business to predict the inquiry patterns of customers and then to push advertisements and services that meet customers' demand accordingly.
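A minimal sketch that counts how often the ordered pattern A → B → C appears as a subsequence of each customer's purchase history; the histories are invented, and the check is a simple subsequence test rather than a full sequential-pattern-mining algorithm.

def contains_sequence(history, pattern):
    # True if the items in pattern occur in history in the given order (not necessarily adjacent).
    it = iter(history)
    return all(item in it for item in pattern)

histories = [["A", "D", "B", "C"],   # invented time-ordered purchase histories
             ["B", "A", "C"],
             ["A", "B", "E", "C"]]
pattern = ["A", "B", "C"]
share = sum(contains_sequence(h, pattern) for h in histories) / len(histories)
print(round(share, 2))               # 0.67 of customers follow A -> B -> C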

3. Application of Data Mining in E-Commerce

Data mining is a powerful tool that provides informed guidance in the decision-making process of E-Commerce. It seeks patterns in the sea of unorganized internet traffic and discovers valuable information to support decision making and strategy development.
Data mining is widely used in product positioning and purchasing-behavior analysis to formulate marketing strategy. It can also be applied to forecast the sales market by analyzing purchasing patterns. Currently, the major data companies have started to embed data mining functions into their own products; giants such as IBM and Microsoft, for example, incorporate online analysis functions into their corresponding products. By mining customer information including visit behavior, visit content and visit frequency, an E-commerce recommendation system based on data mining is able to analyze customer features and summarize their visiting patterns in order to offer tailor-made services and product recommendations catering to customer needs.
Data mining techniques can work out the correlations among products by analyzing shopping-cart portfolios and matching them against customers' purchasing behaviors, thereby generating marketing strategies for commodity display, bundled sales and promotion. The major task of correlation analysis is to dig out hidden correlations within the dataset. One example is that the purchase of bread and butter implies the purchase of milk: over 90% of customers who buy bread and butter will purchase milk as well. The business owner can therefore design better item bundles by analyzing the correlation among different goods.

Sales managers at Wal-Mart found the surprising fact that beer and nappies, two apparently unrelated products, were frequently purchased together [8]. The phenomenon was traced to young fathers, who tend to pick up beer when they are sent to the supermarket to buy nappies. This motivated the store to move the beer aisle closer to the nappies, with an immediate increase in sales of both.
In E-commerce, by discovering all such association rules through data mining, online vendors can recommend commodities to customers based on the products already in their shopping carts, thus enhancing cross-selling. Furthermore, by offering personalized commodity information and advertisements, customer interest and loyalty can also be expected to increase.

4. Application Cases

Data mining techniques are used in E-Commerce for analyzing online inquiries, online trades and registration information. The process usually comprises steps such as defining the business scope, data collection, data preprocessing, model construction and evaluation, and output analysis and evaluation [9]. These steps are usually repeated and iterated to obtain more accurate results.
Data mining is playing an increasingly important role in E-Commerce, and there are successful cases of applying data mining theory and technology to it [10]. This section discusses the application of data mining to customer segmentation on Taobao.com. Purchase behavior and sales behavior coexist on the Taobao platform. Experts suggest using the following 15 key factors and weights for classifying customers and predicting their behaviors, as shown in Table 1.

Table 1. Purchase behavior and sales behavior factors and their weights


Influence Factors Weight
Purchase Voluntary phone inquire or onsite help 11.2
Behavior Show interest in product and inquire about promotion 10.3
69% Budget for web promotion 8.5
Have or in the process of hiring trade specialists 8.1
Used to e-commerce 7.5
Respond to EFAX/EDM/Phone Promotion 6.5
Participated in Alibaba conferences such as marketing, training and business development 6.1
Experience with third party B2B web platform 5.4
Experience with overseas trade exhibition or domestic export exhibition 5.3
Sales Attempt to sign sales contract in the coming month 8.1
Behavior In direct competition with competitors 7.5
31% Presence of director, manager or colleague in sales process 5.4
Made proposal to clients 4.4
Client visit in one year 2.9
Open house in one year 2.6

Formula: client score S = Σ(Influence Factor × Weight). From the clustering result we get four tiers of clients: (1) S ≥ 50, 90% potential client; (2) 23 ≤ S < 50, 50% potential client; (3) 11 ≤ S < 23, 25% potential client; (4) 0 ≤ S < 11, first-time visit client.
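A small sketch of this scoring rule and the four tiers; only a few of the fifteen factors from Table 1 are included, and the 0/1 factor flags of the example client are invented.

WEIGHTS = {                      # factor name -> weight, a subset of Table 1
    "phone_inquiry_or_onsite_help": 11.2,
    "interest_and_promotion_inquiry": 10.3,
    "web_promotion_budget": 8.5,
    "contract_attempt_next_month": 8.1,
}

def client_score(factors):
    # S = sum(influence factor * weight); factors maps factor name -> 0/1 indicator.
    return sum(WEIGHTS[name] * value for name, value in factors.items())

def tier(score):
    # Map the score onto the four client tiers described above.
    if score >= 50:
        return "90% potential client"
    if score >= 23:
        return "50% potential client"
    if score >= 11:
        return "25% potential client"
    return "first-time visit client"

example = {"interest_and_promotion_inquiry": 1, "web_promotion_budget": 1}  # invented client
s = client_score(example)
print(s, tier(s))   # 18.8 "25% potential client"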
By segmenting current customers and studying their responses to existing
marketing and promotion strategies, companies could design more targeted strategies
on how to communicate to each segment of customers.

5. Conclusion

E-Commerce is developing rapidly and generating enormous amounts of data to analyze. Data mining enables businesses to predict market trends and customer behavior; it also helps to provide personalized services and push personalized advertisements. Businesses can increase revenue by forming better strategies with the help of data mining analysis. Data mining in E-Commerce will see further development with progress in hardware technology and algorithm research and with the accumulation of application experience.

References

[1] S. Z. Zhang, X. K. Qu, L. Zhang, Research on the Web data mining based on Electronic-Commerce,
Modern Computer, 03(2015),12–17.
[2] H. M. Wu, Sales data mining technology and e-commerce application research, Guangdong University of
Technology,2014
[3] Y. N. Zhang. Application of web data mining in e-commerce. Fujian computer, 05(2013),138-140
[4] J. X. Wu, Research on web data mining and its application in E-Commerce, Information System
Engineering, 01 (2010), 15–18.
[5] X. J. Chen, Research on data mining in electronic commerce, Information and Computer, 05(2014),135
[6] H. Y. Lu, Application of data mining techniques in e-commerce, Network and Information Engineering
(2014),73-75
[7] L. Huang, Research on the application of Web data mining in e-commerce, Hunan University,2014
[8] Y. Gao. Beer and diapers. Tsinghua University press, 2008
[9] S. Liu, Application of Web data mining technology for e-commerce analysis, Electronic technology and
software engineering, 07(2014),216-7
[10] China statistics web. Application of data mining in e-commerce.
http://www.itongji.cn/datamining/hangye/dianzishangwuzhongshujuwajuefangfadeyingyong/ 2010
Fuzzy Systems and Data Mining II 71
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-71

A Fuzzy MEBN Ontology Language Based


on OWL2
Zhi-Yun ZHENG, Zhuo-Yun LIU, Lun LI, Dun LI1 and Zhen-Fei WANG
School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China

Abstract. With the rapid development of Semantic Web research, the demand for representation of and reasoning with uncertain information increases. Although ontologies are capable of modeling the semantics and knowledge in knowledge-based systems, classical ontology languages are not appropriate for dealing with uncertainty in knowledge, which is inherent in most real-world application domains. In this paper, we address this issue by extending the expressive power of a current ontology language: we propose a Fuzzy Multi-Entity Bayesian Networks ontology language which extends PR-OWL and is based on the combination of Fuzzy MEBN and ontology, define and study its syntax and semantics, and show how domain knowledge is represented by RDF graphs. The proposed language, Fuzzy PR-OWL, moves beyond the current limitation of PR-OWL in modeling knowledge with fuzzy semantics or fuzzy relations. By providing a principled means of uncertainty representation and reasoning, Fuzzy PR-OWL can serve many applications involving fuzzy and probabilistic knowledge.

Keywords. the Semantic Web, Fuzzy MEBN, ontology language, PR-OWL

Introduction

With the rapid development of information technology, the techniques of data collection, data storage and high-performance computing have improved significantly. According to some recent surveys, the amount of data around the world doubles every 20 months. The mountainous amounts and various types of data complicate the relations among data. To enable computers to automatically process and integrate valuable data from the Internet, the Semantic Web, which aims at seamless interoperability and information exchange among web applications and at rapid and accurate identification and invocation of appropriate web services [1], was put forward.
Nevertheless, there are several immature aspects in this area which need further
improvement. Specifically, as semantic services become more ambitious, there is
increasing demand for principled approaches to the formal representation under
uncertainty, such as incompleteness, randomness, vagueness, ambiguity and
inconsistency [2]. All these require reasonable semantic expression and enhanced
semantic inference. However, there are not enough existing theories and practices to
solve these problems well.

1
Corresponding Author. Dun LI, School of Information Engineering, Zhengzhou University,
Zhengzhou 450001, China ; E-mail: ielidun@zzu.edu.cn; iedli@zzu.edu.cn.

Multi-Entity Bayesian Networks (MEBN) [3] is a theoretically rich language that expressively handles semantic analysis and effectively models uncertainty. Albeit practically useful in many respects, MEBN lacks the capability of modeling fuzzy knowledge and concepts. To address this problem, Fuzzy MEBN (Fuzzy Multi-Entity Bayesian Networks) [4-5] has been proposed in recent years; it is able to deal with ambiguous semantics and uncertain causal relationships between knowledge entities [5]. In this paper, we present an ontology-based Fuzzy MEBN solution termed Fuzzy PR-OWL (Fuzzy Probability Web Ontology Language), an extension to OWL2 [6]. It is an attempt at modeling both probabilistic and fuzzy information by ontology.
The structure of the rest of this paper is as follows. Section 1 comparatively analyzes BN and MEBN and illustrates the advantages of MEBN as well as Fuzzy MEBN. The Fuzzy MEBN ontology (Fuzzy PR-OWL) is then illustrated in Section 2. Section 3 presents the representation of a domain ontology using Fuzzy PR-OWL in RDF graph form. Finally, we set out some conclusions along with future work in Section 4.

1. Related Research

1.1. BN and MEBN

The main models for uncertainty representation and reasoning on the semantic web are probabilistic and Dempster-Shafer models, and fuzzy and possibilistic models [1]. The representative probabilistic models are BN and MEBN, and the ontology languages based on them are BayesOWL [7] and PR-OWL2 [8-9].
BN has the ability to deal with uncertain and probabilistic events and incomplete data sets according to causality or other types of relationships between events. However, standard BNs have limitations in representing relational information. Figure 1a shows a BN that represents probabilistic knowledge about bronchitis: smoking may cause bronchitis, and colds, which may be incurred by factors like bad weather, can also lead to airway inflammation. The BN clearly shows the causation of the patient's illness, but it cannot represent relational information such as the effect of harmful gas produced by other people's smoking on the patient. MEBN, in contrast, takes advantage of first-order logic, which allows it to overcome these limitations of BN. Figure 1b, where ovals represent resident nodes, trapezoids input nodes and pentagons context nodes, shows that person and other are entities of the class Person, and the context rule other = peopleAround(person), which may link to another MFrag, defines other as the people around the person. MEBN can thus represent the relationship between entities and take the effect of others' smoking on the probability of the patient having bronchitis into account via the parent node getCold(other).
In reality, however, human experience and knowledge are characterized by fuzziness that cannot be handled by MEBN. In the example above, the impact of a slight cold must differ from that of a bad cold. Although MEBN can represent the possibility of getting a cold, for instance getCold{true 1, false 0} where 1 and 0 represent probabilities, it cannot represent the degree of the cold. Another situation concerns resident-node values that are states. For example, suppose the weather has two states {sunny, cloudy}. MEBN assigns probabilities to these states, say {sunny 0.5, cloudy 0.5}, but situations like partly cloudy cannot be dealt with by MEBN.

Figure 1. Representing bronchitis knowledge in BN and MEBN

1.2. Fuzzy MEBN

Fuzzy MEBN redefines the semantic specifications of normal MEBN by incorporating concepts of First-Order Fuzzy Logic (FOFL) [10]. The contextual constraints of MEBN are thereby generalized so as to represent the ambiguity that usually accompanies imperfect semantic information. Moreover, Fuzzy MEBN upgrades the regular BN of MEBN to Fuzzy Bayesian Networks (FBN). Therefore the fuzzy or ambiguous information of Section 1.1, which MEBN lacks the capability to process, can be dealt with by Fuzzy MEBN. For example, a slight cold can be represented as {true$_{0.3}$ 1, false$_{0.7}$ 0}, where the subscripts denote truth values, and partly cloudy can be set as {sunny$_{0.6}$ 0.5, cloudy$_{0.4}$ 0.5}, where the subscripts denote membership degrees.
The major differences between Fuzzy MEBN and MEBN are that, in Fuzzy MEBN, phenomenal (non-logical) constant symbols and entity identifier symbols are followed by a real-valued membership-degree subscript in the interval [0,1], such as Vehicle$_{0.85}$ and !V428$_{0.75}$, and truth-value symbols or logical findings are assigned a truth value from the predefined finite chain of truth values $L = \langle a_1, a_2, \ldots, a_n \rangle$.
The building blocks of a MEBN Theory (MTheory) are MEBN Fragments (MFrags), which semantically and causally represent a specific notion of the knowledge. The basic model of Fuzzy MEBN is similar to that of regular MEBN. An FMFrag can define a probability distribution and some fuzzy rules for a resident node given its input/parent and context nodes.
A Fuzzy MFrag (FMFrag) [5] F = (C, I, R, G, D, S) consists of three kinds of nodes: context nodes C, input nodes I and resident nodes R. Context nodes use FOFL sentences to represent the semantic structure of knowledge; input nodes connect to resident nodes of other FMFrags; and resident nodes are random variables conditional on the values of the context and input nodes. Besides, G is the FMFrag graph, D contains the local distributions, one for each resident node, and S is the set of fuzzy if-then rules used by the Fuzzy Inference System (FIS). It is worth noting that the sets C, R and I are pairwise disjoint, that the graph G is a directed acyclic graph whose nodes belong to I ∪ R, and that its root nodes correspond to members of I only.
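As an illustrative data-structure sketch (not part of the Fuzzy PR-OWL specification), an FMFrag F = (C, I, R, G, D, S) can be held in a small Python container; the field names mirror the six components above, and the example node names follow the EngineStatus use case of Section 3.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class FMFrag:
    # F = (C, I, R, G, D, S): context, input and resident nodes, graph, local distributions, fuzzy rules.
    context_nodes: List[str]                                      # C: FOFL context constraints
    input_nodes: List[str]                                        # I: links to resident nodes of other FMFrags
    resident_nodes: List[str]                                     # R: random variables of this fragment
    graph: List[Tuple[str, str]] = field(default_factory=list)    # G: directed edges over I and R
    distributions: Dict[str, dict] = field(default_factory=dict)  # D: local distribution per resident node
    fuzzy_rules: List[str] = field(default_factory=list)          # S: if-then rules used by the FIS

engine_status = FMFrag(
    context_nodes=["isA(m, Machine)", "isA(r, Room)", "r = MachineLocation(m)"],
    input_nodes=["BeltStatus(b)", "RoomTemp(r)"],
    resident_nodes=["EngineStatus(m)"],
    graph=[("BeltStatus(b)", "EngineStatus(m)"), ("RoomTemp(r)", "EngineStatus(m)")],
    fuzzy_rules=["If BeltStatus is Normal OK then degreeOf(EngineStatus) is {Overheated 0.5}"],
)
print(engine_status.resident_nodes)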

2. Fuzzy PR-OWL

2.1. Elements

Figure 2 shows the classes of the ontology language Fuzzy PR-OWL created with Protégé 4.1 [11]. Fuzzy PR-OWL extends PR-OWL with properties and classes such as fuzzy random variables, fuzzy states, membership degrees and fuzzy rule sets (FRS) to increase its expressive power.
Figure 2. Elements of Fuzzy PR-OWL (class diagram; the classes are grouped into main classes/elements, subclasses, built-in elements and reified relationships)


Table 1 presents corresponding relationships between the elements of Fuzzy
MEBN, FOFL [12] and FuzzyPR-OWL. As shown in Table 1, the ontology proposed
in this paper can be represented as the sentence of Fuzzy MEBN based on FOFL.
Table 1. Corresponding relationships between FOFL, Fuzzy MEBN and Fuzzy PR-OWL

Fuzzy MEBN | FOFL | Fuzzy PR-OWL
Quantifiers | Symbols for the general and existential quantifiers ∀, ∃ | Class: Quantifier
Ordinary variable symbols | Variables x, y, … | Class: OrdinaryVariable
Phenomenal constant symbols | Constants c, d, … | Class: ConstantArgument
Truth value symbols | Symbols for truth values: a | Class: TrueValueRandomVariable
Entity identifier symbols | | Data Property: hasUID (Range: Thing, Domain: string)
Logical connectives | Binary connectives ∨*, ∧*, &*, ⇒*, equality operator = | Class: FLogicalOperator
Findings | | Class: FindingFMFrag, FindingResidentNode
Domain-specific random variables: logical random variables | n-ary predicate symbols p, q, … | Class: TrueValueRandomVariable
Domain-specific random variables: phenomenal random variables | n-ary functional symbols f, g, … | Class: FRandomVariable

2.2. Syntax

An overview of the basic model of Fuzzy PR-OWL is given in Figure 3. In this diagram, an oval represents a general class and an arrow represents a major relationship between classes. A probabilistic ontology has at least one individual of class FMTheory, which contains a group of FMFrags. In the syntax of Fuzzy PR-OWL this link is expressed via the object property hasFMFrag.
Individuals of class FMFrag are comprised of nodes. Each individual of class
Node is a random variable. Compared with PR-OWL, the major difference of this
ontology is using the FRS to define membership degrees of fuzzy states. The object
property hasFRS links one node to one or many FRS. Represented by class Probability
Distribution, the unconditional or conditional probability distributions of the random
variable are linked to its respective nodes via the object property hasProbDist. Finally,
logical expressions based on FOFL or simple expression that can describe random
variables in which arguments that may refer to entities are represented by class
FMExpression and linked to nodes via the object property hasFMExpression.
Figure 3. Basic model of Fuzzy PR-OWL (an FMTheory includes FMFrags via hasFMFrag; an FMFrag is built from Nodes via hasNode; a Node has context constraints via hasFMExpression, is defined by a Probability Distribution via hasProbDist, and has rules via hasFRS linking it to a Fuzzy Rule Set)


The syntax of Fuzzy PR-OWL extends the abstract syntax of OWL. The syntax rules are defined in Extended Backus-Naur Form (EBNF), where the definition symbol is ::=, terminal symbols are enclosed in quotation marks and followed by a semicolon as terminating character, alternatives are separated by the vertical bar |, optional parts that may appear at most once are enclosed in square brackets [...], and parts that may be omitted or repeated are enclosed in curly braces {...}. In this paper, expressions that appear once or repeat are written with curly braces {...}+, and an FMTheory needs a URI reference to be identified. The fundamental structure of Fuzzy PR-OWL is as follows:
FMTheory ::= ‘FMTheory(’ [URI reference] |annotation| {FMFrag}+ ‘)’;
FMFrag ::= ‘FMFrag(’ FMFrag_id ‘,’ {Node}+ ‘,’{ ParentRel} ‘)’;
Node ::= ‘Node (’ Node_id ‘,’ FMExpression [‘,’ ProbilityDistribution ‘,’ { If-ThenRule } ] ‘)’;
ParentRel ::= ‘hasParent (’ Node ‘,’ Node ’)’;
*_id ::=’ UID(‘ letter { letter | digit } ’)’;
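For illustration, a hypothetical FMTheory written in this abstract syntax (identifiers and the parent relation are invented, loosely following the EngineStatus use case of Section 3):

FMTheory( http://example.org/EquipmentDiagnosis
  FMFrag( UID(EngineStatusFMFrag),
    Node( UID(BeltStatus), BeltStatusExp ),
    Node( UID(RoomTemp), RoomTempExp ),
    Node( UID(EngineStatus), EngineStatusExp, EngineStatusTable, If-ThenRule1 ),
    hasParent( Node(UID(EngineStatus), EngineStatusExp), Node(UID(BeltStatus), BeltStatusExp) )
  )
)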
In the literature [2], an FRS can be defined in the form of if-then rules. For example, the conditional probability of a variable with two parents can be written as $P(V = v_1 \mid P_1 = p, P_2 = q)$; in if-then form, such a relation can be stated as 'If $P_1$ is $p$ and $P_2$ is $q$, then $V$ is $v_1$', wherein all the states are fuzzy states, denoted $\tilde{S} = [\{s_i\}^{\mu_i}]$ where $s_i$ and $\mu_i$ are the probability and the membership degree of the $i$th state, respectively. According to the FBN formulation of MEBN in the literature [4], however, the FRS only defines membership degrees. Therefore, in this paper probabilities and membership degrees are defined through conditional distributions and if-then rules, respectively. Next we present the models of conditional probability distributions, FRS and fuzzy expressions, which show the syntactic structure of Fuzzy PR-OWL.

• Conditional Probability Distributions
A node's probability distribution depends on the state configuration of its parents. PR-OWL2 uses strings to represent probability distributions, but that approach needs a syntactic parser to analyze the declarative syntax. To embed more probability information into the ontology, this paper describes the probability distribution itself by ontology.
In Fuzzy PR-OWL, the class ProbabilityAssignment indicates the assignment of probabilities conditioned on the parents' states, which are represented by the class ConditioningState. The class StateAssignment indicates the assignment of a state, such as the state name and probability, as illustrated in Figure 4. The basic structure is defined below:
ProbabilityDistribution ::= FPR-OWLTable;
FPR-OWLTable ::= ‘FPR-OWLTable(’PRTable_id‘,’{ProbabilityAssignment }+‘)’;
ProbabilityAssignment ::=‘ProbabilityAssignment(’ProbabilityAssignment_id‘,’
{StateAssignment}+ [’,’ { CondingtioningState}+ ] ‘)’;
StateAssignment ::= ‘State (stateName(’ string’)’ [‘, stateProbability (’float’)’] ‘)’;
CondingtioningState ::= ‘CondState(’ Node_id ‘,’ StateAssignment ‘)’;
string ::= letter { letter | digit };
Figure 4. The model of conditional distributions
Figure 5. The model of FRS


• FRS
Fuzzy PR-OWL adopts If-Then Rules to define the FRS and constrain the membership degrees of fuzzy states. As shown in Figure 5, an If-Then Rule of a resident node may include one or more If-Parts and a Then-Part. Every instance of If-Part corresponds to an assumption on a parent node, and the instance of Then-Part corresponds to the assignment of fuzzy states in the resident node. The structure of an If-Then Rule is defined below:
FRS ::= If-ThenRule;
If-ThenRule ::= ‘If-ThenRule(’ If-ThenRule_id ‘, ’ {If-Part}+ ‘,’ Then-Part ‘)’;
If-Part ::= ‘if (’Node_id‘, ’ StateAssignment ‘)’;
Then-Part ::= ‘then(’ {StateAssignment}+ ’)’;
StateAssignment ::= ‘State (stateName(’ string’)’ [‘,’MembershipDegree] ‘) ’;
MembershipDegree ::=’Membership( degree(’ float ‘)’[‘ ,descript(’ string’)’];
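As a hypothetical instance of these productions (loosely following the FRS of the use case in Section 3, where BeltStatus being OK gives EngineStatus the state Overheated with membership degree 0.5):

If-ThenRule( If-ThenRule1,
  if( BeltStatus, State(stateName(OK), Membership(degree(1.0), descript(Normal))) ),
  then( State(stateName(Overheated), Membership(degree(0.5))) )
)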
Figure 6. The model of fuzzy expression



• Fuzzy Expression
As shown in Figure 6, this part proposes the model of fuzzy expressions, which can represent constraints or fuzzy relationships between entities.
An expression represents a relationship between entities in Fuzzy PR-OWL; the class FMExpression can represent either the truth-value expression of a context node or the simple expression of other kinds of nodes. The former corresponds to logical expressions based on FOFL, and the latter can be viewed as random variables of input or resident nodes with some arguments. The class Exemplar indicates the general or existential quantifiers of a fuzzy expression, in Skolem form. The structure of a fuzzy expression is defined below:
FMExpression::= ‘FMExpression(’FMExpression_id‘,’[ exits|forAll Exemplar_id ’,’] Expression ‘)’;
Expression::= Term [“and”|”or” Term] [“=”Term] [”implies”|’iff’ Term];
Term::=[“not”] RandomVariable_id [’(’Argument_id{,Argument_id}+’)’] | FMExpression_id|
OrdinaryVariable_id;
RandomVariable::=‘RandomVariable(’ RandomVariable_id ‘, hasPossibleValues(’ {URI
reference}+’)’ [‘,defineUcertaintyOf(’URI’)’] [‘,probDistr(‘PrTable_id’)’]
[‘,trueValue(‘float’)’]’)’;
OrdinaryVariable::=‘OrdinaryVariable(’ OrdinaryVariable_id ‘, ( class(’ DomainClass_URI ’))’;
Argument ::= ‘Argument(’Argument_id‘,’[‘type(’Thing‘)’][‘,typeOfData(’ Literal’)’ ]
[‘,‘MembershipDegree ‘])’;
Exemplar ::= 'Exemplar(' Exemplar_id ',' ['type(' Thing ')'] [',typeOfData(' Literal ')'] [',' MembershipDegree] ')';
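As a simplified, hypothetical instance of these productions for the context-node expression r = MachineLocation(m) of Section 3 (identifiers invented; argument lists shortened for readability):

FMExpression( FMExpression_CX1, Enginestate_Mfrag.room = MachineLoc_FMExp )
FMExpression( MachineLoc_FMExp, MachineLocation(Enginestate_Mfrag.machine) )
OrdinaryVariable( Enginestate_Mfrag.room, ( class(Room) ) )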

2.3. Semantic

The structure of the Fuzzy PR-OWL language $L_F$ based on Fuzzy MEBN is defined by the interpretations of FOFL $T_F$ [11]. The structure $D = \langle D_I;\ \bar{P}_1, \ldots;\ \bar{f}_1, \ldots;\ \bar{u}, \bar{v}, \ldots \rangle$ is a 4-tuple with the following components:
• $D_I$ is a nonempty set called the domain of the structure;
• $\{\bar{P}^n_i\}$ are n-ary relations adjoined to each n-ary predicate symbol $\{P^n_i\}$;
• $\{\bar{f}^n_i\}$ are n-ary (ordinary) functions defined on $D_I$ and adjoined to each n-ary functional symbol $\{f^n_i\}$;
• $\bar{u}, \bar{v}, \ldots \in D_I$ are elements which are assigned to the constants $u, v, \ldots$ of the language $L_F$.
Assume that $L_F$ contains one constant $d \in H_I$ associated with each element $\bar{d} \in D_I$ (a name of $\bar{d}$). Let $u \in H_I$ be a constant; then its interpretation is an element $I(u) \in D_I$. Let $\bar{f}^n$ be the function assigned to $f^n$ and $t_1, t_2, \ldots, t_n$ be terms without variables; then $I(f^n(t_1, t_2, \ldots, t_n)) = \bar{f}^n(t_1, t_2, \ldots, t_n)$.
The fuzzy function, defined in fuzzy set theory, can be regarded as a special fuzzy relation. Note that functional symbols are introduced for the sake of completeness, since they can be replaced by special predicates [11]. Considering the corresponding relationships between the elements of $L_F$ and those of $T_F$ shown in Table 1, the definition of $D$ can be further illustrated as follows.
• $L_F$ uses entity identifier symbols $E$ to identify the entities or elements which are assigned to the constants.
• The phenomenal random variables and logical random variables of $L_F$ represent the fuzzy functions and predicates, respectively. The possible values of the former are $E \cup \{\bot\}$, and those of the latter are either a real number $a \in [0,1]$ or a member of the chain $L = \langle a_1, a_2, \ldots, a_n \rangle$.
The random variables mentioned here can be represented as the expressions of Section 2.2. The probabilities and the membership degrees of the possible values of a function are assigned by the joint probability distribution and by the If-Then rules, respectively. $L_F$ uses a phenomenal random variable with n-ary arguments to represent a function. The function $\bar{f}: \Delta \to \Gamma$ maps a vector of entity identifier symbols $\Delta = \langle e_1^{(\mu_1)}, e_2^{(\mu_2)}, \ldots, e_n^{(\mu_n)} \rangle$, i.e. the input arguments, into a vector of identifier symbols $\Gamma = \langle s_1^{(\nu_1)}, s_2^{(\nu_2)}, \ldots, s_m^{(\nu_m)} \rangle$, i.e. the fuzzy state or fuzzy value assignment, where the membership values for the various arrangements of arguments and possible values are predefined in the language by the fuzzy interpretation of $\bar{f}$. This can also be represented as the fuzzy relation [12] yielding the truth value of a relation over the input set, that is, $R: \langle \Delta, \Gamma \rangle \to a \in \{a_1, a_2, \ldots, a_n\}$. By matching domain entity identifier symbols with domain entities, the function or relation maps the n-ary vector of domain entities into entities, for phenomenal random variables, or into truth values of domain assertions, for logical random variables.

3. Use Case

In the Equipment Diagnosis problem, the belt status and the room temperature can affect the engine status. The problem, represented by an EngineStatus FMFrag, is shown in Figure 7. In the figure, isA(Machine, m) states that m is an instance of Machine, and EngineStatus(m), BeltStatus(b) and RoomTemp(r) represent the engine status of machine m, the status of belt b and the temperature of room r, respectively. Suppose that the engine status node has the local distribution shown in Table 2, where superscripts denote membership degrees.
Table 2 Local distribution of EngineStatus FMFrag
RoomTemp(r) BeltStatus(b) EngineStatus(m)
(Normalα1;Hotα2) (OKβ1;Brokenβ2) Satisfactoryα1 Overheatedα2
Normal OK 0.8 0.2 0
Normal Broken 0.6 0.4 0
… … … … …

Figure 7. EngineStatus FMFrag (within the EquipmentDiagnosis FMTheory: context nodes isA(m,Machine), isA(r,Room), isA(Belt,b), m=BeltLocation(b) and r=MachineLocation(m); input nodes BeltStatus(b) and RoomTemp(r); resident node EngineStatus(m); the expression r=MachineLocation(m) links to the MachineLocation FMFrag)


The representation of the probability distribution of EngineStatus in Fuzzy PR-OWL is shown in Figure 8. The upper and lower large parallelograms represent the Fuzzy PR-OWL ontology and the domain ontology, respectively. The figure shows part of the information of Table 2, namely the probability distribution of the node EngineStatus, including the probability assignment of states such as Overheated when the conditioning state of the parent BeltStatus is OK.
Figure 8. Representation of the probability distribution


The FRS of EngineStatus, which denotes the membership degrees of its states conditioned on the state assignments of its parent nodes, is shown in Figure 9. The RDF graph shows that if the state of the parent node BeltStatus is Normal OK (supposing the words that describe degree include Very, Normal and A little), then the membership degree of the state Overheated of EngineStatus is 0.5.
Figure 9. Representation of FRS


Figure 10. Representation of fuzzy expression


The fuzzy expression r = MachineLocation(m) of the context node is shown in Figure 10. The expression defines the relation between room r and machine m, which is connected to another FMFrag, MachineLocation. The dark ovals constitute the main parts of the expression, including the logical connective equalTo with a truth value, and the arguments CX1_1 and CX1_2, which correspond respectively to the ordinary variable room of the EngineStatus FMFrag and the random variable MachineLocation(m) of the BeltLocation FMFrag.

4. Conclusion

The representation of and reasoning with uncertain knowledge is one of the goals of the Semantic Web area, and probabilistic ontology languages based on OWL2 are envisioned as an important approach to achieving this goal. In view of the weakness of current ontology languages, namely the lack of the ability to model probabilistic and fuzzy knowledge simultaneously, this paper proposed the Fuzzy PR-OWL ontology language based on Fuzzy MEBN, which adds to PR-OWL2 the expressive power needed for widespread fuzzy knowledge in related domains. The domain cases in the last part show that Fuzzy PR-OWL can represent the probabilistic or fuzzy information of a specific domain well.
As for future work, we intend to construct a reasoning frame for Fuzzy PR-OWL
by studying more about FOFL and fuzzy BN theory and improve it continuously.

Acknowledgment

This work was funded by the key scientific and technological project of Henan
Province (162102310616)

References

[1] P. Michael, Uncertainty Reasoning for the Semantic Web III, Springer International Publishing, 2013.
[2] K. J. Laskey and K. B. Laskey, Uncertainty Reasoning for the World Wide Web: Report on the
URW3-XG Incubator Group, International Workshop on Uncertainty Reasoning for the Semantic Web,
Karlsruhe, Germany, 2008.
[3] K. B. Laskey, MEBN: A language for first-order Bayesian knowledge bases, Artificial
Intelligence, 172(2008):140-178.
[4] K. Golestan, F. Karray, and M. S. Kamel, High level information fusion through a fuzzy extension to
Multi-Entity Bayesian Networks in Vehicular Ad-hoc Networks, International Conference on
Information Fusion, (2013):1180-1187.
[5] K. Golestan, F. Karray, and M. S. Kamel, Fuzzy Multi Entity Bayesian Networks: A Model for Imprecise
Knowledge Representation and Reasoning in High-Level Information Fusion, IEEE International
Conference on Fuzzy Systems, (2014):1678-1685.
[6] P. Hitzler, et al, OWL2 Web Ontology Language Primer(Second edition) (2015).
[7] Z. L. Ding, and Y. Peng, A Probabilistic Extension to Ontology Language OWL, Hawaii International
Conference on System Sciences, 4(2004):40111a-40111a.
[8] P. C. Costa, G. Da, K. B. Laskey and K. J. Laskey, PR-OWL: A Bayesian Ontology Language for the
Semantic Web, Uncertainty Reasoning for the Semantic Web I:, ISWC International Workshop, URSW
2005-2007, Revised Selected and Invited Papers, (2008):88-107.
[9] N. C. Rommel, K. B. Laskey, and P. C. G. Costa, PR-OWL2.0 – Bridging the Gap to OWL
Semantics, Uncertainty Reasoning for the Semantic Web II, Springer, Berlin Heidelberg, (2013):1-18.
[10] V. Novák, On the syntactico-semantical completeness of first-order fuzzy logic, Kybernetika
-Praha- 2(1990):47-66.
[11] N. F. Noy, et al, Creating semantic web contents with protégé-2000, IEEE Intelligent Systems, 16
(2001): 60–71.
[12] W. Gueaieb, Soft computing and intelligent systems design: Theory, tools and applications, Neural
Networks IEEE Transactions on, 17(2004):825-825.
Fuzzy Systems and Data Mining II 81
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-81

State Assessment of Oil-Paper Insulation


Based on Fuzzy Rough Sets
De-Hua HE1, Jin-Ding CAI, Song XIE, Qing-Mei ZENG
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, China

Abstract. The return voltage method (RVM) is a good method for studying the aging state of transformer insulation, but it is difficult to accurately assess the insulation aging state from a single characteristic quantity. In this paper, fuzzy rough sets theory combined with RVM is proposed to assess the oil-paper insulation state of transformers, and an assessment system for transformer oil-paper insulation is constructed on the basis of a large quantity of test data. First, the evaluation indices of the transformer oil-paper insulation status are established from the return voltage characteristic parameters. Then, the fuzzy c-means clustering algorithm is used to obtain the membership functions of the transformer test data along with the fuzzy partition of the characteristics. Moreover, the fuzzy attributes of the oil-paper insulation assessment table are reduced according to the discernibility matrix, and the evaluation rules for the oil-paper insulation condition are extracted. Finally, the examples in this paper demonstrate that the assessment system is effective and feasible, which provides a new idea for the assessment of the transformer oil-paper insulation state. The research has practical value in engineering applications.

Keywords. Return voltage, fuzzy rough sets, fuzzy C means clustering

Introduction

Transformers play a vital role in the whole electrical power system. Because a large number of transformers within electric utilities are approaching the end of their design life, there is currently growing interest in the condition assessment of transformer insulation. The degradation of the main insulation system of a transformer is recognized to be one of the major causes of transformer breakdown [1-3].
Methods based on the analysis of electrical polarization in dielectrics are often used in the diagnostics of the oil-paper insulation state, and three parameters are customarily selected to assess the oil-paper insulation [4-5]. However, because the characteristics of insulation aging are affected by a variety of factors, it is difficult to accurately assess the insulation aging state from a single feature. The grey correlation method was introduced for insulation condition assessment [6], but it did not consider the redundant characteristics present in the condition assessment of oil-paper insulation, and the assessment process is complicated.
In this paper, fuzzy rough set theory is introduced and multiple characteristics are considered to comprehensively assess the condition of oil-paper insulation. The method handles the problem that part of the information is incomplete or unknown.

1
Corresponding Author: De-Hua HE, College of Electrical Engineering and Automation, Fuzhou
University, Fuzhou, China; E-mail:153367542@qq.com.

The fuzzy C-means clustering algorithm (FCM) is used to discretize the important data categories and form the classification attributes [7]. The characteristic fuzzy rules and the insulation assessment system are established on the basis of the historical database.

1. Theory of Fuzzy Rough Sets

Rough set theory is a powerful tool for dealing with vague and uncertain information. The basic idea of the fuzzy rough model is that a fuzzy similarity relation is used to construct the fuzzy lower and upper approximations of a decision. The sizes of the lower and upper approximations reflect the discriminating capability of a feature subset, and the union of the fuzzy lower approximations forms the fuzzy positive region of the decision. Let the universe U be a finite nonempty set of objects. Each object in U is described by a set of attributes, denoted by A. The pair (U, A) is an information system (IS), where for every subset P ⊆ A there exists an associated similarity relation; $\mu_{R_P}(x, y)$ denotes the similarity of objects x and y induced by the subset of features P. Given X ⊆ U, X can be approximated by the information contained in P through the construction of the P-lower and P-upper approximations of X as defined in Eq. (1):

$\mu_{\underline{R_P}X}(x) = \inf_{y \in U} I(\mu_{R_P}(x,y), \mu_X(y)), \qquad \mu_{\overline{R_P}X}(x) = \sup_{y \in U} T(\mu_{R_P}(x,y), \mu_X(y))$   (1)

where I is the fuzzy implicator, T is the t-norm, and $R_P$ is the fuzzy similarity relation induced by the subset of features P. The degree of similarity of objects with respect to a subset of features is constructed using Eq. (2):

$\mu_{R_P}(x,y) = T_{a \in P}\{\mu_{R_a}(x,y)\}$   (2)

where $\mu_{R_a}(x,y)$ is the degree to which objects x and y are similar for feature a. A quality measure termed the fuzzy-rough dependency function $\gamma_P(Q)$ measures the dependency between two sets of attributes P and Q and is defined by:

$\gamma_P(Q) = \dfrac{|\mu_{POS_{R_P}(Q)}|}{|U|} = \dfrac{\sum_{x \in U} \mu_{POS_{R_P}(Q)}(x)}{|U|}$   (3)

where the fuzzy positive region, which contains all objects of U that can be classified into classes of U/Q using the information in P, is defined as:

$\mu_{POS_{R_P}(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{R_P}X}(x)$   (4)
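A minimal numeric sketch of Eqs. (1) and (2), assuming the Łukasiewicz implicator I(a, b) = min(1, 1 − a + b) and the minimum t-norm; the similarity values and the fuzzy decision set are invented toy data, not transformer measurements.

U = ["x1", "x2", "x3"]
# Invented fuzzy similarity relations R_a for two features (symmetric; missing pairs looked up reversed).
R = {"a1": {("x1", "x1"): 1.0, ("x1", "x2"): 0.8, ("x1", "x3"): 0.2,
            ("x2", "x2"): 1.0, ("x2", "x3"): 0.3, ("x3", "x3"): 1.0},
     "a2": {("x1", "x1"): 1.0, ("x1", "x2"): 0.6, ("x1", "x3"): 0.1,
            ("x2", "x2"): 1.0, ("x2", "x3"): 0.4, ("x3", "x3"): 1.0}}

def sim(x, y, features):
    # Eq. (2): mu_RP(x, y) = T over features of mu_Ra(x, y), with T = min.
    return min(R[a].get((x, y), R[a].get((y, x), 0.0)) for a in features)

X = {"x1": 1.0, "x2": 1.0, "x3": 0.0}   # invented fuzzy decision set mu_X

def lower(x, features):
    # Eq. (1): inf over y of I(mu_RP(x, y), mu_X(y)), Lukasiewicz implicator.
    return min(min(1.0, 1.0 - sim(x, y, features) + X[y]) for y in U)

def upper(x, features):
    # Eq. (1): sup over y of T(mu_RP(x, y), mu_X(y)), T = min.
    return max(min(sim(x, y, features), X[y]) for y in U)

for x in U:
    print(x, round(lower(x, ["a1", "a2"]), 2), round(upper(x, ["a1", "a2"]), 2))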

2. Attributes Reduction Based on Rough Sets Theory

Not all attributes are necessary for the assessment of the oil-paper insulation system; removing the extra features and redundant vague linguistic entries of the attribute values does not affect the original oil-paper insulation diagnosis. The discernibility matrix can be used to reduce the condition attributes and attribute values. The specific reduction steps are as follows:
1. Calculate the similarity relation of the fuzzy attributes $C_k$:

$R_k(x_i, x_j) = \begin{cases} \min\{C_k(x_i), C_k(x_j)\} & C_k(x_i) \neq C_k(x_j) \\ 1 & C_k(x_i) = C_k(x_j) \end{cases}$   (5)

2. Calculate the overall fuzzy similarity relation $Sim(R) = \bigcap \{R \mid R \in \mathbf{R}\}$.
3. Calculate the discernibility matrix of the evaluation system $M(U, \mathbf{R}) = (c_{ij})_{n \times n}$:

$c_{ij} = \begin{cases} \{R_k : 1 - R_k(x_i, x_j) \geq \lambda_i\} & \lambda_i \geq \lambda_j \\ \varnothing & \lambda_i < \lambda_j \end{cases}$   (6)

where $\lambda_i = Sim(R)^{*}([x_i]_Q)(x_i)$, $\lambda_j = Sim(R)^{*}([x_i]_Q)(x_j)$ and $[x]_Q \in U/Q$.
4. Compute $f_D(U, \mathbf{R}) = \bigwedge \{\bigvee (c_{ij}) : c_{ij} \neq \varnothing\}$.
5. Compute $g_D(U, \mathbf{R}) = (\bigwedge R_1) \vee \cdots \vee (\bigwedge R_l)$.
6. Output $Red_D(\mathbf{R}) = \{R_1, \ldots, R_l\}$.
7. Build the assessment rule table, delete duplicate evaluation rules, and extract the oil-paper insulation condition assessment rules.

3. Membership of Characteristic

In this paper, FCM is used to calculate the cluster centers and the memberships of the transformer test data. Let (U, P ∪ Q) be a fuzzy decision system with U = {x₁, x₂, …, x_n}; the fuzzy condition attributes P are divided into three categories with cluster centers V = {v₁, v₂, v₃}. The relationship between a sample and the cluster centers can be expressed by membership degrees. The membership functions are obtained by the algorithm, yielding the membership degree matrix μ:

$\mu = \begin{bmatrix} \mu_{11} & \cdots & \mu_{1j} & \cdots & \mu_{1n} \\ \mu_{21} & \cdots & \mu_{2j} & \cdots & \mu_{2n} \\ \mu_{31} & \cdots & \mu_{3j} & \cdots & \mu_{3n} \end{bmatrix}, \qquad j = 1, \ldots, n$   (7)

$\mu_{ij} = \dfrac{(1/\|x_j - v_i\|^2)^{1/(m-1)}}{\sum_{c=1}^{3} (1/\|x_j - v_c\|^2)^{1/(m-1)}}$   (8)

The iteration objective function is:

$\min J(\mu_{ij}, v_i) = \sum_{i=1}^{3} \sum_{j=1}^{n} (\mu_{ij})^m \|x_j - v_i\|^2$   (9)

The calculation formula of the cluster centers is:

$v_i = \dfrac{1}{\sum_{j=1}^{n} (\mu_{ij})^m} \sum_{j=1}^{n} (\mu_{ij})^m x_j, \qquad i = 1, 2, 3$   (10)
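A compact sketch of the FCM iteration of Eqs. (8)-(10) for one-dimensional data with three clusters and m = 2; the sample values are invented and are not the transformer data of Table 1.

def fcm_1d(samples, n_clusters=3, m=2.0, iterations=50):
    # Alternate the membership update (Eq. (8)) and the centre update (Eq. (10)).
    centers = [samples[round(k * (len(samples) - 1) / (n_clusters - 1))]
               for k in range(n_clusters)]                      # spread initial centres
    for _ in range(iterations):
        mu = [[0.0] * len(samples) for _ in range(n_clusters)]
        for j, x in enumerate(samples):
            d = [max((x - v) ** 2, 1e-12) for v in centers]     # squared distances
            denom = sum((1.0 / dc) ** (1.0 / (m - 1)) for dc in d)
            for i in range(n_clusters):
                mu[i][j] = (1.0 / d[i]) ** (1.0 / (m - 1)) / denom
        centers = [sum((mu[i][j] ** m) * x for j, x in enumerate(samples)) /
                   sum(mu[i][j] ** m for j in range(len(samples)))
                   for i in range(n_clusters)]
    return centers, mu

centers, mu = fcm_1d([0.2, 0.3, 0.25, 5.0, 5.2, 9.8, 10.1])
print([round(c, 2) for c in centers])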

4. Assessment of Oil Paper Insulation Based on Fuzzy Rough Sets


The test data and ageing information of the transformers are shown in Table 1. P1, P2, P3, P4 and P5 are the condition attributes corresponding to t_cdom, U_rmp, S_rmax, R_g and C_g, respectively, and Q is the fuzzy decision attribute of the oil-paper insulation. According to the relevant regulations for power equipment, the transformer insulation is divided into good (G) and bad (B). The characteristics are divided into 15 fuzzy attributes C_k (k = 1, 2, …, 15). The memberships of the fuzzy attributes C_k for the test data are obtained by the FCM algorithm and are listed in Table 2.
Table 1. Return voltage test sample data of transformers

Trafo tcdom/s Urmp/V Srmax Rg/GΩ Cg/nF State


x1 2518 183.5 31.20 12.26 92.17 G
x2 546.6 353.4 257.1 1.96 186.9 B
x3 1214 256.0 96.4 4.899 109.3 G
x4 667.4 385.5 293.2 1.440 190.1 B
x5 2415 175.0 32.11 13.35 70.36 G
x6 1226 248.2 87.66 4.026 106.8 G
x7 649.5 363.4 179.2 2.743 169.9 B
x8 3613 269.4 80.10 11.00 45.23 G
x9 333.7 32.60 74.02 2.830 64.38 B
x10 3540 223.4 23.70 3.682 99.51 B
x11 1265 236.1 44.50 1.537 235.3 B
x12 2655 218.5 67.72 11.77 80.40 G
x13 896.9 169.7 120.5 2.885 149.8 B
x14 1524 320.3 79.24 2.832 183.0 G
x15 3289 239.7 19.71 13.05 47.88 G
x16 700.6 83.45 46.40 2.339 125.7 G
x17 189.1 313.8 54.50 1.253 168.8 B
x18 2706 110.7 32.43 12.40 127.1 G

According to Eqs. (1) and (4), the most important attribute for the assessment is P4, followed by P3, P5, P1 and P2, and the reduced attribute set is calculated by the attribute reduction algorithm. The set is {C3, C4, C8, C9, C10, C12, C15}, from which the redundant attributes have been removed. The decision rules are listed in Table 3; the elements of the table are membership intervals. Taking three transformers that are not in the historical database as examples, their basic information is shown in Table 4. Following the insulation assessment process, the memberships of the transformers are obtained, and the results are shown in Table 5. The membership degrees of transformer T1 match rule 1; based on the assessment rules, the insulation of T1 is in good condition and does not need maintenance. The membership degrees of T2 match rule 6; according to the rules, the insulation of T2 is seriously aged and needs maintenance. The membership degrees of T3 match rule 9, so the insulation of T3 is seriously aged. By contrast, the method proposed in reference [4] judges T3 to be in good condition, which differs from the actual condition. All three diagnosis results of the proposed method agree with the actual condition, which verifies that the method based on fuzzy rough sets theory is effective and accurate.
Table 2. Membership function of partial fuzzy attributes

P1(10-2) P2 (10-2) P3(10-2) P4(10-2) P5(10-2)


T L M H L M H L M H L M H L M H Q
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15
1 5 13 81 8 87 4 99 0 0 0 0 99 37 59 2 G
2 99 0 0 0 0 99 0 0 99 98 1 0 0 1 97 B
3 1 98 0 3 2 13 2 96 0 8 90 1 1 98 0 G
4 95 3 0 1 5 93 1 1 97 91 8 0 0 0 98 B
5 7 20 72 14 80 5 99 0 0 0 1 97 97 2 0 G
6 1 98 0 2 89 8 0 99 0 0 99 0 3 96 0 G
7 97 2 0 0 1 98 16 38 45 88 11 0 4 18 76 B
8 3 6 90 4 66 28 1 98 0 2 3 94 92 6 1 G
9 94 4 0 94 3 1 8 90 0 82 17 0 99 0 0 B
10 2 5 92 0 99 0 96 4 0 7 91 0 14 83 2 B
11 0 99 0 1 96 2 95 4 0 92 7 0 4 8 86 B
12 2 6 91 0 99 0 24 75 0 0 0 99 79 18 1 G
13 55 43 1 18 75 5 13 81 4 78 21 0 9 57 32 B
14 5 92 2 0 6 93 2 97 0 82 17 0 1 3 94 G
15 1 1 97 1 94 3 94 5 0 0 0 98 93 5 1 G
16 92 6 0 98 1 0 92 7 0 99 0 0 3 94 2 G
17 89 9 1 1 10 88 72 26 0 84 11 1 5 20 74 B
18 1 4 93 85 12 2 99 0 0 0 0 99 3 93 3 G

Table 3. The rules of insulation assessment system

P1(H) P2(L) P3(M) P3(H) P4(L) P4(H) P5(H)


Rule
C3 C4 C8 C9 C10 C12 C15 Q

1 (0.5 , 1) (0 , 0.5) (0 , 1) (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) G


2 (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0.5 , 1) G

3 (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) G


4 (0 , 0.5) (0, 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) G
5 (0.5 , 1) (0 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) G
6 (0 , 0.5) (0, 0.5) (0 , 0.5) (0 , 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) B
7 (0 , 0.5) (0, 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) B
8 (0 , 0.5) (0.5 , 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 1) B
9 (0.5 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) B

Table 4. The basic information of power transformers

Trafo Model Years tcdom/s Urmp/V Srmax Rg/GΩ Cg/nF Furfural State
T1 SFSE-220 1 2314 230 27 15.74 90.35 0.06 Good
T2 SFP-220 14 1449 243 45 1.795 364.2 0.74 Bad
T3 cub-/220 22 3328 289 24 4.027 95.48 0.99 Bad

Table 5. The membership and assessment results of power transformers

Traf P1(H) P2(L) P3(M) P3(H) P4(L) P4(H) P5(H) Rule Result
T1 0.6684 0.0034 0.0123 0.0007 0.0001 0.9980 0.0236 1 G
T2 0.0052 0.0132 0.0418 0.0014 0.9883 0.0050 0.9932 6 B
T3 0.9787 0.0341 0.0245 0.0016 0.0007 0.0000 0.0170 9 B

5. Conclusion
To avoid the impact of a single characteristic on the correctness of the insulation condition assessment, fuzzy rough sets theory combined with RVM is proposed and used to assess the oil-paper insulation of transformers. The results demonstrate that the assessment system is effective and feasible, which provides a new idea for the assessment of transformer oil-paper insulation.

References

[1] T. K. Saha, Review of modern diagnostic techniques for assessing insulation condition in aged
transformers, IEEE Trans. Dielectr. Electr. Insul. 10(2003), 903-917.
[2] M. de Nigris, R. Passaglia, R. Berti, L. Bergonzi and R. Maggi, Application of modern techniques for the
condition assessment of power transformers, CIGRE Session 2004, , France, Paper A2-207, 2004.
[3] W. G. Chen, J. Du, Y. Ling, et al. Air-gap discharge process partition in oil-paper insulation based on
energy-wavelet moment feature analysis. Chinese Journal of Scientific Instrument, 34(2013):1062-1069.
[4] Y. Zou, J. D. Cai. Study on the relationship between polarization spectrum characteristic quantity and
insulation condition of oil-paper transformer. Chinese Journal of Scientific Instrument, 36(2015): 608-
614.
[5] R. J. Liao, H. G. Sun, Q. Yuan, et al. Analysis of oil-paper insulation aging characteristics using Return
voltage method. High Voltage Engineering, 37(2011): 136-142.
[6] J. D. Cai and Y. Huang. Study on Insulation Aging of Power Transformer Based on Gray Relational
Diagnostic Model. High Voltage Engineering, 41(2015): 3296- 3301.
[7] S. H. Gao, L. Dong, Y. Gao, et al. Mid-long term wind speed prediction based on rough set theory.
Proceedings of the CSEE, 32(2012): 32-37.
Fuzzy Systems and Data Mining II 87
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-87

Finite-Time Stabilization for T-S Fuzzy


Networked Systems with State and
Communication Delay
He-Jun YAO1 , Fu-Shun YUAN and Yue QIAO
School of Mathematics and Statistics, Anyang Normal University, 455000, Anyang,
Henan, China

Abstract. The finite-time stabilization problem for nonlinear networked systems is considered. The T-S approach is used to model the controlled nonlinear systems. By using the Lyapunov functional method, a sufficient condition for finite-time stabilization is given. Then, a state feedback fuzzy controller is designed to make the closed-loop networked control system finite-time stable. Finally, the proposed design method is applied to the temperature control system of a polymerization reactor.

Keywords. networked systems; fuzzy; delay

Introduction

Networked control systems (NCSs) are feedback control systems closed over a network. As is well known, NCSs have many advantages, for example ease of maintenance, low cost and greater flexibility. In recent years, a number of papers have been reported on the analysis and control of NCSs [2-4]. In order to design network-based control, Gao obtained a new delay-system approach using LMIs [5]. In [6], Walsh et al. considered the asymptotic stability of nonlinear NCSs. For NCSs with long communication delay, a network-based optimal controller was designed in [7]. Yue et al. considered the H∞ control problem of NCSs with uncertainty [8].
As a useful approach, fuzzy control is often used to design robust controllers for nonlinear systems. With the well-known T-S approach, many papers have been published on the stabilization and control problem of nonlinear delay systems [9-10]. In [11], taking the insertion of the network into account, a new two-step approach was introduced to guarantee system properties. For nonlinear NCSs, the input-to-state stability problem was considered in [12]. However, the results of the above papers focus only on the asymptotic stability of dynamic systems, and few papers have considered the finite-time stability of nonlinear NCSs. Therefore, the finite-time control problem of nonlinear NCSs is worth investigating, which motivates this paper.

1
Corresponding Author. He-Jun YAO, School of Mathematics and Statistics, Anyang Normal
University, 455000, Anyang, Henan, China; E-mail addresses: yaohejun@126.com.

In this paper, by using the LMI approach together with the Lyapunov functional method, we obtain a finite-time stability condition and a fuzzy controller design method.

1. Problem formulation

Consider the following plant shown in Figure 1 [13]:

Rule i: IF $z_1(t)$ is $M_1^i$, $z_2(t)$ is $M_2^i$, …, $z_n(t)$ is $M_n^i$
THEN $\dot{x}(t) = A_i x(t) + A_{di} x(t-d) + B_i u(t) + G_i \omega(t)$,   $x(t) = \phi(t)$, $t \in [-d, 0]$   (1)

where $z_1(t), z_2(t), \ldots, z_n(t)$ are the premise variables, $x(t) \in R^n$ is the system state vector, $u(t) \in R^m$ is the control input vector, $M_k^i$ ($i = 1, 2, \ldots, r$; $k = 1, 2, \ldots, n$) are fuzzy sets, $r$ is the number of IF-THEN rules, $n$ is the number of fuzzy sets, $A_i, A_{di}, B_i, G_i$ are known constant matrices, $d$ is the state delay, $\phi(t) \in R^n$ is the initial state on $[-d, 0]$, and $\omega(t) \in R^l$ is the exogenous disturbance, which satisfies

$\int_0^T \omega^T(t)\,\omega(t)\,dt \le \bar{d}, \qquad \bar{d} \ge 0$   (2)
Figure 1. The closed networked control systems (the sensor-controller delay $\tau_{sc}$ and the controller-actuator delay $\tau_{ca}$ act across the network medium between sensor, controller and actuator)


By using the T-S approach, without considering the communication delay, the networked system is described by [13]

$\dot{x}(t) = \sum_{i=1}^{r} \mu_i(z(t)) \left[ A_i x(t) + A_{di} x(t-d) + B_i u(t) + G_i \omega(t) \right]$,   $x(t) = \phi(t)$, $t \in [-d, 0]$   (3)

where $\mu_i(z(t))$ satisfies $\mu_i(z(t)) \ge 0$, $\sum_{i=1}^{r} \mu_i(z(t)) > 0$, $i = 1, 2, \ldots, r$.
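A small numeric sketch (with invented scalar data, not the example of Section 3) of how Eq. (3) blends the local linear models through the rule weights $\mu_i(z(t))$; normalization of the weights to a unit sum is assumed here as a common convention.

import numpy as np

A  = [np.array([[-1.0]]), np.array([[-2.0]])]   # invented local system matrices
Ad = [np.array([[0.1]]),  np.array([[0.2]])]    # invented delayed-state matrices
B  = [np.array([[1.0]]),  np.array([[0.5]])]    # invented input matrices

def memberships(z):
    # Invented premise memberships mu_1, mu_2, normalised to sum to one.
    w = np.array([np.exp(-z ** 2), np.exp(-(z - 1.0) ** 2)])
    return w / w.sum()

def ts_rhs(x, x_delayed, u, z):
    # Right-hand side of the T-S blended system, cf. Eq. (3) (disturbance omitted).
    mu = memberships(z)
    return sum(mu[i] * (A[i] @ x + Ad[i] @ x_delayed + B[i] * u) for i in range(2))

print(ts_rhs(np.array([[1.0]]), np.array([[0.8]]), 0.5, z=0.3))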

Assumption 1 [14]. The controller and actuator are event driven and the sensor is time driven; the sensor-to-controller delay is $\tau_{sc}$ and the controller-to-actuator delay is $\tau_{ca}$. Therefore the communication delay is $\tau = \tau_{sc} + \tau_{ca}$.
With the insertion of the network and taking the communication delay $\tau$ into account, the control system of Figure 1 becomes

$\dot{x}(t) = \sum_{i=1}^{r} \mu_i(z(t)) \left[ A_i x(t) + A_{di} x(t-d) + B_i u(t-\tau) + G_i \omega(t) \right]$,   $x(t) = \phi(t)$, $t \in [-d, 0]$   (4)

In this paper we design the following controller:

$u(t) = \sum_{i=1}^{r} \mu_i(z(t)) K_i x(t)$   (5)

Inserting the controller (5) into the networked system (4), we obtain the closed-loop system

$\dot{x}(t) = \sum_{i=1}^{r}\sum_{j=1}^{r} \mu_i(z(t)) \mu_j(z(t)) \left[ A_i x(t) + A_{di} x(t-d) + B_i K_j x(t-\tau) + G_i \omega(t) \right]$,   $x(t) = \psi(t)$, $t \in [-\bar{d}, 0]$   (6)

We suppose the initial state $x(t) = \psi(t)$ is a smooth function on $[-\bar{d}, 0]$, $\bar{d} = \max\{\tau, d\}$, so that $\|\psi(t)\| \le \bar{\psi}$, $t \in [-\bar{d}, 0]$, where $\bar{\psi}$ is a positive constant.
Definition 1 [15]. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the time-delay NCS (6) (with $\omega(t) \equiv 0$) is finite-time stable if

$x^T(0) R x(0) \le c_1 \;\Rightarrow\; x^T(t) R x(t) < c_2, \quad \forall t \in [0, T]$   (7)

Definition 2 [16]. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the time-delay NCS (6) is finite-time stabilizable by the state feedback controller if

$x^T(0) R x(0) \le c_1 \;\Rightarrow\; x^T(t) R x(t) < c_2, \quad \forall t \in [0, T]$   (8)

2. Main Results

Theorem 1. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the NCS (6) is finite-time stabilizable if there exist a scalar $\alpha \ge 0$, matrices $K_i \in R^{m \times n}$ and positive matrices $P, Q, T \in R^{n \times n}$, $S \in R^{l \times l}$ such that the following matrix inequalities hold:

$\begin{bmatrix} \Xi & PA_{di} & PB_i K_j & PG_i \\ * & -Q & 0 & 0 \\ * & * & -T & 0 \\ * & * & * & -\alpha S \end{bmatrix} < 0$   (9)

$\dfrac{c_1\left(\lambda_{\max}(\tilde{P}) + h\,\lambda_{\max}(\tilde{Q}) + \tau\,\lambda_{\max}(\tilde{T})\right) + \bar{d}\,\lambda_{\max}(S)\left(1 - e^{-\alpha T}\right)}{\lambda_{\min}(\tilde{P})} < c_2 e^{-\alpha T}$   (10)

where $\Xi = P A_i + A_i^T P + Q + T - \alpha P$, $\tilde{P} = R^{-1/2} P R^{-1/2}$, $\tilde{Q} = R^{-1/2} Q R^{-1/2}$, $\tilde{T} = R^{-1/2} T R^{-1/2}$, and $\lambda_{\max}(\cdot)$ and $\lambda_{\min}(\cdot)$ are the maximum and minimum eigenvalues.
Proof. For the positive matrices $P, Q, T$ of Theorem 1, choose the Lyapunov functional [13]

$V(x(t)) = x^T(t) P x(t) + \int_{t-h}^{t} x^T(\theta) Q x(\theta)\, d\theta + \int_{t-\tau}^{t} x^T(\theta) T x(\theta)\, d\theta$   (11)

The derivative of $V(x(t))$ along (6) is given by

$\dot{V}(x(t)) = \sum_{i=1}^{r}\sum_{j=1}^{r} \mu_i(z(t)) \mu_j(z(t))\, \xi^T(t) \begin{bmatrix} PA_i + A_i^T P + Q + T & PA_{di} & PB_i K_j & PG_i \\ * & -Q & 0 & 0 \\ * & * & -T & 0 \\ * & * & * & 0 \end{bmatrix} \xi(t)$

with $\xi(t) = \left[ x^T(t)\;\; x^T(t-d)\;\; x^T(t-\tau)\;\; \omega^T(t) \right]^T$. From condition (9), we have

    V̇(x(t)) − αx^T(t)Px(t) − αω^T(t)Sω(t) < 0,  i.e.  V̇(x(t)) < αV(x(t)) + αω^T(t)Sω(t)   (12)

Multiplying (12) by e^{−αt}, we obtain
    e^{−αt}V̇(x(t)) − αe^{−αt}V(x(t)) < αe^{−αt}ω^T(t)Sω(t).
Furthermore,
    d/dt ( e^{−αt}V(x(t)) ) < αe^{−αt}ω^T(t)Sω(t).
Integrating the above inequality from 0 to t, with t ∈ [0, T],
    e^{−αt}V(x(t)) − V(x(0)) < ∫_0^t αe^{−αθ} ω^T(θ)Sω(θ) dθ   (13)

Noting that α ≥ 0, P̃ = R^{−1/2}PR^{−1/2}, Q̃ = R^{−1/2}QR^{−1/2} and T̃ = R^{−1/2}TR^{−1/2}, we can obtain the following relation:
    x^T(t)Px(t) ≤ V(x(t)) < e^{αT} [c1(λmax(P̃) + hλmax(Q̃) + τλmax(T̃)) + dλmax(S)(1 − e^{−αt})]   (14)
On the other hand,
    x^T(t)Px(t) = x^T(t)R^{1/2}P̃R^{1/2}x(t) ≥ λmin(P̃) x^T(t)Rx(t)   (15)
Putting (14) and (15) together, we have
    x^T(t)Rx(t) < e^{αT} [c1(λmax(P̃) + hλmax(Q̃) + τλmax(T̃)) + dλmax(S)(1 − e^{−αT})] / λmin(P̃)   (16)
Condition (10) and inequality (16) imply
    x^T(t)Rx(t) < c2,  ∀t ∈ [0, T].
Theorem 2. For the given positive scalars c1, c2, T and positive definite matrix R, with the fuzzy controller (5), the NCS (6) is finite-time stabilizable if there exist scalars α ≥ 0 and λ_i > 0 (i = 1, 2, 3, 4), matrices K̄_j ∈ R^{m×n} and positive definite matrices X, Q̄, T̄ ∈ R^{n×n}, S ∈ R^{l×l} such that the following matrix inequalities hold:

    [ Θ    A_di X   B_i K̄_j   G_i ]
    [ *    −Q̄       0          0   ]  < 0   (17)
    [ *    *        −T̄         0   ]
    [ *    *        *          −αS ]

    λ1 R^{−1} < X < R^{−1}   (18)
    λ2 Q̄ < λ1 X   (19)
    λ3 T̄ < λ1 X   (20)
    0 < S < λ4 I   (21)

    [ d λ4(1 − e^{−αT}) − c2 e^{−αT}   √c1    √h     √τ   ]
    [ *                                 −λ1    0      0    ]  < 0   (22)
    [ *                                 *      −λ2    0    ]
    [ *                                 *      *      −λ3  ]

where
    Θ = A_i X + X A_i^T + Q̄ + T̄ − αX.
Proof. Pre- and post-multiplying inequality (9) by diag{P^{−1}, P^{−1}, P^{−1}, I}, inequality (9) is equivalent to

    [ Σ    A_di P^{−1}        B_i K_j P^{−1}    G_i ]
    [ *    −P^{−1} Q P^{−1}   0                 0   ]  < 0   (23)
    [ *    *                  −P^{−1} T P^{−1}  0   ]
    [ *    *                  *                 −αS ]

where Σ = A_i P^{−1} + P^{−1} A_i^T + P^{−1} Q P^{−1} + P^{−1} T P^{−1} − αP^{−1}.

By letting X = P^{−1}, K̄_j = K_j P^{−1}, Q̄ = P^{−1} Q P^{−1} and T̄ = P^{−1} T P^{−1}, inequality (23) is equivalent to inequality (17).
On the other hand, denote X̃ = R^{1/2} X R^{1/2}, Q̃ = R^{−1/2} Q R^{−1/2}, T̃ = R^{−1/2} T R^{−1/2}. For the positive definite matrix R we have λmax(X̃) = 1/λmin(P̃), and inequalities (18)–(21) imply that
    1 < λmin(P̃),  λmax(P̃) < 1/λ1,  λmax(Q̃) < (λ1/λ2) λmax(P̃),  λmax(T̃) < (λ1/λ3) λmax(P̃),  λmax(S) < λ4   (24)
By the Schur complement, inequality (22) is equivalent to
    d λ4 (1 − e^{−αT}) − c2 e^{−αT} + c1/λ1 + h/λ2 + τ/λ3 < 0   (25)
With (24), for condition (10) it follows that
    [c1(λmax(P̃) + hλmax(Q̃) + τλmax(T̃)) + dλmax(S)(1 − e^{−αT})] / λmin(P̃) < d λ4(1 − e^{−αT}) + c1/λ1 + h/λ2 + τ/λ3   (26)
Combining inequality (25) with (26), condition (10) is satisfied.

3. Numerical Example

The temperature control system of a polymerization reactor is an inertia link with time delay. The state-space model of the polymerization reactor is usually written as [6]
    ẋ1(t) = x2(t)
    ẋ2(t) = a1 x1(t) + a2 x2(t) + b u(t)
    y(t) = x1(t)
It is impossible to avoid external disturbances and time delay, so we consider the nonlinear delay system with norm-bounded uncertainties
    ẋ(t) = A_i x(t) + A_di x(t − d) + B_i u(t)
    x(t) = ψ(t),  −d ≤ t ≤ 0
where
    A1 = [−30 0; 0 −20],  A2 = [−3 12; 1 0],  Ad1 = [−2 0.5; 0.5 −2],  Ad2 = [−3 1; 0.1 −1],
    B1 = [1; −2],  B2 = [0; 1],  ψ(t) = [1; −1],  d = 0.2,  τ = 0.5.
Solving the LMIs (17), the gain matrices are obtained as
    K1 = K̄1 X^{−1} = [3.4529  1.6837],  K2 = K̄2 X^{−1} = [8.6183  −4.3602].
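As a rough illustration of how gains of this kind can be computed, the following sketch (Python with CVXPY and an SDP-capable solver) sets up a feasibility problem over the LMIs of Theorem 2, restricted to the single vertex i = j = 1 and a fixed α; the block structure follows inequality (17) as reconstructed above, and all names and numerical values are illustrative placeholders rather than the authors' code.

# Hedged sketch: feasibility of LMI (17) for one vertex (i = j = 1), fixed alpha.
import numpy as np
import cvxpy as cp

n, m, l = 2, 1, 1
A1  = np.array([[-30.0, 0.0], [0.0, -20.0]])    # plant matrices as read from the example
Ad1 = np.array([[-2.0, 0.5], [0.5, -2.0]])
B1  = np.array([[1.0], [-2.0]])
G1  = np.array([[1.0], [-1.0]])
alpha = 0.1                                     # fixed scalar; a line search over alpha is typical

X   = cp.Variable((n, n), symmetric=True)       # X = P^{-1}
Qb  = cp.Variable((n, n), symmetric=True)       # Q-bar
Tb  = cp.Variable((n, n), symmetric=True)       # T-bar
S   = cp.Variable((l, l), symmetric=True)
Kb1 = cp.Variable((m, n))                       # K-bar_1 = K_1 X

Theta = A1 @ X + X @ A1.T + Qb + Tb - alpha * X
Znn, Znl = np.zeros((n, n)), np.zeros((n, l))
LMI = cp.bmat([
    [Theta,        Ad1 @ X,  B1 @ Kb1, G1],
    [(Ad1 @ X).T,  -Qb,      Znn,      Znl],
    [(B1 @ Kb1).T, Znn,      -Tb,      Znl],
    [G1.T,         Znl.T,    Znl.T,    -alpha * S],
])
eps = 1e-6
cons = [LMI << -eps * np.eye(3 * n + l),
        X >> eps * np.eye(n), Qb >> 0, Tb >> 0, S >> eps * np.eye(l)]
prob = cp.Problem(cp.Minimize(0), cons)
prob.solve(solver=cp.SCS)
if prob.status in ("optimal", "optimal_inaccurate"):
    K1 = Kb1.value @ np.linalg.inv(X.value)     # recover the gain K_1 = K-bar_1 X^{-1}
    print("feasible, K1 =", K1)

In practice the same constraint is imposed for every rule pair (i, j), and inequalities (18)–(22) are added before solving.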
With the state feedback controller (5) of Theorem 2 and the initial condition ψ(t) = [2  −0.5]^T, the simulation results are shown in Figures 2 and 3.

Figure 2. State x1(t) of the closed-loop system.

Figure 3. State x2(t) of the closed-loop system.
From Figures 2 and 3, one can see that the closed-loop system is finite-time stable.

4. Conclusion

In this paper, by combining a Lyapunov approach with finite-time stability analysis, a finite-time stabilization condition for T–S fuzzy networked control systems is obtained. Based on this condition, a state-feedback fuzzy controller is designed by means of LMIs.

Acknowledgments

This work was supported by Anyang Normal University Innovation Foundation Project
under Grant ASCX/2016-Z113.

References

[1] Y. Xia, Y. Gao, Recent progress in networked control systems-a survey, International Journal of
Automation and Computing, 12(2015), 343-367.
[2] G. Chen, Q. Lin, Finite-time observer based cooperative tracking control of networked large range
systems, Abstract and Applied Analysis, 2014, Article ID 135690.
[3] B. Chen, W. Zhang, Distributed fusion estimation with missing measurements, random transmission
delays and packet dropouts. IEEE Transactions on Automatic Control, 59(2014), 1961-1967.
[4] J. Chen, H. Zhu, Finite-time H∞ filtering for a class of discrete-time Markovian jump systems with partly
unknown transition probabilities. International Journal of Adaptive Control and Signal Processing,
28(2014), 1024-1042.
[5] H. Gao, T. Chen, J. Lam, A new delay system approach to network-based control, Automatica, 44(2008),
39-52.
[6] G. C. Walsh, H. Ye, L G. Bushnell, Stability analysis of networked control systems, IEEE Trans on
Control Systems Technology, 10(2002), 438-446.
[7] S. Hu, Q. Zhu, Stochastic optimal control and analysis of stability of networked control systems with
long delay, Automatica, 39(2003),1877–1884.
[8] D. Yue, Q. L. Han, and J. Lam, Network-based robust H∞ control of a system with uncertainty,
Automatica, 4(2005), 999- 1007.
[9] Z. H. Guan, J. Huang, G. R. Chen, Stability Analysis of Networked Impulsive Control Systems, Proc. 25th
Chinese Control Conference, 2006, 2041-2044.
[10] Y. Tian, Z. Yu, Multifractal nature of network induced time delay in networked control systems,
Physics Letter A, 361(2007), 103-107.
[11] G. C. Walsh, O. Beldiman, L. G. Bushnell, Asymptotic behavior of nonlinear networked control
systems, IEEE Transactions on Automatic Control, 46(2001), 1093–1097.
[12] D. Nesic, Observer design for wired linear networked control systems using matrix inequalities,
Automatica, 44(2008), 2840-2848.
[13] S. He, H. Xu, Non-fragile finite-time filter design for time-delayed Markovian jumping systems via T-S
fuzzy model approach, Nonlinear Dynamic, 80(2015), 1159-1171.
[14] D. Huang, S. Kiong, State feedback control of uncertain networked control systems with random time
delays, IEEE Transactions on Automatic Control, 53(2008), 829-834.
[15] F. Amato, M. Ariola, P. Dorate, Finite-time stabilization via dynamic output feedback, Automatica,
42(2006), 337-342.
[16] F. Amato, M. Ariola, C, Cosentino, Finite-time control of discrete- time linear systems: Analysis and
design conditions, Automatica, 46(2010), 919-924.
94 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-94

A Trapezoidal Fuzzy Multiple Attribute Decision Making Based on Rough Sets
Zhi-Ying LV a,b,1, Ping HUANG b, Xian-Yong ZHANG c,d and Li-Wei ZHENG e
a College of Mathematics, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
b College of Management, Chengdu University of Information Technology, Chengdu, Sichuan, China
c College of Mathematics and Software Science, Sichuan Normal University, Chengdu, Sichuan, China
d Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, Sichuan, China
e College of Applied Mathematics, Chengdu University of Information Technology, Chengdu, Sichuan, China

Abstract. Fuzzy multiple attribute decision making (FMADM) is an efficient way to solve complex decision problems and has wide practical application. This paper studies FMADM with trapezoidal fuzzy numbers. In order to achieve desirable decision making, a similarity measure between two trapezoidal fuzzy numbers is defined, based on a new method for ranking fuzzy numbers. A new algorithm is proposed to remove surplus attributes; this algorithm is based on rough sets and the technique for order of preference by similarity to ideal solution (TOPSIS). Finally, an example is examined to demonstrate the model's use in practical problems.

Keywords. FMADM, trapezoidal fuzzy number, centroid points, attribute reduction, rough sets

Introduction

The main idea of multiple attribute decision making (MADM) is to rank the alternatives or choose the optimal solution. However, the available information is often imprecise or vague, and in this case a better solution is to use fuzzy numbers. Fuzzy theory [1] is able to address many decision problems that experts and decision makers struggle to answer because of a lack of information. Over the years, many theories and applications have been proposed for solving FMADM problems [2-3]. To deal with these fuzzy situations, experts are usually encouraged to use the trapezoidal fuzzy number, which subsumes the triangular fuzzy number and the interval number. At the same time, ranking fuzzy numbers [4-5] is very important in real-time decision-making applications. Therefore, there is a need for a procedure that can rank fuzzy numbers under more general conditions.

1
Corresponding Author: Zhi-Ying Lv, College of Mathematics, University of Electronic Science and
Technology of China, Chengdu 611731, China; College of Management, Chengdu University of Information
Technology; E-mail: lvZhiying1979@163.com.

Ref. [6] gives a way to rank trapezoidal fuzzy numbers based on the circumcenter of centroids. This is a very practical method, which can incorporate the importance of using the mode and spreads of fuzzy numbers.
Studies have found that correlations among the attributes seriously affect the scientific objectivity and fairness of the evaluation, so attribute reduction [7-8] is an essential subject in MADM. Usually, rough set theory is a useful tool for studying attribute reduction problems; this theory was initiated by Pawlak in 1982 [9]. However, few studies have been conducted on the problem of attribute reduction in fuzzy decision making.
In this paper, a new FMADM method is presented, in which the distance between two trapezoidal fuzzy numbers is defined and a fuzzy-number attribute reduction method based on the TOPSIS method and rough sets [10] is proposed.

1. Preliminaries

In this section, we give the concepts of rough sets and trapezoidal fuzzy numbers and
their extensions.

1.1. Pawlak Rough Sets

An approximation space apr = (U, R) is defined by a universe U and a relation R, where U is a finite set and R is an equivalence relation on U; the equivalence class containing x is denoted [x]_R.
Let S = (U, C, V, f) be an information system, where C is the set of attributes and V is the domain of attribute values, V = ∪_{c∈C} V_c, where V_c is a nonempty set of values of attribute c ∈ C, called the domain of c. Here f : U × C → V is an information function that maps an object in U to exactly one value in V_c such that for all c ∈ C and x ∈ U, f(x, c) ∈ V_c.
For B ⊆ C, denote [x]_{R_B} = {y ∈ U | (f(x, b), f(y, b)) ∈ R, ∀b ∈ B} and R_B = {[x]_{R_B} : x ∈ U}, that is, R_B is the set of equivalence classes. A subset B ⊆ C induces lower and upper approximations of X ⊆ U, defined as
    apr(X) = {x ∈ U | [x]_{R_B} ⊆ X}  and  apr̄(X) = {x ∈ U | [x]_{R_B} ∩ X ≠ ∅},
and the approximation quality is r_B^U = |apr(U)| / |U|.
Because r_C^U = 1, if there exists c_k ∈ C such that r_{C−{c_k}}^U = 1, then c_k is a dispensable (superfluous) attribute and C − {c_k} is a reduction of C; otherwise c_k is indispensable. The set of all indispensable attributes is the core of C, denoted core(C).

1.2. Trapezoidal Fuzzy Number

Below, we briefly review the definition of the trapezoidal fuzzy number and the ranking method.
Definition 1. The membership function of a trapezoidal fuzzy number P̃ = (a, b, c, d; ω) is given by:

    μ_P̃(t) = ω(t − a)/(b − a),  a ≤ t ≤ b
             ω,                  b ≤ t ≤ c
             ω(d − t)/(d − c),   c ≤ t ≤ d
             0,                  otherwise
where −∞ < a ≤ b ≤ c ≤ d < +∞ and 0 < ω ≤ 1. If ω = 1, then P̃ is normalized and can be denoted by P̃ = (a, b, c, d), which is shown in Figure 1.

Figure 1. Trapezoidal fuzzy number


We may see a trapezoidal fuzzy number as a trapezoid, which can be divided into
three plane figures. These figures are two triangles (APB and CQD) and a rectangle
(BPQC). Suppose G1, G2 , G3 are the centroids of these figures, which can form a new
triangle ( G1, G2 , G3 ).
Now, we give the definition of the circumcenter of the trapezoidal fuzzy number.
Definition 2 [6]. Let p̃ = (a, b, c, d; ω) be a generalized trapezoidal fuzzy number. The circumcenter S_p̃ = (x̃0, ỹ0) of the triangle (G1 G2 G3) is defined as:
    S_p̃(x̃0, ỹ0) = ( (a + 2b + 2c + d)/6 , ((2a + b − 3c)(2d + c − 3b) + 5ω²) / (12ω) )   (1)
Definition 3 [6]. Based on the circumcenter of centroids S_p̃(x̃0, ỹ0), the ranking function of the fuzzy number p̃ is defined as:
    R(p̃) = x̃0 · ỹ0   (2)
This represents the area of the rectangle formed by S_p̃(x̃0, ỹ0) and the origin. As the value of R(p̃) increases, so does the fuzzy number p̃. We can define the distance between two normalized trapezoidal fuzzy numbers according to the distance between their circumcenters of centroids, because these points can be considered better balancing points for the trapezoidal fuzzy numbers.
Definition 4. Let P̃1 = (a1, b1, c1, d1) and P̃2 = (a2, b2, c2, d2) be two normalized trapezoidal fuzzy numbers, and let S_P̃1 = (x̃0^1, ỹ0^1) and S_P̃2 = (x̃0^2, ỹ0^2) be the circumcenters of the centroids of P̃1 and P̃2, respectively. Then the distance between P̃1 and P̃2 is defined by
    d(P̃1, P̃2) = sqrt( (x̃0^1 − x̃0^2)² + (ỹ0^1 − ỹ0^2)² )   (3)
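To make Definitions 2–4 concrete, a small numerical sketch follows (Python; the function names are ours, not from the paper); the sample value reproduces the first entry of the matrices computed in Section 3.

# Hedged sketch of Eqs. (1)-(3): circumcenter of centroids, ranking value, distance.
from math import sqrt

def circumcenter(a, b, c, d, w=1.0):
    """Circumcenter S_p = (x0, y0) of the trapezoidal fuzzy number (a, b, c, d; w), Eq. (1)."""
    x0 = (a + 2 * b + 2 * c + d) / 6.0
    y0 = ((2 * a + b - 3 * c) * (2 * d + c - 3 * b) + 5 * w ** 2) / (12.0 * w)
    return x0, y0

def rank_value(p):
    """Ranking function R(p) = x0 * y0, Eq. (2)."""
    x0, y0 = circumcenter(*p)
    return x0 * y0

def distance(p1, p2):
    """Distance between two normalized trapezoidal fuzzy numbers, Eq. (3)."""
    x1, y1 = circumcenter(*p1)
    x2, y2 = circumcenter(*p2)
    return sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)

# First entry of the decision matrix in Section 3: (0.7, 0.72, 0.75, 0.8)
print(circumcenter(0.7, 0.72, 0.75, 0.8))   # approx (0.7400, 0.4146)
print(rank_value((0.7, 0.72, 0.75, 0.8)))   # approx 0.3068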

2. Fuzzy Multiple Attribute Decision Method Based on Rough Sets

2.1. Problem Description

Suppose X = {x1, x2, ..., xn} is an alternative set. Alternatives are assessed on m attributes. Denote the set of all attributes by C = {c1, c2, ..., cm}. Assume the weight vector of the attributes is ω = (ω1, ω2, ..., ωm)^T, such that Σ_{j=1}^{m} ωj = 1, where ωj ≥ 0 and ωj denotes the weight of attribute cj. Suppose P̃ = (p̃ij)_{n×m} is the trapezoidal fuzzy decision matrix given by the expert, where p̃ij = (aij, bij, cij, dij) is the attribute value of the alternative xi with respect to the attribute cj ∈ C.

2.2. Decision Method

Given the fuzzy and rough theories described above, the proposed FMADM procedure is defined as follows:
Step 1. Construct the circumcenter-of-centroids matrix O = ((xij, yij)) of P̃.
Step 2. Construct the value matrix Q = (qij) of P̃.
Step 3. Determine the positive ideal and negative ideal solutions:
    p̃j+ = {p̃ij : i ∈ N, qij = max_{i∈N} qij}  and  p̃j− = {p̃ij : i ∈ N, qij = min_{i∈N} qij}   (4)
Then
    A^P = {p1+, p2+, ..., pm+}  and  A^N = {p1−, p2−, ..., pm−}   (5)
Step 4. The distances between p̃ij and the positive and negative ideal values are defined as:
    dij+ = d(p̃ij, p̃j+) = sqrt[(xj+ − xij)² + (yj+ − yij)²]  and  dij− = d(p̃ij, p̃j−) = sqrt[(xj− − xij)² + (yj− − yij)²]   (6)
where (xj+, yj+) and (xj−, yj−) are the circumcenters of the centroids of p̃j+ and p̃j−, respectively. Then calculate the similarity degree tij between p̃ij and the ideal solution and construct the matrix T = (tij)_{n×m}, where
    tij = dij− / (dij+ + dij−)   (7)
Step 5. Construct a judgment matrix M = (mij)_{n×m} from T = (tij)_{n×m}, where
    mij = 0 if 0 ≤ tij < 0.3;  mij = 1 if 0.3 ≤ tij < 0.6;  mij = 2 if 0.6 ≤ tij ≤ 1   (8)
Step 6. Let S = (U, C, V, f) be an information system, and construct the equivalence relation R_B for B ⊆ C. For xi ∈ U, [xi]_{R_B} = {xk : mkj = mij, ∀cj ∈ B}, so R_B = {[xi]_{R_B} : i ∈ N}. The lower approximation of U with respect to B is defined by apr(U) = {xi : [xi]_{R_B} ⊆ [xi]_{R_C}, i ∈ N}, and the approximation quality is r_B^U = |apr(U)| / |U|. Because r_C^U = 1, if there exists ck ∈ C such that r_{C−{ck}}^U = 1, then ck is dispensable and C − {ck} is a reduction of C.
Step 7. Give the weight vector ω = (ω1, ω2, ..., ωt) of the set of all non-superfluous attributes, then calculate the values of all alternatives:
    di = Σ_{j=1}^{t} ωj tij,  i = 1, 2, ..., n   (9)
Then choose the best alternative based on the ranking of the values di.
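A compact sketch of Steps 5–7 (Python; the function names are ours, the thresholds follow Eq. (8)) under the assumption that the similarity matrix T of Step 4 has already been computed:

# Hedged sketch of Steps 5-7: judgment matrix, dispensable-attribute test, weighted scores.
import numpy as np

def judgment(T):
    """Map similarity degrees to {0, 1, 2} per Eq. (8)."""
    M = np.zeros(T.shape, dtype=int)
    M[T >= 0.3] = 1
    M[T >= 0.6] = 2
    return M

def classes(M, cols):
    """Equivalence classes of alternatives that agree on every attribute in `cols`."""
    keys = {}
    for i, row in enumerate(M[:, cols]):
        keys.setdefault(tuple(row), []).append(i)
    return list(keys.values())

def dispensable(M, k):
    """Attribute k is dispensable if dropping it does not merge any class of R_C (quality 1)."""
    full = classes(M, list(range(M.shape[1])))
    reduced = classes(M, [j for j in range(M.shape[1]) if j != k])
    return all(any(set(c) <= set(f) for f in full) for c in reduced)

def scores(T, keep, w):
    """Weighted values d_i over the retained attributes, Eq. (9)."""
    return T[:, keep] @ np.asarray(w)

# Usage sketch: M = judgment(T); keep = [k for k in range(M.shape[1]) if not dispensable(M, k)]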

3. An Application Analysis of the Proposed Method

In this section, we present an example to show how the given model works in practice. The example is a fuzzy multiple attribute decision with trapezoidal fuzzy numbers, in which a company makes an investment decision. Consider an investment company that wants to make the best investment decision for a given sum of money.
There is a panel of eight possible alternatives U = {x1, x2, ..., x8} in which the company can invest. Each alternative is assessed on six attributes C = {c1, c2, ..., c6}. The decision makers compare these eight companies with respect to the attributes and then construct the decision matrix P̃ = (p̃ij)_{8×6}, which is shown below (one row per alternative, one trapezoidal fuzzy number per attribute):
(0.7,0.72,0.75,0.8)   (0.4,0.45,0.6,0.63)   (0.7,0.72,0.82,0.9)   (0.5,0.5,0.64,0.72)   (0.18,0.19,0.2,0.21)   (0.09,0.1,0.14,0.17)
(0.54,0.57,0.59,0.6)  (0.5,0.52,0.6,0.63)   (0.5,0.62,0.62,0.7)   (0.5,0.5,0.54,0.6)    (0.18,0.19,0.2,0.21)   (0.09,0.09,0.098,0.1)
(0.7,0.73,0.78,0.79)  (0.5,0.52,0.6,0.63)   (0.6,0.72,0.8,0.9)    (0.8,0.85,0.9,0.92)   (0.21,0.23,0.25,0.27)  (0.1,0.1,0.15,0.2)
(0.6,0.63,0.66,0.73)  (0.4,0.45,0.6,0.63)   (0.7,0.72,0.86,0.9)   (0.44,0.5,0.66,0.7)   (0.17,0.18,0.18,0.19)  (0.09,0.12,0.15,0.18)
(0.72,0.75,0.77,0.8)  (0.7,0.73,0.81,0.83)  (0.7,0.72,0.8,0.83)   (0.7,0.7,0.74,0.8)    (0.19,0.21,0.24,0.26)  (0.1,0.16,0.18,0.22)
(0.54,0.57,0.59,0.6)  (0.4,0.46,0.5,0.56)   (0.7,0.75,0.8,0.92)   (0.4,0.5,0.54,0.62)   (0.18,0.19,0.2,0.21)   (0.1,0.12,0.13,0.13)
(0.6,0.63,0.69,0.71)  (0.5,0.52,0.7,0.74)   (0.41,0.45,0.5,0.51)  (0.44,0.5,0.66,0.7)   (0.18,0.19,0.21,0.23)  (0.12,0.18,0.21,0.22)
(0.72,0.75,0.77,0.8)  (0.5,0.52,0.6,0.63)   (0.71,0.72,0.86,0.9)  (0.7,0.7,0.74,0.8)    (0.19,0.21,0.24,0.26)  (0.1,0.16,0.18,0.22)
Step 1. Using Eq. (1), construct the circumcenter-of-centroids matrix O = ((xij, yij)):
(0.7400,0.4146)  (0.5217,0.3933)  (0.7800,0.4036)  (0.5833,0.3964)  (0.2267,0.4161)  (0.1233,0.4146)
(0.5767,0.4159)  (0.5617,0.4907)  (0.6133,0.4135)  (0.5300,0.4143)  (0.1950,0.4165)  (0.0943,0.4166)
(0.7517,0.4137)  (0.5617,0.4907)  (0.7567,0.3991)  (0.8700,0.4127)  (0.2400,0.4158)  (0.1333,0.4135)
(0.6517,0.4138)  (0.5217,0.3933)  (0.7933,0.3975)  (0.5767,0.3887)  (0.1800,0.4166)  (0.1350,0.4148)
(0.7600,0.4155)  (0.7683,0.4097)  (0.7617,0.4097)  (0.7300,0.4143)  (0.2250,0.4153)  (0.1667,0.4146)
(0.5767,0.4159)  (0.4800,0.4119)  (0.7867,0.4085)  (0.5167,0.4092)  (0.1950,0.4165)  (0.1217,0.4165)
(0.6583,0.4123)  (0.6133,0.3867)  (0.4700,0.4134)  (0.5767,0.3887)  (0.2017,0.4160)  (0.1867,0.4147)
(0.7600,0.4155)  (0.5617,0.4097)  (0.7950,0.3983)  (0.7300,0.4143)  (0.2250,0.4153)  (0.1667,0.4146)
Step 2. Based on Eq. (2), construct the value matrix Q = (qij)_{8×6} of P̃ as follows:
0.3068  0.2052  0.3148  0.2312  0.0943  0.0511
0.2398  0.2756  0.2536  0.2196  0.0812  0.0393
0.3110  0.2756  0.3020  0.3590  0.0999  0.0551
0.2697  0.2052  0.3153  0.2242  0.0750  0.0560
0.3158  0.3148  0.3121  0.3024  0.0934  0.0691
0.2398  0.1977  0.3214  0.2114  0.0812  0.0507
0.2714  0.2372  0.1943  0.2242  0.0839  0.0774
0.3158  0.2301  0.3166  0.3024  0.0934  0.0691
Step 3. Based on Eqs. (4)-(5), determine the positive ideal and negative ideal solutions:
A^P = {(0.72,0.75,0.77,0.8), (0.7,0.73,0.81,0.83), (0.71,0.72,0.86,0.9), (0.8,0.85,0.9,0.92), (0.21,0.23,0.25,0.27), (0.12,0.18,0.21,0.22)}
A^N = {(0.54,0.57,0.59,0.6), (0.4,0.46,0.5,0.56), (0.41,0.45,0.5,0.51), (0.4,0.5,0.54,0.62), (0.17,0.18,0.18,0.19), (0.09,0.09,0.098,0.1)}
Step 4. Construct the similarity degree matrix T = (tij)_{8×6} based on Eqs. (6)-(7) as follows:
0.8908  0.1559  0.9512  0.1910  0.7783  0.3144
0       0.2835  0.4401  0.0402  0.2500  0
0.9537  0.2835  0.8823  1       1       0.4228
0.4092  0.1559  0.9942  0.1773  0       0.4407
1       1       0.8923  0.6038  0.7500  0.7836
0       0       0.9601  0       0.2500  0.2965
0.4453  0.4640  0       0.1773  0.3618  1
1       0.2835  1       0.6038  0.7500  0.7836
Step 5. Based on Eq. (8), construct the judgment matrix M = (mij)_{8×6}:
2  0  2  0  2  1
1  0  1  0  0  0
2  0  2  2  2  1
1  0  2  0  0  0
2  2  2  2  2  2
0  0  2  0  0  0
1  1  0  0  1  2
2  0  2  2  2  2
Step 6. Compute the equivalence classes R_B, where B ⊆ C:
R_{C−{c1}} = {{x1}, {x2}, {x3}, {x4, x6}, {x5}, {x7}, {x8}},
R_{C−{c2}} = {{x1}, {x2}, {x3}, {x4}, {x5, x8}, {x6}, {x7}},
R_{C−{c3}} = {{x1}, {x2, x4}, {x3}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c4}} = {{x1, x3}, {x2}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c5}} = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c6}} = {{x1}, {x2}, {x3, x8}, {x4}, {x5}, {x6}, {x7}},
R_{C−{c1,c5}} = {{x1}, {x2}, {x3}, {x4, x6}, {x5}, {x7}, {x8}},
R_{C−{c4,c5}} = {{x1, x3}, {x2}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c3,c5}} = {{x1}, {x2, x4}, {x3}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c2,c5}} = {{x1}, {x2}, {x3}, {x4}, {x5, x8}, {x6}, {x7}},
R_{C−{c5,c6}} = {{x1}, {x2}, {x3, x8}, {x4}, {x5}, {x6}, {x7}},
R_C = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}}.
Thus r_{C−{c5}}^U = 1; therefore c5 is dispensable and {c1, c2, c3, c4, c6} is a reduction of C, so core(C) = {c1, c2, c3, c4, c6}. We can therefore delete the fifth column of the matrix T.
Step 7. Let ω = (0.18, 0.24, 0.16, 0.23, 0.19) be the weight vector of {c1, c2, c3, c4, c6}. Then, using Eq. (9), the values of the alternatives are:
d1 = 0.0511, d2 = 0.0393, d3 = 0.0551, d4 = 0.0560, d5 = 0.0691, d6 = 0.0507, d7 = 0.0774, d8 = 0.0691.
Therefore, we can conclude that the most desirable alternative is x7 .

4. Conclusion

In this article, a new fuzzy multiple attribute decision making method is proposed, in which the attribute values are trapezoidal fuzzy numbers. An attribute reduction method is proposed based on the distance defined between two trapezoidal fuzzy numbers and on rough sets, which can improve the accuracy of the evaluation. In future research, the decision model presented in this paper will be extended to interval type-2 fuzzy values based on Ref. [10].

Acknowledgment

This paper is supported by the National Natural Science Foundation of China (No. 61673285, No. 61203285, No. 41601141); the Provincial Soft Science Project of Sichuan (2016ZR0095); the Soft Science Project of the Chengdu Science and Technology Bureau (2015-RK00-00241-ZF); the high-level research team of the major projects division of Sichuan Province (Sichuan letter [2015] No. 17-5); and the Project of Chengdu University of Information Technology (No. CRF201508, CRF201615).

References

[1] L.A. Zadeh, Fuzzy sets. Information and Control, 8(3)(1965):338-353.


[2] Z.Y. Lv, X. N. Liang, X. Z. Liang, L. W. Zheng, A fuzzy multiple attribute decision making method
based on possibility degree, 2015 12th International Conference on Fuzzy Systems and Knowledge
Discovery, January 13, 2016, 450-454.
[3] D.F. Li, Multiple attribute decision making method using extended linguistic variables. International
Journal Uncertain Fuzziness Known Based System, 17(2009): 793-806.
[4] G. Facchinetti, R.G. Ricci and S. Muzzioli, Note on fuzzy ranking Triangular numbers. International
Journal of Intelligent Systems, 13(1998):613-622.
[5] Z.S. Xu and Q.L. Da, Possibility degree method for ranking internal numbers and its applications.
Journal of Systems and Engineering, 18(1)(2003):67-70.

[6] P.B. Rao and N.R. Shanker, Ranking fuzzy numbers with an area method using circumcenter of centroids.
Fuzzy Information and Engineering, 1( 2013): 3-18
[7] Z.Y. Lv, T. M. Huang and F.X. Jin. Fuzzy multiple attribute lattice decision making method based on the
elimination of redundant similarity index. Mathematics in Practice and Theory, 43(10)(2013):173-
181
[8] X.Y. Zhang and D.Q. Miao, Quantitative/qualitative region-change uncertainty/certainty in attribute
reduction, Information Sciences, 334-335(2016):174--204.
[9] Z. Pawlak, Rough sets. International Journal of Computer and Information Sciences, 11(1982):341-356.
[10] L. Dymova, P. Sevastjanov and A. Tikhonenko, An interval type-2 fuzzy extension of the TOPSIS
method using alphacuts. Knowledge-based Systems, 83(2015):116-127.
102 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-102

Fuzzy Rule-Based Stock Ranking Using


Price Momentum and Market
Capitalization
Ratchata PEACHAVANISH1
Department of Computer Science, Thammasat University, Pathum Thani, Thailand

Abstract. Stock market investing is an inherently risky and imprecise activity,


requiring complex decision making under uncertainty. This paper proposes a
method that applies fuzzy rule-based inference to rank stocks based on price
momentum and market capitalization. Experiments performed on Thai stock
market data showed that high-momentum stocks significantly outperformed the
market index benchmark, and that stocks of companies with small market
capitalization performed better than larger ones. Fuzzy rule-based inference was
applied to combine both the momentum factor and the market capitalization factor,
with different sets of rules for different prevailing market conditions. The result
produced a higher investment return than using either momentum or market
capitalization alone.

Keywords. fuzzy, stock, finance, technical analysis, momentum.

Introduction

Stock market investing is a high-risk activity with a potentially high reward, requiring
complex decision making based on imprecise and incomplete information under
uncertainty. Typically, two analytical approaches are utilized in investment decision
making: fundamental analysis and technical analysis. Decisions based on fundamental
analysis primarily consider the business entity represented by a stock. The information
under consideration includes the nature of the business, its profitability, its
competitiveness, and most importantly its financial standing through detailed study of
its financial statements. For technical analysis, a stock is treated separately from the
business entity. Only stock price movements and patterns generated by them are used
in making trading decisions. Technical analysis views price movements as being
governed by supply and demand of market participants and aims to exploit them.
This paper proposes a technical analysis-based method that applies fuzzy rule-
based inference on stock price momentum and market capitalization (company size),
with different sets of rules for different prevailing market conditions. The method was
tested on the Stock Exchange of Thailand.

1
Corresponding Author: Ratchata PEACHAVANISH, Department of Computer Science, Thammasat
University, Pathum Thani, Thailand; E-mail: rp@cs.tu.ac.th.

1. Related Works

There is a large and diverse body of research literature on computerized stock market
investing. Techniques in soft computing, fuzzy logic, machine learning, and traditional
data mining have been applied to address various aspects of stock trading, utilizing
both fundamental analysis and technical analysis. Support vector machine and genetic
algorithm were applied on business financial data to perform stock selection that can
outperform market benchmark [1, 2]. Fuzzy logic was applied on stock price
movements to market time stock trades [3], to create a new technical indicator that
incorporated investor risk tendency [4], and to assist in portfolio management [5, 6].
Machine learning experiments on technical analysis-based trading conducted by [7] did
not outperform the market benchmark when using transaction costs. In addition, using
sentiment data obtained from social networks to assist in stock market investing has
also been attempted [8, 9]. A recent comprehensive review of works using evolutionary
computing methods can be found in [10].
Stock markets in different regions have different rules and characteristics. Highly-
developed and efficient markets, such as the New York Stock Exchange, differ greatly
from emerging markets like the Stock Exchange of Thailand. In smaller markets,
extreme price movements are more common as a few well-funded participants can
dictate the market direction in the short term and affect market volatility. This is
especially true for market participants that are classified as foreign fund flows [11].
Lack of regulation and enforcement against insider trading in emerging markets like Thailand also makes the market inefficient and unfair [12]. These differences make
comparisons among research studies difficult. A working strategy under one market
environment may not be effective in another. Nevertheless, the industry-standard way
of judging an investment strategy is to compare the investment return against the
market index benchmark. Most mutual funds, in the long term, failed to outperform the
market [13]. The method proposed in this paper provides a superior investment return
to the market index. It is described in the next section.

2. Method

The strategy proposed in this paper is based on a key technical analysis principle: price
moves in trend and has momentum. This momentum effect, which implies that stock
price tends to continue on its current direction due to inertia, has been observed in
stock markets [14, 15]. Price reversal then occurs after the momentum weakens.
According to this principle, buying stocks with strong upward momentum is likely to
give superior result to buying stocks with weaker or downward momentum. The
strategy is then to make trading decisions based on a technical indicator that reflects
stock price momentum, which by definition is computed from past price series. This
reactive approach therefore makes no attempt to explicitly forecast future prices, but rather takes actions based on past price behavior.
Additionally, past evidence suggested that company’s market capitalization, or its
size, also determines the characteristic of its stock return [16]. In general, stocks of
small companies (the so called “small caps” stocks) tend to be far more volatile than
those of large, established companies (“big caps”). This is simply due to the tendency
for small companies to grow faster, albeit with higher risk. During a bull market, small-

cap stocks as a group far outperform big-cap stocks. On the other hand, investors prefer
the relative safety of big-cap stocks during an economic downturn or a bear market.
To see how trading using momentum and market capitalization can provide
addition returns above the market index, experiments were performed on the Thai
stocks spanning January 2012 to July 2016. The pool of stocks for the experiments
comprised all constituents of the Stock Exchange of Thailand’s SET100 index. These
stocks are the 100 largest and most liquid stocks in the market (SET100 members are
updated semiannually). These relatively large stocks are considered investment grade
and are least susceptible to manipulations. The daily closing price data of the stocks
were obtained from the SETSMART system [17]. The experiments were conducted
using a custom-written software implemented in the C# language and Microsoft SQL
Server.
The momentum indicator used in the experiment was the Relative Strength Index
(RSI) [18], a standard technical indicator widely-used by stock traders for measuring
the strength of stock price movements. The RSI is a bounded oscillating indicator
calculated using past n-period closing price data series (1).

    RSI_i = 100 − 100 / (1 + U_i / D_i)
    U_i = (U_{i−1} · (n − 1) + u_i) / n,   D_i = (D_{i−1} · (n − 1) + d_i) / n      (1)
    u_i = Close_i − Close_{i−1} if Close_i > Close_{i−1}, and u_i = 0 if Close_i ≤ Close_{i−1}
    d_i = Close_{i−1} − Close_i if Close_{i−1} > Close_i, and d_i = 0 if Close_{i−1} ≤ Close_i

The RSI is effectively a ratio of average gain to average loss during a given past n
consecutive trading periods. An RSI value is bounded between 0 and 100 where a value
higher than 50 indicates an upward momentum and a value lower than 50 indicates a
downward momentum. An extreme value on either end indicates an overbought or an
oversold condition, often used by traders to identify point of price reversal. For this
experiment, the 60-day RSI was chosen.
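As a concrete illustration of Eq. (1), here is a short sketch (Python; the function name and the seeding of the averages are ours, not from the paper) of the n-period RSI with Wilder-style smoothing:

# Hedged sketch of the n-period RSI of Eq. (1) with Wilder's smoothing.
def rsi(closes, n=60):
    """Return the RSI value after the last element of `closes` (needs len(closes) > n)."""
    gains = [max(closes[i] - closes[i - 1], 0.0) for i in range(1, len(closes))]
    losses = [max(closes[i - 1] - closes[i], 0.0) for i in range(1, len(closes))]
    # Seed the averages with the simple mean of the first n changes.
    avg_gain = sum(gains[:n]) / n
    avg_loss = sum(losses[:n]) / n
    # Wilder smoothing: U_i = (U_{i-1} * (n - 1) + u_i) / n, and likewise for D_i.
    for g, l in zip(gains[n:], losses[n:]):
        avg_gain = (avg_gain * (n - 1) + g) / n
        avg_loss = (avg_loss * (n - 1) + l) / n
    if avg_loss == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)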
For trading, the portfolio was given 100 million Thai Baht of cash for the initial
stock purchase. The algorithm selected a quartile of 25 stocks from the pool of 100
stocks ranked by 60-day RSI. They were then purchased on an equal weight basis using
all available cash and held on to for 20 trading days (one month). The process was then
repeated – the algorithm chose a new group of stocks and the portfolio was readjusted
to hold on only to them. Trading commission fees at retail rate were incorporated into
the experiments.
Similarly, the same 100 stocks, this time ranked by market capitalization, were
divided into four quartiles for the algorithm to choose from. However, since the weight
distribution of stocks in the market was nonlinear, each of the four quartiles contained
different numbers of stocks: the first quartile comprised the 4 largest stocks in the
market, the second quartile comprised the next 8 largest stocks, the third quartile
comprised the next 16 largest stocks, and the last quartile comprised the remaining 72
stocks. In other words, every quartile weighted approximately the same when the
market capitalizations of its component stocks are summed.
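The monthly rebalancing procedure can be sketched as follows (Python; the data layout, commission level and function names are placeholders rather than the authors' C#/SETSMART implementation, and rsi() is the sketch given above):

# Hedged sketch of the monthly momentum backtest: rank by 60-day RSI, hold the top quartile 20 days.
def backtest(prices, dates, cash=100e6, hold=20, top=25, fee=0.0016):
    """`prices`: dict symbol -> list of closes aligned with `dates` (hypothetical layout)."""
    value = cash
    t = 80                                  # need enough history for the 60-day RSI
    while t + hold < len(dates):
        ranked = sorted(prices, key=lambda s: rsi(prices[s][:t + 1], 60), reverse=True)
        picks = ranked[:top]                # top quartile of the 100-stock pool
        budget = value * (1 - fee) / top    # equal-weight purchase, commission deducted
        shares = {s: budget / prices[s][t] for s in picks}
        t += hold                           # hold for 20 trading days, then liquidate
        value = sum(shares[s] * prices[s][t] for s in picks) * (1 - fee)
    return value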

The results of the experiments are shown in Table 1. Monthly trading based on 60-
day RSI momentum indicator significantly outperformed the market index. Small-cap
stocks outperformed big-cap stocks.

Table 1. Portfolio returns based on monthly trading using momentum and market capitalization, compared to
the return of the SET100 market index benchmark.
Group By Momentum By Market Capitalization
First Quartile 126.61 % 9.40 %
Second Quartile 68.82 % 29.82 %
Third Quartile 32.12 % 76.96 %
Fourth Quartile -5.29 % 65.31 %
SET100 40.40 % 40.40 %

Experiments using momentum and market capitalization have provided the basis for
stock selection: buy small-cap stocks with high momentum. However, this strategy
does not work during market downtrend. While small-cap stocks as a group outperform
the market during normal times, they severely underperform during market downtrends
due to their lower liquidity. In addition, stocks with high momentum are indicative of
being overbought and have a much greater chance of sudden and strong price reversal.
Price momentum, company size as measured by market capitalization, and
prevailing market condition are the three dimensions that influence stock price
behavior. Each has inherently vague and subjective degrees of measure and so fuzzy
logic [19] is an appropriate tool to assist in the decision-making process. For the
proposed method, fuzzy rules were constructed based on these three factors with
membership functions shown in Figure 1 and fuzzy rule matrix shown in Figure 2. The
60-day RSI indicator was used to indicate both the momentum of stocks and the
prevailing market condition (bull market is characterized by a high RSI value, and vice
versa). There were three linguistic values expressing the momentum – “Weak”,
“Moderate”, and “Strong”, with a typical non-extreme 60-day RSI value ranging
between 40 and 60. For company size, relative ranking of market capitalization was
used instead of the absolute market capitalization of a company. The largest 50 stocks
out of 100 were considered “Large” and “Mid”, with overlapping fuzzy memberships.
The remaining half was considered “Mid” and “Small”, also with overlapping fuzzy
memberships. For output, there were five levels of stock purchase ratings in linguistic
terms: “Strong Buy” (SB), “Buy” (B), “Neutral” (N), “Sell” (S), and “Strong Sell” (SS),
having overlapping numerical scoring ranges between 0 and 10.

Figure 1. Fuzzy membership functions for momentum as measured by RSI (left; linguistic terms Weak/Moderate/Strong over roughly 40-60), company size as measured by market capitalization rank (middle; Small/Mid/Large over roughly 10-90), and purchase rating of stock (right; SS/S/N/B/SB over roughly 1-9).
Mamdani-type [20] fuzzy inference was used to determine the stock purchase rating. For each rule, the intersection of the antecedents was evaluated. The consequents of the rules were then combined using the Root-Sum-Square method, and the Center of Gravity defuzzification process was performed to obtain the final crisp stock purchase rating. The Fuzzy Framework [21] C# library was used to implement the fuzzy rule-based algorithm.
Figure 2. Fuzzy rules for different market conditions as measured by momentum (RSI): weak market (left), moderate market (middle), and strong market (right). Rows give stock momentum; columns give market capitalization (Small / Mid / Large):

Weak market:      Weak: SS, S, N    Moderate: SS, N, B    Strong: SS, B, SB
Moderate market:  Weak: S, S, S     Moderate: N, N, N     Strong: B, B, B
Strong market:    Weak: N, N, N     Moderate: N, N, N     Strong: SB, B, N
During strong market condition, money should be allocated first to small-cap
stocks with strong momentum and second to mid-cap stocks, also with strong
momentum. During weak market condition, small-cap stocks should be avoided and
priority should be given to big-cap stocks with strong momentum. For moderate market
condition, desirability of a stock was decided on its momentum.
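To illustrate the inference step, the following is a minimal sketch (Python) under stated assumptions: the triangular membership breakpoints are plausible values consistent with Figure 1 but not taken from the paper, the rule AND is implemented as min, the Center-of-Gravity step is simplified to a centroid over the output peaks, and only the strong-market rule panel of Figure 2 is encoded.

# Hedged sketch of a Mamdani-style purchase rating for one stock under the strong-market rules.
from math import sqrt

def tri(x, a, b, c):
    """Triangular membership with peak at b and feet at a, c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

MOM = {"Weak": (35, 40, 50), "Moderate": (45, 50, 55), "Strong": (50, 60, 65)}   # assumed RSI terms
CAP = {"Small": (0, 10, 50), "Mid": (10, 50, 90), "Large": (50, 90, 100)}        # assumed size-rank terms
OUT = {"SS": 1, "S": 3, "N": 5, "B": 7, "SB": 9}                                 # peak of each output rating
STRONG_MARKET = {
    ("Weak", "Small"): "N", ("Weak", "Mid"): "N", ("Weak", "Large"): "N",
    ("Moderate", "Small"): "N", ("Moderate", "Mid"): "N", ("Moderate", "Large"): "N",
    ("Strong", "Small"): "SB", ("Strong", "Mid"): "B", ("Strong", "Large"): "N",
}

def rating(rsi_value, cap_rank, rules=STRONG_MARKET):
    strengths = {}                                      # per-output firing strengths
    for (m, c), out in rules.items():
        w = min(tri(rsi_value, *MOM[m]), tri(cap_rank, *CAP[c]))
        strengths[out] = strengths.get(out, 0.0) + w ** 2
    strengths = {k: sqrt(v) for k, v in strengths.items()}      # root-sum-square combination
    num = sum(OUT[k] * v for k, v in strengths.items())         # simplified centroid over peaks
    den = sum(strengths.values())
    return num / den if den else 5.0                            # neutral if no rule fires

print(rating(58.0, 20.0))   # strong momentum, smallish cap -> rating close to "Strong Buy"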
Portfolio readjustments were performed in the same manner as in the previous experiments. The algorithm chose the top quartile of stocks with the best purchase rating computed from the fuzzy rules. The portfolio returned 161.76%, which was better than the best return from the experiment using momentum alone (126.61%) or market capitalization alone (76.96%). The fuzzy rule-based approach also outperformed both the SET100 index benchmark (40.40%) and one of the best actively-managed mutual funds in the industry ("BTP" by BBL Asset Management Co., Ltd., at 124.43%). The results are shown in Figure 3.
Figure 3. Investment returns by algorithm: best result from the momentum-only strategy (126.61%), best result from the market capitalization-only strategy (76.96%), and the fuzzy rule-based method (161.76%). Returns of the SET100 index benchmark and the "BTP" mutual fund are shown for comparison.

3. Conclusions and Future Works

This paper proposes a method that uses fuzzy rule-based inference to rank stocks based
on a combination of price momentum, company’s market capitalization, and prevailing
market condition. The method yields superior return to both the market index
benchmark as well as an industry-leading mutual fund. The method can be further

improved in the future by incorporating the ability to hold cash during market
downturns. Additionally, short-term indicators may also be used to detect imminent
weakening or strengthening of momentum – information that is potentially useful in
making trading decisions.

References

[1] H. Yu, R. Chen, and G. Zhang, A SVM stock selection model within PCA, 2nd International Conference on
Information Technology and Quantitative Management, 2014.
[2] C. Huang, A hybrid stock selection model using genetic algorithms and support vector regression,
Applied Soft Computing, 12 (2012), 807-818.
[3] C. Dong, F. Wan, A fuzzy approach to stock market timing, 7th International Conference on Information,
Communications and Signal Processing, 2009.
[4] A. Escobar, J. Moreno, S. Munera, A technical analysis indicator based on fuzzy logic, Electronic Notes
in Theoretical Computer Science 292 (2013), 27-37.
[5] K. Chourmouziadis, P. Chatzoglou, An intelligent short term stock trading fuzzy system for assisting
investors in portfolio management, Expert Systems with Applications, 43 (2016), 298-311.
[6] M. Yunusoglu, H. Selim, A fuzzy rule based expert system for stock evaluation and portfolio
construction: An application to Istanbul Stock Exchange, Expert Systems with Applications, 40 (2013),
908-920.
[7] A. Andersen, S. Mikelsen, A novel algorithmic trading frame-work applying evolution and machine
learning for portfolio optimization, Master’s Thesis, Norwegian University of Science and Technology,
2012.
[8] J. Bollen, H. Mao, X. Zeng, Twitter mood predicts the stock market, Journal of Computational Science, 2
(2011), 1-8.
[9] L. Wang, Modeling stock price dynamics with fuzzy opinion networks, IEEE Transactions on Fuzzy
Systems, (in press).
[10] Y. Hu, K. Liu, X. Zhang, L. Su, E. W. T. Ngai, M. Liu, Application of evolutionary computation for
rule discovery in stock algorithmic trading: a literature review, Applied Soft Computing, 36 (2015), 534-
551.
[11] C. Chotivetthamrong, Stock market fund flows and return volatility, Ph.D. Dissertation, National
Institute of Development Administration, Thailand, 2014.
[12] W. Laoniramai, Insider trading behavior and news announcement: evidence from the Stock Exchange
of Thailand, CMRI Working Paper, Thai Stock Exchange of Thailand, 2013.
[13] C. Mateepithaktham, Equity mutual fund fees & performance, SEC Working Papers Forum, The
Securities and Exchange Commission, Thailand, 2015.
[14] N. Jegadeesh, S. Titman. Returns to buying winners and selling losers: implications for stock market
efficiency, Journal of Finance, 48 (1993), 65-91.
[15] R. Peachavanish, Stock selection and trading based on cluster analysis of trend and momentum
indicators, International MultiConference of Engineers and Computer Scientists, 2016.
[16] T. Bunsaisup, Selection of investment strategies in Thai stock market, Working Paper, Capital Market
Research Institute, Thailand, 2014.
[17] SETSMART (SET market analysis and reporting tool), http://www.setsmart.com.
[18] J. Welles Wilder, New concepts in technical trading systems, Trend Research, 1978.
[19] L. Zadeh, Fuzzy sets, Information and Control, 8 (1965), 338-353.
[20] E. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller,
International Journal of Man-Machine Studies, 7 (1975), 1-13.
[21] Fuzzy Framework, http://www.codeproject.com/Articles/151161/Fuzzy-Framework.
108 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-108

Adaptive Fuzzy Sliding-Mode Control of


Robot and Simulation
Huan NIU a, Jie YANG a,1 and Jie-Ru CHI b
a School of Electrical Engineering, Qingdao University, Qingdao, Shandong, China
b School of Electronic and Information Engineering, Qingdao University, Qingdao, Shandong, China

Abstract. Aiming at the control of a 2-DOF joint robot, a 3D robot model is first established in ADAMS, and the dynamic equation of the robot is then derived using the obtained parameters. The dynamic model is combined with the control-system model in MATLAB/Simulink through the ADAMS/Controls module to build a co-simulation system. In order to eliminate the effect of modeling error and uncertainty signals, a sliding-mode control is proposed. In this method, a linear sliding surface is used to ensure that the system reaches equilibrium on the sliding surface in finite time, and fuzzy control is used to compensate for the modeling error and uncertainty signals. The equivalent control law and the switching control law are derived using the Lyapunov stability criterion and an exponential reaching law. The fuzzy control law and membership functions are set up using fuzzy control rules. Through online adaptive learning of the fuzzy system, chattering is weakened. Simulation results show that the control method is effective.

Keywords. Joint robot, fuzzy control, sliding-mode control, simulation

Introduction

In order to achieve accurate control of the multi-joint robot system including modeling
errors and uncertainty signals, there have been many effective methods. And the
development of robot control theory has gone through three stages: traditional control,
modern control and intelligent control. The traditional control theory mainly includes
PID control, feed-forward control, and so on; modern control theory mainly includes
robust control, sliding-mode control and so on; intelligent control theory mainly
includes fuzzy control, neural network control, adaptive control, etc. [1-2]. Robot control is divided into point-to-point (PTP) control and trajectory tracking control (also called continuous path control, CP). Point-to-point control only requires that the end effector of the robot be moved from one point to another, without taking the motion trajectory into account. In robot trajectory tracking control, the driving torque of each joint is computed so that the position, velocity and other state variables of the robot track a known ideal trajectory; the entire trajectory must be strictly controlled [3-6].
In recent years, fuzzy control and sliding-mode control have attracted more and more attention for their strong robustness. In sliding-mode control, designing a stable sliding surface ensures that the control system is driven onto the surface from any initial state within finite time and then moves near the equilibrium point on the surface. However, chattering still exists in such control systems, and the upper bound of the modeling error and uncertainty signal must be known in advance, which is hard to achieve in actual robot control [7]. Fuzzy control overcomes these deficiencies and is an effective way to eliminate the chattering of a sliding-mode control system; its strong adaptive learning capability can also be used to weaken the effect of uncertain signals. Therefore, sliding-mode control is combined with fuzzy control to implement trajectory tracking control, which ensures the stability and effectiveness of the control system.
In this paper, the first part introduces the establishment of the 3D model and the derivation of the dynamic equation of the robot; the second part presents the design of the sliding-mode control system; the third part presents the design of the fuzzy control system; the fourth part gives the simulation experiment and simulation results of the robot control system; a brief summary is given at the end of the paper. These results have a certain reference value for future robot control.

1. Mechanical Virtual Prototype System

Firstly, the 3D model of the robot is established in the ADAMS/View module. The robot has two arms and realizes 2-DOF rotary motion in the YOZ plane. The length of each arm is set to 0.225 m and its mass to 0.03 kg, as shown in Figure 1.

Figure 1. 3D model of robot

The barycenter positions of the two robotic arms are x1 = 0.1125, y1 = 0.33, z1 = 0 and x2 = 0.3375, y2 = 0.39, z2 = 0. The inertial parameters of the robot are Ixx = 0.1732, Iyy = 0.1588, Izz = 0.0251. Based on the D-H coordinate method, the dynamic equation of the robot is deduced:
    M(q) [q̈1 q̈2]^T + C(q, q̇) [q̇1 q̇2]^T + G(q) + U(t) = τ   (1)
In Eq. (1), τ is the control torque; q, q̇, q̈ ∈ R^n are the joint angular position, velocity and acceleration vectors. M(q) = [M11 M12; M21 M22] ∈ R^{n×n} is the inertia matrix, with M22 = 0.0252, M11 = 0.004545 cos q2 + 0.005265 sin q2 + 0.0519 and M12 = M21 = 0.00227 cos q2 + 0.00263 sin q2 + 0.0252. C(q, q̇) = [C111 C112 C121 C122; C211 C212 C221 C222] is the centrifugal/Coriolis term, with C111 = C212 = C221 = C222 = 0, C112 = C121 = C122 = 0.0026325 cos q2 + 0.0022725 sin q2 and C211 = 0.0026325 cos q2 + 0.0022725 sin q2. G(q) = [G1 G2]^T is the gravity vector, with G1 = G2 = 0. U(t) is the modeling error and uncertainty signal; it is generally taken to have the same form as the input signal, with an amplitude of 2%-5% of the input signal [8].

2. Sliding-mode Control System of Robot

2.1. The Design of Sliding-mode Surface

The purpose of trajectory tracking control of the robot is to make the joint position vector follow the desired joint angular displacement as closely as possible [9-10]. Therefore, the sliding-mode surface is designed as Eq. (2):
    s = ė + αe   (2)
In Eq. (2), α is the constant of the sliding-mode surface, e = q − qr is the tracking error and ė = q̇ − q̇r is its derivative. The exponential reaching law of the sliding-mode control is designed as
    ṡ = −φ s/‖s‖ − Ks,  with φ, K > 0.

2.2. The Design of Control Law

Combining Eq. (2) with the reaching law, Eq. (3) is obtained:
    τ = u_eq + u_vss   (3)
In Eq. (3):
    u_eq = M(q) q̈r + C(q, q̇) q̇ + G(q) + U(t) − α M(q) ė,   u_vss = −φ M(q) s/‖s‖ − K M(q) s,
where K > κ + ‖U(t)‖ and κ is an arbitrarily small positive number; φ and K are the parameters of the exponential reaching law.
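For illustration, a minimal sketch of Eqs. (2)-(3) follows (Python; the gains are the simulation values from Section 4, M, C and G are placeholders, and the uncertainty term U(t) as well as the fuzzy compensation of Section 3 are omitted).

# Hedged sketch of the sliding surface (2) and the control law (3) for the 2-DOF arm.
import numpy as np

def control(q, dq, qr, dqr, ddqr, M, C, G, alpha=1.0, phi=10.0, K=10000.0, eps=1e-6):
    """Equivalent + switching control torque; eps regularizes ||s|| as in the simulation setup."""
    e, de = q - qr, dq - dqr                       # tracking error and its derivative
    s = de + alpha * e                             # linear sliding surface, Eq. (2)
    u_eq = M @ ddqr + C @ dq + G - alpha * (M @ de)          # equivalent control (reconstructed form)
    norm_s = np.linalg.norm(s) + eps
    u_vss = -phi * (M @ s) / norm_s - K * (M @ s)            # switching term from the reaching law
    return u_eq + u_vss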

3. Fuzzy Control System of Robot

3.1. The Design of Fuzzy Control Rule

In a multi-joint robot system, the effect of modeling error and uncertainty signals always exists, so sliding-mode control is usually combined with fuzzy control to weaken this effect and ensure the stability and effectiveness of the control system. Fuzzy reasoning is used to establish the fuzzy rules. The fuzzy sets are defined as shown in Table 1:
Table 1. Rule set of the fuzzy controller (rows: s; columns: ṡ).
        ṡ:   NB  NM  NS  ZO  PS  PM  PB
s = PB:      NB  NB  NM  PM  PB  PB  PB
s = PM:      NB  NM  NS  PS  PM  PB  PB
s = PS:      NM  NS  NS  PS  PS  PM  PB
s = ZO:      NM  NS  NS  ZO  PS  PS  PM
s = NS:      PB  PM  PS  PS  NS  NS  NM
s = NM:      PB  PB  PM  PS  NS  NM  NB
s = NB:      PB  PB  PB  PM  NM  NB  NB
In Table 1, NB denotes negative big, NM negative medium, NS negative small, ZO zero, PS positive small, PM positive medium and PB positive big. The fuzzy rules take the IF-THEN form:
    R^l: IF s_i is A and ṡ_i is B THEN u_fi is C,
where A, B and C are taken from Table 1.

3.2. The Design of Membership Function and Control Law

The membership functions of the fuzzy controller are set up with the fuzzy logic toolbox of MATLAB/Simulink. The basic form is the triangular membership function (trimf); the range of values is set to [−3, 3]; defuzzification uses the average-maximum membership method.
After the fuzzy control is introduced into the sliding-mode control, the control law is changed to the form of Eq. (4):
    τ = u_eq + u_vss + u_f   (4)
In Eq. (4), u_f = u_f(x|θ) = [u_f1(x1|θ1), u_f2(x2|θ2)]^T is the output of the fuzzy controller; x_i = [s_i, ṡ_i] is the input of the fuzzy controller; θ̇_i = r_i s_i ξ(x) is the adaptive law; and r_i is the learning coefficient of the control system.
3.3. Stability Analysis of Fuzzy Control

For the 2-DOF robot, it is assumed that the upper bound of the modeling error and uncertainty signal is |U_i(t)| ≤ L_i. The optimal approximation parameter of the adaptive law is
    θ_i* = arg min_{θ_i ∈ R} [ sup | u_fi(x_i|θ_i) − (L_i + κ) sign(s_i) | ],
the adaptation error is θ̃_i = θ_i − θ_i*, the upper bound of the approximation error of the fuzzy controller is [w1, w2], and the minimal approximation error of the fuzzy controller is ε_i = u_fi(x_i|θ_i*) − (L_i + κ) sign(s_i) + w_i. The Lyapunov function is
    V = (1/2) s^T M(q) s + (1/2) Σ_{i=1}^{2} (1/r_i) θ̃_i^T θ̃_i.
Taking its derivative,
    V̇ = (1/2) s^T Ṁ(q) s + s^T M(q) ṡ + Σ_{i=1}^{2} (1/r_i) θ̃_i^T θ̃̇_i
       = Σ_{i=1}^{2} [ −(L_i + κ)|s_i| + ε_i s_i + U_i(t) s_i + w_i s_i ] < 0.   (5)
The result of Eq. (5) shows that the control system is globally stable.

4. Simulation Experiment

The control system is set up in MATLAB/Simulink, as shown in Figure 2.

Figure 2. Control system


The physical parameters of the 2-DOF robot control system are set as follows: sliding-mode surface parameters α1 = α2 = 1; exponential reaching law parameters φ = 10, K = 10000; ‖s‖ is replaced with ‖s‖ + 0.000001 to prevent singularity; a memory module is used to prevent an algebraic loop, with its parameter set to 1; learning coefficients r_i = [0.85 0.85 0.85 0.85 0.85 0.85]; desired trajectories qr1 = 1 − cos(πt), qr2 = 0.5 − 0.5 cos(πt); modeling error and uncertainty signals U1(t) = 0.5 sin 0.5t, U2(t) = 0.1 sin 0.1t.
An S-Function is written in MATLAB for the simulation. MATLAB is connected with ADAMS through the Controls interface, and the simulation of the control system is carried out. The tracking curve of joint 1, the tracking curve of joint 2 and the error curves are obtained, as shown in Figures 3, 4 and 5.
If PD control is used, the control law is Eq. (6):
    τ_i = k_pi e_i + k_di ė_i   (6)

Figure 3. Trace Curve of Joint_1 Figure 4. Trace Curve of Joint_2

Figure 5. Error Curve of Joint1/2 Figure 6. Error Curve of PD Control


The error curve of PD control is shown in Figure 6.
From Figures 3, 4 and 5, the adaptive fuzzy sliding-mode control shows good trajectory tracking ability. When the interfering signal is introduced, the control system returns to steady-state operation near the equilibrium point of the sliding-mode surface (s = 0), so the control system is effective and robust. There is no obvious chattering in the simulation, so the control system meets the design requirements. Comparing Figure 5 with Figure 6 shows that the adaptive fuzzy sliding-mode control is superior to PD control: under the same interference signal, the disturbance rejection of PD control is poorer, its control error increases and its control precision drops sharply. This further verifies the validity of the adaptive fuzzy sliding-mode control.

5. Conclusions

For the position control of a 2-DOF joint robot subject to modeling error and uncertainty signals, an adaptive fuzzy sliding-mode control is proposed. Simulation experiments are conducted in MATLAB and ADAMS, and the results of the adaptive fuzzy sliding-mode control are compared with PD control. The simulation results show that the adaptive fuzzy sliding-mode control is effective and robust, with no obvious chattering in the control system, and that its trajectory tracking is more effective than that of PD control. This control policy is therefore practically operable, and the study provides guidance of practical and theoretical value for future robot control.

Acknowledgment

This work is supported by the Science & Technology Project of College and University
in Shandong Province (J15LN41).

References

[1] J. X. Lv, Y. H. Li, X. Z. Wang, X. L. Bao, Mechanical structure optimization and power fuzzy control
design of picking robot end effector, Journal of Agricultural Mechanization Research, 38(2016): 36-40.
[2] S. H. Ju, Y. M. Li, Research on nonholonomic mobile robot based on self-adjusting universe fuzzy
control, Electronic Design Engineering, 24(2016), 103-106.
[3] Z. M. Ju, Fuzzy control applied to wheel-type robot target tracking, Computer Measurement &
Control, 22(2014): 614-616.
[4] J. L. Zhang, Comprehensive obstacle avoidance system based on the fuzzy control for cleaning robot,
Machine Tool & Hydraulics, 18(2014): 92-95.
[5] Z. B. Ma, Self-adjusting parameter fuzzy control for self-balancing two-wheel robots, Techniques of
Automation and Applications, 33(2014): 9-13
[6] S. B. Hu, M. X. Lu, Fuzzy integral sliding mode control for three-links spatial robot, Computer
Simulation, 20(2012): 162-166.
[7] L. Lin, H. R. Wang, Y. N. Hu, Fuzzy adaptive sliding mode control for trajectory tracking of uncertain
robot based on saturated function, Machine Tool& Hydraulics, 36(2008): 137-140.
[8] C. Z. Xu, Y. C. Wang, Nonsingular terminal fuzzy sliding mode control for multi-link robots based on
back stepping, Electrical Automation, 34(2012): 8-9.
[9] W. D. Gao, Y. M. Fang, W. L. Zhang, Application of adaptive fuzzy sliding mode control to
servomotor system, Small& Special Electrical Machines, 37(2009): 32-36.
[10] T. W. Wu, Y. S. Yang, Research on simulation of adaptive sliding-mode guidance law, Modern
Electronics Technique, 34(2011): 23-25.
Fuzzy Systems and Data Mining II 115
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-115

Hesitant Bipolar Fuzzy Set and Its Application in Decision Making
Ying HAN 1 , Qi LUO and Sheng CHEN
B-DAT & CICAEET, Nanjing University of Information Science and Technology,
Jiangsu, Nanjing 210044, P. R. China

Abstract. In this paper, by combining the hesitant fuzzy set with the bipolar-valued fuzzy set, the concept of the hesitant bipolar fuzzy set is introduced, and a hesitant bipolar fuzzy group decision making method based on TOPSIS is proposed. Our study is the first to integrate fuzziness, hesitation and incompatible bipolarity in a multiple criteria decision making method. An illustrative case of chemical project evaluation demonstrates the feasibility, validity, and necessity of the proposed method.
Keywords. Fuzzy set, Bipolar-valued fuzzy set, Hesitant fuzzy set, Multiple criteria decision making, Incompatible bipolarity

Introduction

As an extension of the fuzzy set [1], the hesitant fuzzy set (HFS) was introduced by Torra and Narukawa to describe the case in which the membership degree of an element to a given set has several different possible values, which arises from the hesitation decision makers hold [2]. A growing number of studies focus on HFS, and several extensions have been presented, such as the interval-valued HFS [3], the possible-degree generalized HFS [4] and the linguistic HFS [5].
On the other hand, in recent years incompatible bipolarity has attracted researchers' attention, and some instructive results have been devoted to it [6,7]. In fact, incompatible bipolarity is inevitable in the real world. Consider the example of the psychological disease bipolar disorder: a patient suffering from bipolar disorder has episodes of mania and depression, and the two poles may simultaneously reach extreme cases, i.e., the sum of the positive pole value and the negative pole value is bigger than 1. The bipolar-valued fuzzy set (BVFS) has been shown to be suitable for handling incompatible bipolarity [8,9].
The aforementioned HFS and its extensions cannot accommodate incompatible bipolarity. Since BVFS is adept at modeling incompatible bipolarity, by combining BVFS with HFS, the hesitant bipolar fuzzy set (HBFS) is introduced in this paper, and a hesitant bipolar fuzzy multiple criteria group decision making (MCGDM) method based on TOPSIS [10] is presented. Our study is the first to accommodate fuzziness, hesitation, and incompatible bipolarity in fuzzy set theory and multiple criteria decision making.
The rest of the paper is structured as follows. In Section 1, some related notions are reviewed, the concept of HBFS is introduced and some related properties are discussed. In Section 2, a hesitant bipolar fuzzy group decision making method based on TOPSIS is presented. In Section 3, an illustrative case of chemical project evaluation is included to show the feasibility, validity, and necessity of the theoretical results obtained. Finally, the paper is concluded in Section 4.
1 Corresponding Author: Ying Han, B-DAT & CICAEET, Nanjing University of Information Science and Technology, Jiangsu, Nanjing 210044, P. R. China; E-mail: hanyingcs@163.com.
Throughout the paper, we denote I^P = [0, 1] and I^N = [−1, 0]. The set X always represents a finite universe of discourse.

1. Hesitant Bipolar Fuzzy Set

In this section, firstly, some related notions are reviewed. Then, the concept of HBFS is introduced and some related properties are discussed.
In [2], Torra and Narukawa suggested the concept of HFS, which permits the membership degree of an element to a set to be presented as several possible values in I^P. In [11], a bipolar-valued fuzzy set B in X is defined as B = {< x, B(x) = (B^P(x), B^N(x)) > | x ∈ X}, where the functions B^P : X → I^P, x ↦ B^P(x) ∈ I^P and B^N : X → I^N, x ↦ B^N(x) ∈ I^N define the satisfaction degree of the element x ∈ X to the corresponding property and to the implicit counter-property of the BVFS B in X, respectively. Denote L = {α = (α^P, α^N) | α^P ∈ I^P, α^N ∈ I^N}; then α is called a bipolar-valued fuzzy number (BVFN) in [9]. For any α = (α^P, α^N) and β = (β^P, β^N), the preference order relation is defined as α ≤ β if and only if α^P ≤ β^P and α^N ≤ β^N. This preference order relation is partial. Denote α^M = (α^P + α^N)/2; if α ≤ β, then α^M ≤ β^M, so all BVFNs can be ranked according to their mediation values [9].
Next, the concept of the HBFS is introduced, accommodating fuzziness, hesitation,
and incompatible bipolarity in fuzzy set theory for the first time.

Definition 1 A hesitant bipolar fuzzy set in X is defined as Ã = {< x, h̃_Ã(x) > | x ∈ X}, where h̃_Ã(x) is a set of some different BVFNs in L, representing the possible bipolar membership degrees of the element x ∈ X to the set Ã. For convenience, h̃_Ã(x) is called a hesitant bipolar fuzzy element (HBFE), the basic unit of an HBFS.

Inspired by the work on HFS by Xia et al. [12], for an HBFE h̃_Ã(x) it is necessary to arrange the BVFNs in h̃_Ã(x) in increasing order according to the mediation value. Suppose that l(h̃_Ã(x)) stands for the number of BVFNs in the HBFE h̃_Ã(x) and that h̃_Ã^{σ_j}(x) is the jth largest BVFN in h̃_Ã(x). Given two different HBFSs Ã, B̃ in X, denote l_x = max{l(h̃_Ã(x)), l(h̃_B̃(x))}. If l(h̃_Ã(x)) ≠ l(h̃_B̃(x)), then the shorter one should be extended by adding its largest value until it has the same length as the longer one.
In the rest of the paper, the set of all HBFSs in X is denoted by F̃(X). An HBFE is denoted by h̃ for simplicity, and the set of all h̃ is denoted by L̃. The preference order relation in L̃ is defined as follows.

Definition 2 Let h̃_1, h̃_2 ∈ L̃. The preference order relation in L̃ is defined as follows: h̃_1 ≤ h̃_2 if and only if (h̃_1^{σ_j})^P ≤ (h̃_2^{σ_j})^P and (h̃_1^{σ_j})^N ≤ (h̃_2^{σ_j})^N for every j, where h̃_i^{σ_j} is the jth largest BVFN in h̃_i (i = 1, 2) according to the mediation value.

The aggregation operator is the fundamental element of an MCGDM method; thus, by introducing some operations on HBFEs, a hesitant bipolar fuzzy aggregation operator is proposed.

Definition 3 Let h̃_1, h̃_2 ∈ L̃ and λ > 0. Operations in L̃ are defined as follows:

h̃_1 ⊗ h̃_2 = ⋃_{γ̃_1 ∈ h̃_1, γ̃_2 ∈ h̃_2} { (γ̃_1^P · γ̃_2^P, −γ̃_1^N · γ̃_2^N) },

(h̃_1)^λ = ⋃_{γ̃_1 ∈ h̃_1} { ((γ̃_1^P)^λ, −|γ̃_1^N|^λ) }.

Definition 4 Let h̃_i ∈ L̃ (i = 1, 2, · · · , n) and let w = (w_1, w_2, · · · , w_n) be the weight vector of the h̃_i, satisfying w_i ∈ I^P and ∑_{i=1}^n w_i = 1. Then the hesitant bipolar fuzzy weighted geometric (HBFWG) operator is the mapping defined by

HBFWG(h̃_1, h̃_2, · · · , h̃_n) = ⊗_{i=1}^n h̃_i^{w_i}.    (1)
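To make the operator concrete, the following is a minimal Python sketch (not part of the paper) of the HBFWG operator of Eq. (1) built from the operations of Definition 3; the function name hbfwg and the sample values are illustrative assumptions.

```python
# Sketch of the HBFWG operator: each BVFN is a (positive, negative) pair, an HBFE is a
# list of BVFNs.  For every combination of BVFNs drawn from the argument HBFEs, the
# weighted geometric rule (prod p_i^{w_i}, -prod |n_i|^{w_i}) is applied.
from itertools import product

def hbfwg(hbfes, weights):
    result = []
    for combo in product(*hbfes):              # one BVFN taken from every HBFE
        pos, neg = 1.0, 1.0
        for (p, n), w in zip(combo, weights):
            pos *= p ** w
            neg *= abs(n) ** w
        result.append((pos, -neg))
    return result

# aggregating two experts' single-valued evaluations with weights w = (0.7, 0.3)
print(hbfwg([[(0.9, -0.7)], [(0.8, -0.6)]], [0.7, 0.3]))
```

For the HBFEs ([0.9, −0.7]) and ([0.8, −0.6]) with weights (0.7, 0.3) the printed pair is approximately (0.8688, −0.6684), which agrees with the first entry of Table 3 below up to rounding.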

A distance measure is needed in the TOPSIS method; next, an axiomatic definition of distance for HBFSs is introduced.

Definition 5 For any Ã, B̃, C̃ ∈ F̃(X), an operation d̃ : F̃(X) × F̃(X) → I^P is called a distance on F̃(X) if it satisfies the following conditions: 1° 0 ≤ d̃(Ã, B̃) ≤ 1, and d̃(Ã, B̃) = 0 if and only if Ã = B̃; 2° d̃(Ã, B̃) = d̃(B̃, Ã); 3° d̃(Ã, C̃) ≤ d̃(Ã, B̃) + d̃(B̃, C̃).

A distance on HBFSs is proposed in the next example.

Example 1 Let ω_i ∈ I^P (i = 1, 2, · · · , n) satisfy ∑_{i=1}^n ω_i = 1. For any Ã, B̃ ∈ F̃(X), the weighted Hamming distance d̃_wh between Ã and B̃ is defined as follows:

d̃_wh(Ã, B̃) = ∑_{i=1}^n ω_i · (1/(2 l_{x_i})) ∑_{j=1}^{l_{x_i}} ( |(h̃_Ã^{σ_j})^P(x_i) − (h̃_B̃^{σ_j})^P(x_i)| + |(h̃_Ã^{σ_j})^N(x_i) − (h̃_B̃^{σ_j})^N(x_i)| ).    (2)
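The following is a minimal Python sketch (not part of the paper) of the weighted Hamming distance (2), including the rule that the shorter HBFE is padded with its largest BVFN; the names mediation, extend and d_wh are illustrative assumptions.

```python
# Sketch of Eq. (2): an HBFS over X = {x_1,...,x_n} is a list of HBFEs, each HBFE a list
# of (positive, negative) BVFN pairs sorted by mediation value.
def mediation(bvfn):
    p, n = bvfn
    return (p + n) / 2.0                       # alpha^M = (alpha^P + alpha^N) / 2

def extend(hbfe, length):
    """Sort by mediation value and pad the shorter HBFE with its largest BVFN."""
    h = sorted(hbfe, key=mediation)
    return h + [h[-1]] * (length - len(h))

def d_wh(A, B, weights):
    total = 0.0
    for hA, hB, w in zip(A, B, weights):
        l = max(len(hA), len(hB))
        hA, hB = extend(hA, l), extend(hB, l)
        s = sum(abs(a[0] - b[0]) + abs(a[1] - b[1]) for a, b in zip(hA, hB))
        total += w * s / (2 * l)
    return total

# two single-criterion HBFSs, with values in the style of Table 1 / Table 2
A = [[(0.8, -0.6), (0.7, -0.4)]]
B = [[(0.8, -0.5)]]
print(d_wh(A, B, weights=[1.0]))               # 0.075 for this pair
```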

2. Hesitant Bipolar Fuzzy Multiple Criteria Decision Making Method

In this section, based on the theoretical results of the previous section, a hesitant bipolar fuzzy MCGDM method based on TOPSIS is presented.
Consider an MCGDM problem with hesitant bipolar fuzzy information. Let {x_1, · · · , x_m} be the set of alternatives, let {c_1, · · · , c_n} be the set of evaluation criteria and let t experts be invited to make the evaluation. The hesitant bipolar fuzzy evaluation value of alternative x_i with respect to criterion c_j given by the sth expert is denoted by the HBFE h̃_ij^s; then the hesitant bipolar fuzzy matrix (HBFM) given by the sth expert is H̃^s = (h̃_ij^s)_{m×n} (i = 1, · · · , m; j = 1, · · · , n; s = 1, · · · , t). Suppose all the

BVFNs in h̃_ij^s are arranged in increasing order according to the mediation value. The weight vector of the experts is assumed to be known as w = (w_1, · · · , w_t), satisfying w_s ∈ I^P and ∑_{s=1}^t w_s = 1, and the weight vector of the criteria is assumed to be known as ω = (ω_1, · · · , ω_n), satisfying ω_j ∈ I^P and ∑_{j=1}^n ω_j = 1.
The hesitant bipolar fuzzy multiple criteria decision making method based on TOPSIS is given as follows:
Step 1. Use (1) to aggregate the HBFMs H̃^s into the comprehensive HBFM H̃ = (h̃_ij)_{m×n} (i = 1, · · · , m; j = 1, · · · , n), where h̃_ij = HBFWG(h̃_ij^1, h̃_ij^2, · · · , h̃_ij^t).
Step 2. Denote l_j = max_{i=1,···,m} {l(h̃_ij)}. For j = 1, · · · , n, if l(h̃_ij) < l_j, add the largest value of h̃_ij to it until its length equals l_j. Then compute

(h̃_j)^* = ( (max_{i=1,···,m} (h̃_ij^{σ(1)})^P, max_{i=1,···,m} (h̃_ij^{σ(1)})^N), · · · , (max_{i=1,···,m} (h̃_ij^{σ(l_j)})^P, max_{i=1,···,m} (h̃_ij^{σ(l_j)})^N) )    (3)

and

(h̃_j)_* = ( (min_{i=1,···,m} (h̃_ij^{σ(1)})^P, min_{i=1,···,m} (h̃_ij^{σ(1)})^N), · · · , (min_{i=1,···,m} (h̃_ij^{σ(l_j)})^P, min_{i=1,···,m} (h̃_ij^{σ(l_j)})^N) ).    (4)

Then h̃^* = {(h̃_1)^*, · · · , (h̃_n)^*} is the positive ideal point and h̃_* = {(h̃_1)_*, · · · , (h̃_n)_*} is the negative ideal point.
Step 3. Denote h̃_i = {h̃_i1, · · · , h̃_in}. For each i = 1, · · · , m, compute the distance (d̃_i)^* between h̃_i and h̃^*, and the distance (d̃_i)_* between h̃_i and h̃_*, by (2).
Step 4. Compute ξ_i = (d̃_i)_* / ((d̃_i)_* + (d̃_i)^*), i = 1, · · · , m.
Step 5. Rank the alternatives according to the principle that the larger ξ_i is, the better the alternative x_i is.

3. Case Study

In this section, a chemical project evaluation problem is presented to demonstrate how to use the proposed method to make an evaluation under incompatible bipolarity and hesitation information.

Example 2 Consider a chemical project evaluation problem. Suppose there are four chemical projects {x_1, x_2, x_3, x_4} to be evaluated, and two experts are invited to make the evaluation. The evaluation criteria are c_1: economy, c_2: environment and c_3: society. Consider, for instance, the economy criterion of a project: in the short term it may bring huge benefits to the company, so its positive evaluation value is 0.8; on the other hand, in the long run the pollution it causes needs a huge amount of money to fix, so its negative evaluation value is 0.7 in magnitude. The sum of the two poles is 1.5, bigger than 1, i.e., there exists incompatible bipolarity. Moreover, when making an evaluation, an expert may hesitate among several memberships; thus, the evaluation values given by the sth expert for alternative x_i with respect to criterion c_j are denoted by the HBFE h̃_ij^s. Suppose all the BVFNs in h̃_ij^s are arranged in increasing order according to the mediation value. The HBFMs given by Expert 1 and Expert 2 are presented in Table 1 and Table 2, respectively. The weight vectors of the experts and the criteria are given as w = (0.7, 0.3) and ω = (0.3, 0.5, 0.2), respectively.

Table 1 HBFM given by Expert 1


X c1 c2 c3
x1 ([0.9, −0.7]) ([0.8, −0.6], [0.7, −0.4]) ([0.7, −0.4], [0.8, −0.3])
x2 ([0.8, −0.1]) ([0.8, −0.3]) ([0.9, −0.2], [0.8, −0.1])
x3 ([0.7, −0.2]) ([0.6, −0.3]) ([0.7, −0.1])
x4 ([0.6, −0.4]) ([0.5, −0.3]) ([0.7, −0.4])

Table 2 HBFM given by Expert 2


X c1 c2 c3
x1 ([0.8, −0.6]) ([0.8, −0.5]) ([0.8, −0.4])
x2 ([0.8, −0.2]) ([0.7, −0.1]) ([0.8, −0.2], [0.7, −0.1])
x3 ([0.6, −0.3]) ([0.7, −0.2], [0.8, −0.4]) ([0.8, −0.3])
x4 ([0.6, −0.3]) ([0.6, −0.2], [0.7, −0.1]) ([0.6, −0.4])

Next, we show how the proposed method is used to make the evaluation.
Step 1. Use (1) to aggregate the HBFMs H̃^s given by the experts into the comprehensive HBFM H̃ = (h̃_ij)_{4×3}.
The comprehensive HBFM is given in Table 3.

Table 3 Comprehensive HBFM


X c1 c2
x1 ([0.8688, −0.6684]) ([0.8000, −0.5681], [0.7286, −0.4277])
x2 ([0.8000, −0.1231]) ([0.7686, −0.2158])
x3 ([0.6684, −0.2259]) ([0.6284, −0.2656], [0.6541, −0.3270])
x4 ([0.6000, −0.3669]) ([0.5281, −0.2656], [0.5531, −0.2158])
X c3
x1 ([0.7286, −0.4000], [0.8000, −0.3270])
x2 ([0.8688, −0.2000], [0.7686, −0.1000], [0.8000, −0.1231], [0.8346, −0.1625])
x3 ([0.7286, −0.1390])
x4 ([0.6684, −0.4000])

Step 2. Compute the positive ideal point h̃^* by (3) and the negative ideal point h̃_* by (4).
By (3), we have h̃^* = {([0.8688, −0.1231]), ([0.8000, −0.2158], [0.7686, −0.2158]), ([0.8688, −0.1390], [0.8000, −0.1000], [0.8688, −0.1231], [0.8000, −0.1390])}.
By (4), we have h̃_* = {([0.6000, −0.6684]), ([0.5281, −0.5681], [0.5531, −0.4277]), ([0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000])}.
Step 3. Compute the distances (d̃_i)^* between h̃_i and h̃^*, and (d̃_i)_* between h̃_i and h̃_*, by (2), i = 1, 2, 3, 4.
By (2), we have (d̃_1)^* = 0.1850, (d̃_2)^* = 0.0199, (d̃_3)^* = 0.1116, (d̃_4)^* = 0.1934; (d̃_1)_* = 0.1178, (d̃_2)_* = 0.2878, (d̃_3)_* = 0.1977, (d̃_4)_* = 0.1093.
Step 4. Compute ξ_i = (d̃_i)_* / ((d̃_i)_* + (d̃_i)^*), i = 1, 2, 3, 4.
We have ξ_1 = 0.3890, ξ_2 = 0.9352, ξ_3 = 0.6392, ξ_4 = 0.3610.
Step 5. Rank the alternatives according to the principle.

We get the conclusion that x2 is the optimal project.
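As a check on Steps 4 and 5, the following minimal Python sketch (not part of the paper) recomputes the closeness coefficients from the distances reported above and ranks the alternatives; it reproduces the printed ξ values (up to rounding) and ranks x_2 first.

```python
# Steps 4-5 of the method applied to the distances reported in the case study.
d_pos = [0.1850, 0.0199, 0.1116, 0.1934]    # (d_i)^*: distance to the positive ideal point
d_neg = [0.1178, 0.2878, 0.1977, 0.1093]    # (d_i)_*: distance to the negative ideal point

xi = [dn / (dn + dp) for dp, dn in zip(d_pos, d_neg)]
ranking = sorted(range(1, 5), key=lambda i: -xi[i - 1])   # larger xi is better
print([round(v, 4) for v in xi])            # approx [0.3890, 0.9352, 0.6392, 0.3610]
print("ranking:", ranking)                  # x2 ranked first
```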


Comparison
If we consider only the positive evaluations of the criteria in Example 2, we get (ξ_1)^P = 0.9230, (ξ_2)^P = 0.8377, (ξ_3)^P = 0.3796, (ξ_4)^P = 0, and then x_1 is the optimal project. This result differs from the one obtained when incompatible bipolarity is considered. Compared with existing methods, by accommodating incompatible bipolarity, fuzziness and hesitation information in decision making for the first time, our method is better suited to the urgent demands of environmental and resource protection.

4. Conclusions

In this paper, by combining the hesitant fuzzy set with the bipolar-valued fuzzy set, the concept of the hesitant bipolar fuzzy set was introduced, and a hesitant bipolar fuzzy group decision making method was presented. Our study is the first to accommodate fuzziness, hesitation, and incompatible bipolarity in information processing. In future work, we will try to combine rough set theory with the hesitant bipolar fuzzy set.

Acknowledgements

This work was supported in part by the Joint Key Grant of National Natural Science
Foundation of China and Zhejiang Province (U1509217), the National Natural Sci-
ence Foundation of China (61503191) and the Natural Science Foundation of Jiangsu
Province, China (BK20150933).

References

[1] L.A. Zadeh, Fuzzy sets, Inform. and Control, 8 (1965) 338–353.
[2] V. Torra and Y. Narukawa, On hesitant fuzzy sets and decision, in: the 18th IEEE International Confer-
ence on Fuzzy Systems, Korea, 2009, 1378–1382.
[3] N. Chen, Z.S. Xu and M.M. Xia, Correlation coefficients of hesitant fuzzy sets and their applications to
clustering analysis, Applied Mathematical Modeling, 37 (2013) 2197–2211.
[4] Y. Han, Z.Z. Zhao, S. Chen and Q.T. Li, Possible-degree generalized hesitant fuzzy set and its Applica-
tion in MADM, Advances in Intelligent Systems and Computing, 27 (2014) 1–12.
[5] F.Y. Meng and X.H. Chen, A hesitant fuzzy linguistic multi-granularity decision making model based
on distance measures, Journal of Intelligent and Fuzzy Systems, 28 (2015) 1519–1531.
[6] J. Montero, H. Bustince, C. Franco, J.T. Rodríguez, D. Gómez, M. Pagola, J. Fernández and E. Bar-
renechea, Paired structures in knowledge representation, Knowledge-Based Systems, 100 (2016) 50–58.
[7] C.G. Zhou, X.Q. Zeng, H.B. Jiang, L.X. Han, A generalized bipolar auto-associative memory model
based on discrete recurrent neural networks, Neurocomputing, 162 (2015) 201–208.
[8] H. Bustince, E. Barrenechea, M. Pagola, J. Fernandez, Z.S. Xu, B. Bedregal, J. Montero,H. Hagras,
F. Herrera and B.D. Baets, A historical account of types of fuzzy sets and their relationships, IEEE
Transactions on Fuzzy Systems, 24 (2016) 179–194.
[9] Y. Han, P. Shi and S. Chen, Bipolar-valued rough fuzzy set and its applications to decision information
system, IEEE Transactions on Fuzzy Systems, 23 (2015) 2358–2370.
[10] Y.J. Lai, T.Y. Liu and C.L. Hwang, TOPSIS for MODM, European Journal of Operational Research, 76
(1994) 486–500.
[11] W.R. Zhang, Bipolar fuzzy sets and relations: a computational framework for cognitive modeling and
multiagent decision analysis, Proceedings of IEEE Conf., 1994: 305–309.
[12] M.M. Xia and Z.S. Xu, Hesitant fuzzy information aggregation in decision making, International Journal of Approximate Reasoning, 52 (2011) 395–407.
Fuzzy Systems and Data Mining II 121
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-121

Chance Constrained Twin Support Vector Machine for Uncertain Pattern Classification
Ben-Zhang YANG a, Yi-Bin XIAO b, Nan-Jing HUANG a,1 and Qi-Lin CAO c,2
a Department of Mathematics, Sichuan University, Chengdu, Sichuan, P.R. China
b Department of Mathematics, University of Electronic Science and Technology of China, Chengdu, Sichuan, P.R. China
c Business School, Sichuan University, Chengdu, Sichuan, P.R. China

Abstract. In this paper, using a chance constrained programming formulation, a new chance constrained twin support vector machine (CC-TWSVM) is proposed. The paper studies twin support vector machine classification when the data points are uncertain due to statistical measurement noise. With some properties of the distribution known, the CC-TWSVM model aims to ensure a small probability of misclassification for the uncertain data. We also provide an equivalent second-order cone programming (SOCP) model of the CC-TWSVM model using the moment information of the uncertain data. The dual problem of the SOCP model is introduced, so the optimal value of the CC-TWSVM model can be computed directly. In addition, we show the performance of the CC-TWSVM model on artificial and real data through numerical experiments.
Keywords. support vector machine, robust optimization, chance constraints, uncertain classification.

Introduction

Nowadays, support vector machines (SVMs) are considered one of the most effective learning methods for classification. The main idea of this classification technique is to map the data to a higher dimensional space using kernel methods and then determine a hyperplane that separates the binary classes with maximal margin [1,2].
Binary data classification methods have made substantial progress in recent years. Mangasarian et al. [3] proposed the generalized eigenvalue proximal support vector machine (GEPSVM). Different from the canonical SVM, GEPSVM aims to find two optimal nonparallel planes such that each hyperplane is closer to its own class and as far as possible from the other class. Motivated by GEPSVM, Jayadeva et al. [4] proposed the twin support vector machine (TWSVM) for binary data classification. The main idea of TWSVM is to generate two nonparallel planes with properties similar to those in GEPSVM; different from GEPSVM, however, the two planes in TWSVM are obtained from two related programming problems. At the same time, the ν-TWSVM [5] was proposed as an extension of TWSVM for handling outliers. Further extensions of TWSVM can be found in [6].
1 Corresponding Author: Nan-Jing Huang, Department of Mathematics, Sichuan University, Chengdu, Sichuan, P.R. China, 610000, E-mail: nanjinghuang@hotmail.com.
2 Corresponding Author: Qi-Lin Cao, Business School, Sichuan University, Chengdu, Sichuan, P.R. China, 610000, E-mail: qlcao@scu.edu.cn.
For the above-mentioned methods, the parameters in the training data sets are implicitly assumed to be known exactly. However, in real-world applications, parameters are perturbed because they are estimated from measured data subject to statistical error [7]. For instance, real data points often incorporate uncertain information in automatic acoustic identification and other imbalanced data problems [8]. When the data points are uncertain, several SVM models for handling uncertainty have been proposed as developments of the previous models. Trafalis et al. [9] proposed a robust optimization model for the case in which the noise of the uncertain data is norm-bounded. Robust optimization [10] was also introduced for the case of chance constraints. The purpose of robust optimization with chance constraints is to ensure a small probability of misclassification under the uncertainty. More precisely, this assurance requires that a maximum-margin linear classifier constructed from the random variables is obtained with high probability; it also means that the probability that a point of one class is assigned to the other is controlled by an extremely low value. Ben-Tal et al. [11,12] employed the moment information of uncertain training data to develop a chance-constrained SVM (CC-SVM) model. However, to the best of our knowledge, no research has considered chance-constrained optimization for the TWSVM problem. Therefore, it is interesting and important to study TWSVM with chance constraints for the uncertain data classification problem, and the main purpose of this paper is to make an attempt in this direction.
Combining the ability of chance constraints to handle uncertainty with the benefits of TWSVM, in this paper we propose a chance constrained twin support vector machine (CC-TWSVM). The main technique of this paper is to use the moment information of the uncertain data to transform the chance constrained program into a second-order cone program. Section 1 recalls SVM and TWSVM briefly. In Section 2, we introduce the CC-TWSVM model. Experimental results on uncertain data sets are presented in Section 3. Conclusions are provided in Section 4.

1. Preliminaries

In this section, we briefly recall some concepts of SVM and TWSVM for the binary classification problem.

1.1. SVM

Let us consider the linearly separable classification problem. Given the training set

{(x_1, y_1), · · · , (x_l, y_l)} ⊆ R^m × {−1, +1},

SVM aims to find an optimal hyperplane w^T x + b = 0 which separates the data into two classes by maximizing the distance 2/‖w‖_2 between the two support hyperplanes. This can be formulated as follows:

min_{w,b}  (1/2) ‖w‖_2^2 + C ∑_{i=1}^l ξ_i
s.t.  y_i (w^T x_i + b) ≥ 1 − ξ_i,    (1)
      ξ_i ≥ 0, i = 1, · · · , l.

After solving (1), a new point is classified as class +1 or class −1 according to the final decision function f(x) = sgn(w^T x + b).
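For illustration, the following is a minimal sketch (not from the paper) of the primal problem (1) solved with the CVXPY modelling library; the toy data X, y and the value of C are illustrative assumptions.

```python
# Soft-margin SVM primal (1) solved as a quadratic program with CVXPY.
import cvxpy as cp
import numpy as np

X = np.array([[1.0, 2.0], [2.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]])
y = np.array([1, 1, -1, -1])
C = 1.0
l, m = X.shape

w, b = cp.Variable(m), cp.Variable()
xi = cp.Variable(l, nonneg=True)
prob = cp.Problem(
    cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)),
    [cp.multiply(y, X @ w + b) >= 1 - xi])
prob.solve()
print(np.sign(X @ w.value + b.value))        # f(x) = sgn(w^T x + b) on the training points
```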

1.2. TWSVM

Consider a binary classification problem with l_1 positive points and l_2 negative points (l_1 + l_2 = l). Suppose the data points belonging to the positive class are denoted by A ∈ R^{l_1×n}, where each row A_i ∈ R^n (i = 1, · · · , l_1) represents a data point with label +1. Similarly, B ∈ R^{l_2×n} represents all the data points with label −1. TWSVM determines two nonparallel hyperplanes

f_+(x) = w_+^T x + b_+ = 0  and  f_−(x) = w_−^T x + b_− = 0,    (2)

where w_+, w_− ∈ R^n and b_+, b_− ∈ R. Here, each hyperplane is close to one of the two classes and is at least one unit of distance from the points of the other class. The formulations of TWSVM are as follows:

min_{w_+, b_+}  (1/2) ‖A w_+ + e_+ b_+‖_2^2 + C_1 e_−^T ξ
s.t.  −(B w_+ + e_− b_+) + ξ ≥ e_−, ξ ≥ 0    (3)

and

min_{w_−, b_−}  (1/2) ‖B w_− + e_− b_−‖_2^2 + C_2 e_+^T η
s.t.  (A w_− + e_+ b_−) + η ≥ e_+, η ≥ 0,    (4)

where C_1, C_2 are pre-specified penalty factors and e_+, e_− are vectors of ones of the corresponding dimensions; it is apparent from the formulations that e_+ is l_1-dimensional and e_− is l_2-dimensional. The nonparallel hyperplanes (2) are obtained by solving (3) and (4). A new point x is then assigned to the class r given by the decision function

r = arg min_{r=+,−} |x^T w_r + b_r|,    (5)

where r represents the class label +1 or −1.
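The following is a minimal sketch (not from the paper's code) of the first TWSVM subproblem (3) solved with the CVXPY modelling library; the toy matrices A and B and the penalty C1 are illustrative assumptions.

```python
# TWSVM subproblem (3): the plane close to the +1 class and at distance >= 1 from the -1 class.
import cvxpy as cp
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 3.0], [1.5, 2.5]])    # +1 class, l1 x n
B = np.array([[-1.0, 0.0], [-2.0, 1.0]])              # -1 class, l2 x n
C1 = 1.0
l1, n = A.shape
l2 = B.shape[0]

w_plus, b_plus = cp.Variable(n), cp.Variable()
xi = cp.Variable(l2, nonneg=True)
objective = cp.Minimize(0.5 * cp.sum_squares(A @ w_plus + b_plus) + C1 * cp.sum(xi))
constraints = [-(B @ w_plus + b_plus) + xi >= 1]
cp.Problem(objective, constraints).solve()
print(w_plus.value, b_plus.value)
```

Solving (4) in the same way, with the roles of A and B exchanged, yields the second hyperplane, and a new point is assigned by the decision rule (5).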

2. Chance Constrained Twin Support Vector Machine

In this section, we briefly introduce chance constrained programming (CCP) and propose a chance constrained twin support vector machine (CC-TWSVM) to handle uncertain data points.
When uncertain noise exists in the dataset, the TWSVM model needs to be modified to incorporate the uncertainty information. Suppose there are l_1 and l_2 training data points in

R^n. Use Ã_i = [Ã_{i1}, · · · , Ã_{in}], i = 1, · · · , l_1, to denote the uncertain data points with positive label +1, and let B̃_i = [B̃_{i1}, · · · , B̃_{in}], i = 1, · · · , l_2, denote the uncertain data points with negative label −1. Then Ã = [Ã_1, · · · , Ã_{l_1}]^T and B̃ = [B̃_1, · · · , B̃_{l_2}]^T represent the two data sets. The chance-constrained program is to determine two nonparallel planes such that each hyperplane is close to its own class in the sense of expectation and is as far as possible from the other class in probability. The chance-constrained TWSVM formulations are

min_{w_+, b_+}  (1/2) E{‖Ã w_+ + e_+ b_+‖_2^2} + C_1 ∑_{i=1}^{l_1} ξ_i
s.t.  P{−(B̃_i w_+ + b_+) ≤ 1 − ξ_i} ≤ ε,    (6)
      ξ_i ≥ 0, i = 1, · · · , l_1

and

min_{w_−, b_−}  (1/2) E{‖B̃ w_− + e_− b_−‖_2^2} + C_2 ∑_{i=1}^{l_2} η_i
s.t.  P{(Ã_i w_− + b_−) ≤ 1 − η_i} ≤ ε,    (7)
      η_i ≥ 0, i = 1, · · · , l_2,

where E{·} denotes the expectation under the corresponding distribution, C_1, C_2 are user-specified regularization parameters, 0 < ε < 1 is a parameter close to 0, and P{·} is the probability distribution of the uncertain data points of the two classes. The objective functions of the model ensure that the distance between each hyperplane and its own class is minimized on average. The chance constraints impose an upper bound on the probability that a point is misclassified into the other class; they thus have the advantage of guaranteeing correct classification with high probability, and the resulting maximum-margin planes are robust to uncertainties in the data. However, the two quadratic optimization problems (6) and (7) with chance constraints are non-convex, so the model is difficult to solve directly. Using suitable probability inequalities is an effective technique for dealing with CCP: when the mean and covariance matrix of the uncertain data points are known, the multivariate Chebyshev bound [13,14,15] can be adopted to express the chance constraints via robust optimization.
Let X ∼ (μ, Σ) denote a random vector X with mean μ and covariance matrix Σ. The multivariate Chebyshev inequality states that, for any closed convex set S,

sup_{X∼(μ,Σ)} P{X ∈ S} = 1 / (1 + d^2),  where  d^2 = inf_{X∈S} (X − μ)^T Σ^{−1} (X − μ).    (8)

Assume the first and second moment information of the random variables Ã_i and B̃_i is known. Let μ_i^+ = E[Ã_i] and μ_i^− = E[B̃_i] be the mean vectors, and let Σ_i^+ = E[(Ã_i − μ_i^+)^T (Ã_i − μ_i^+)] and Σ_i^− = E[(B̃_i − μ_i^−)^T (B̃_i − μ_i^−)] be the covariance matrices of the uncertain points of the two data sets, respectively. Then problems (6) and (7) can be reformulated respectively as:

min_{w_+, b_+}  (1/2) w_+^T G^+ w_+ + (w_+^T μ^{+T}) b_+ + (1/2) l_1 b_+^2 + C_1 ∑_{i=1}^{l_1} ξ_i
s.t.  −(μ_i^− w_+ + b_+) ≥ 1 − ξ_i + k ‖(Σ_i^−)^{1/2} w_+‖,    (9)
      ξ_i ≥ 0

and

min_{w_−, b_−}  (1/2) w_−^T G^− w_− + (w_−^T μ^{−T}) b_− + (1/2) l_2 b_−^2 + C_2 ∑_{i=1}^{l_2} η_i
s.t.  μ_i^+ w_− + b_− ≥ 1 − η_i + k ‖(Σ_i^+)^{1/2} w_−‖,    (10)
      η_i ≥ 0,

where k = √((1 − ε)/ε) and

G^+ = ∑_{i=1}^{l_1} (μ_i^{+T} μ_i^+ + Σ_i^+),    μ^+ = ∑_{i=1}^{l_1} μ_i^+,

with

G^− = ∑_{i=1}^{l_2} (μ_i^{−T} μ_i^− + Σ_i^−),    μ^− = ∑_{i=1}^{l_2} μ_i^−.

Let

H^+ = (1/2) ( G^+   μ^{+T}
              μ^+   l_1  ).    (11)

Then the matrix H^+ is positive semidefinite. To ensure the strict convexity of problem (9), we can always add a perturbation εI (ε > 0, I the identity matrix) so that the matrix H^+ + εI is positive definite. Without loss of generality, suppose that H^+ is positive definite.
The dual problems of the chance-constrained TWSVM models (9) and (10) can be formulated as the following models:

max_{λ, ν}  ∑_{i=1}^{l_1} λ_i − (1/2) s^{+T} H_1^{+T} G^+ H_1^+ s^+ − (1/2) l_1 s^{+T} H_2^{+T} H_2^+ s^+ − μ^+ H_1^+ s^+ · H_2^+ s^+
s.t.  s^+ = −∑_{i=1}^{l_1} λ_i (μ_i^− + k (Σ_i^−)^{1/2} ν),    (12)
      0 ≤ λ_i ≤ C_1, ‖ν‖ ≤ 1

and

max_{γ, υ}  ∑_{i=1}^{l_2} γ_i − (1/2) s^{−T} H_1^{−T} G^− H_1^− s^− − (1/2) l_2 s^{−T} H_2^{−T} H_2^− s^− − μ^− H_1^− s^− · H_2^− s^−
s.t.  s^− = −∑_{i=1}^{l_2} γ_i (μ_i^+ − k (Σ_i^+)^{1/2} υ),    (13)
      0 ≤ γ_i ≤ C_2, ‖υ‖ ≤ 1,

where

(H^+)^{−1} = [H_1^+, H_2^+],    (H^−)^{−1} = [H_1^−, H_2^−].

3. Numerical Experiments

In this section, the CC-TWSVM model is illustrated by numerical tests on two types of data sets. The first test demonstrates the performance of CC-TWSVM on artificial data. In the second test, we evaluate the performance of the CC-TWSVM model on real-world classification data sets from the UCI Machine Learning Repository. All results were averaged over 10 train-test experiments and were obtained in MATLAB R2012a on a 2.5 GHz CPU with 2.5 GB of usable RAM. The SeDuMi 3 software is employed to solve the SOCP problems of CC-TWSVM.

3.1. Artificial data

To give a direct interpretation of the CC-TWSVM performance, we randomly generate one uncertain set of 2-dimensional data. The parameters of the normal distributions of the two classes are

μ^+ = (0, 2)^T,  Σ^+ = ( 1 0 ; 0 4 ),  μ^− = (−1, 0)^T,  Σ^− = ( 7 0 ; 0 3 ).
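The following is a minimal sketch (not from the paper's code) that draws the two Gaussian classes with the distribution parameters given above; the 50-points-per-class and 20-training-points setting follows the description below.

```python
# Generate the artificial uncertain data set for the first experiment.
import numpy as np

rng = np.random.default_rng(0)
mu_pos, Sigma_pos = np.array([0.0, 2.0]), np.diag([1.0, 4.0])
mu_neg, Sigma_neg = np.array([-1.0, 0.0]), np.diag([7.0, 3.0])

X_pos = rng.multivariate_normal(mu_pos, Sigma_pos, size=50)   # +1 class
X_neg = rng.multivariate_normal(mu_neg, Sigma_neg, size=50)   # -1 class
train_idx = rng.choice(50, size=20, replace=False)            # 20 random training points
print(X_pos[train_idx].shape, X_neg[train_idx].shape)
```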


(a) ε = 0.1 (b) ε = 0.01

Figure 1. The performance of CC-TWSVM in the first data set.

Figure 1 shows the performance of CC-TWSVM on the uncertain data set. In the numerical experiments, the data points of each class are generated from the respective distribution: one class is generated from the normal distribution (μ^+, Σ^+) and the other from (μ^−, Σ^−). Each class has 50 points; 20 points are randomly picked as training points and the remaining points are used as test points. In Figure 1, the blue stars are the points of the +1 class, while the red circles are the points of the −1 class. For simplicity, we set ε to 0.1 and 0.01, respectively. The penalty parameters C_1 and C_2 are selected from the set {10^i | i = −5, · · · , 5}. After 10 experiments, we obtain the parameters of the two hyperplanes and take the average parameters as the final result. The blue and red lines are the separating hyperplanes (in 2-D) that we seek. In fact, the value of the parameter ε also affects the determination of the two hyperplanes: when ε decreases from 0.1 to 0.01, the average accuracy of the classifier is higher and the planes are closer to their corresponding classes. Figures 1(a) and (b) show the effect of the different parameter values.
3 http://sedumi.ie.lehigh.edu/

3.2. Real data

This section presents results on two real data sets used in the numerical experiments:
• WBCD The Wisconsin Breast Cancer Diagnostic data set was obtained from the UCI repository [16]. The WBCD data are 10-dimensional. The data set has 699 samples, of which 444 benign samples are labeled as class +1 and the remaining malignant samples as class −1.
• IONOSPHERE The Ionosphere data set was also collected from the UCI repository. The Ionosphere data are 34-dimensional. The data set has 351 samples, of which 225 good samples are labeled as class +1 and the remaining as class −1.
The distribution properties are often unknown and need to be estimated from the data points. If an uncertain point x̃_i = [x̃_{i1}, · · · , x̃_{in}]^T has N samples x_{ik}, k = 1, · · · , N, then the sample mean

x̄_i = (1/N) ∑_{k=1}^N x_{ik}

is used to estimate the mean vector μ_i = E[x̃_i], and the sample covariance

S_i = (1/(N − 1)) ∑_{k=1}^N (x_{ik} − x̄_i)(x_{ik} − x̄_i)^T

is used to estimate the covariance matrix Σ_i = E[(x̃_i − μ_i)(x̃_i − μ_i)^T].

These estimates, however, can introduce errors in situations where the mean vector μ_i and the covariance matrix Σ_i cannot be obtained reliably. Pardalos et al. [17] have discussed ways of handling these special cases, and in our experiments we employ their methods to modify the estimation when necessary.
Since the data sets are uncertain, the measures of performance are worth studying. Ben-Tal et al. [11] proposed using the nominal error and the optimal error to evaluate performance, and we use these indices to measure the accuracy of our model. The formula for NomErr is

NomErr = ( ∑_i 1_{y_i^{pre} ≠ y_i} ) / (number of training data points) × 100%.

The optimal error (OptErr) is defined on the basis of the misclassification probability. The chance constraints in models (6) and (7) can be reformulated as (9) and (10), from which we can derive the least value of ε, called ε_opt. The OptErr of a data point x_i is defined as

OptErr_i = 1 if y_i^{pre} ≠ y_i,  and  OptErr_i = ε_opt if y_i^{pre} = y_i,

and the OptErr of the data set is defined as

OptErr = ( ∑_i OptErr_i ) / (number of training data points) × 100%.
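The following is a minimal sketch (not from the paper's code) of the two error measures above; the label and ε_opt arrays are illustrative assumptions.

```python
# NomErr and OptErr: 1 for a misclassified point, eps_opt for a correctly classified one.
import numpy as np

y_true  = np.array([1, 1, -1, -1, 1])
y_pred  = np.array([1, -1, -1, -1, 1])
eps_opt = np.array([0.02, 0.05, 0.03, 0.04, 0.01])    # optimal epsilon per point

nom_err = np.mean(y_pred != y_true) * 100                             # NomErr in percent
opt_err = np.mean(np.where(y_pred != y_true, 1.0, eps_opt)) * 100     # OptErr in percent
print(nom_err, opt_err)                                # OptErr is never smaller than NomErr
```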

We first tested on WBCD. Because each data point in WBCD has 10 attributes, solving the SOCP directly would take too much time, so we used principal component analysis (PCA) to extract the two most important features. Then 80% of the data points were used for training and the remainder as test data. For the parameter ε, the three values {0.1, 0.05, 0.01} were adopted separately. As in the experiments on artificial data, the penalty parameters C_1 and C_2 were selected from the set {10^i | i = −5, · · · , 5}.


(a) NomErr (b) OptErr (c) Training time

Figure 2. The performance of CC-TWSVM in the Wisconsin breast cancer data set.

The average results over 10 runs are shown in Figure 2. In Figure 2(a), it is clear that NomErr decreases slightly when ε descends from 0.1 to 0.01; this is because ε represents the upper bound on the misclassification probability. The result for OptErr in Figure 2(b) shows the same trend: when ε decreases from 0.1 to 0.01, the average OptErr rate decreases from about 5.4% to about 5.3%. We can therefore conclude that the classification accuracy improves when the parameter ε decreases. From the definitions of OptErr and NomErr, it is not difficult to see from the first two panels that OptErr is larger than NomErr. In addition, the model takes more time when ε increases; this is because the solution process of the second-order cone programming problem depends heavily on the parameters.


(a) NomErr (b) OptErr (c) Training Time

Figure 3. The performance of CC-TWSVM in the Ionosphere set.



Table 1. Misclassification rates of the different models


Data Sets Classes Instance Features TWSVM CC-SVM CC-TWSVM
Bliver 2 345 6 0.3521 0.3514 0.3504
Heart-c 2 303 14 0.1867 0.1875 0.1802
Hepatitis 2 155 19 0.2082 0.2074 0.1991
Ionosphere 2 351 34 0.0633 0.0625 0.0604
Votes 2 435 16 0.0824 0.0816 0.0736
WBCD 2 699 10 0.1643 0.1606 0.1578

The average results for the Ionosphere set over 10 runs are shown in Figure 3. Similar to the procedure for WBCD, we obtained 3 principal attributes of Ionosphere by PCA. Based on these principal components, 80% of the data points were used for training and the remainder as test data. For the parameter ε, the three values {0.1, 0.05, 0.01} were adopted, and the penalty parameters C_1 and C_2 were selected from the set {10^i | i = −5, · · · , 5}. We again conclude that the classification accuracy improves when the parameter ε decreases, and in this experiment it is also easy to see that OptErr is larger than NomErr. Moreover, because SeDuMi is used to solve the SOCP, the model takes more time when ε increases.
We also compared our model with previous models, namely TWSVM and CC-SVM. The experimental data sets were "Bliver", "Heart-c", "Hepatitis", "Ionosphere", "Votes" and "WBCD", selected from the UCI repository. In the experiments, the penalty parameters of the three models were all the same; they were selected from the set {10^i | i = −5, · · · , 5}. The parameter ε in the CC-SVM and CC-TWSVM models was selected from the set {0.1, 0.05, 0.01}, and 80% of the data points were used for training and the remainder as test data. The comparison of the previous models with our model is given in Table 1. It is easy to see that the average misclassification rate of CC-TWSVM is lower than that of the original TWSVM. Furthermore, the performance of CC-TWSVM is better than that of CC-SVM. This is consistent with the observation that two nonparallel planes have advantages over a single hyperplane.

4. Conclusions

A new chance constrained twin support vector machine (CC-TWSVM), based on a chance constrained programming formulation, was proposed; it can handle data sets with measurement noise efficiently. This paper studied twin support vector machine classification when the data are statistically uncertain. With chance constrained programming (CCP) in the model, CC-TWSVM ensures a low probability of classification error under the uncertainty. The CC-TWSVM model can be transformed into a second-order cone program (SOCP) using the moment information of the uncertain points, and the dual problem of the SOCP model was also introduced; the twin hyperplanes are then obtained by solving the dual problem. In addition, we showed the performance of the CC-TWSVM model on artificial and real data through numerical experiments. In future work, we will consider how to make the model more robust; dealing with nonlinear classification under chance constraints is also of interest.

Acknowledgement

This work was supported by the joint Foundation of the Ministry of Education of China and China Mobile Communication Corporation (MCM20150505) and the Fundamental Research Funds for the Central Universities of Sichuan University (skqy201646).

References

[1] B. Scholkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regulation, Optimiza-
tion, and Beyond, MIT press, Cambridge, 2002.
[2] B. Z. Yang, M. H. Wang, H. Yang, T. Chen, Ramp loss quadratic support vector machine for classification, Nonlinear Analysis Forum, 21 (2016), 101-115.
[3] O. Mangasarian, E. Wild, Multisurface proximal support vector classification via generalized eigenval-
ues, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28 (2006), 69-74.
[4] Jayadeva, R. Khemchandani, S. Chandra, Twin support vector machines for pattern classification, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 29 (2007), 905-910.
[5] X. J. Peng, A v-twin support vector machine (v-TWSVM) classifier and its geometric algorithms, Infor-
mation Sciences, 180 (2010), 3863-3875.
[6] Y. J. Lee, O. L. Mangasarian, SSVM: a smooth support vector machine for classification, Computational
Optimization and Applications, 20 (2001), 5-22.
[7] D. Goldfarb, G. Iyengar, Robust convex quadratically constrained programs, Mathematical Program-
ming, 97 (2009), 495-515.
[8] Paul Bosch, Julio López, Héctor Ramı́rez, Hugo Robotham, Support vector machine under uncertainty:
An application for hydroacoustic classification of fish-schools in Chile, Expert Systems with Applica-
tions, 40 (2013), 4029-4034.
[9] T. B. Trafalis, R. C. Gilbert, Robust classification and regression using support vector machines, Euro-
pean Journal of Operational Research, 173, (2006), 893-909.
[10] C. Bhattacharyya, L. R. Grate, M. I. Jordan, G. L. El, I. S. Mian, Robust sparse hyperplane classifier:
application to uncertain molecular profiling data, Journal of Computational Biology, 11 (2004), 1073-
1089.
[11] A. Ben-Tal, S. Bhadra, C. Bhattacharyya, J.S. Nash, Chance constrained uncertain classification via robust optimization, Mathematical Programming, 127 (2011), 145-173.
[12] A. Ben-Tal, A. Nemirovski, Selected topics in robust convex optimization, Mathematical Programming,
112 (2008), 125-158.
[13] D. Bertsimas, I. Popescu, Optimal inequalities in probability theory: a convex optimization approach, SIAM Journal on Optimization, 15 (2005), 780-804.
[14] A. W. Marshall, I. Olkin, Multivariate Chebyshev inequalities, The Annals of Mathematical Statistics, 31 (1960), 1001-1014.
[15] A. Nemirovski, A. Shapiro, Convex approximations of chance constrained programs, SIAM Journal on Optimization, 17 (2010), 969-996.
[16] A. Frank and A. Asuncion, UCI Machine Learning Repository, 2010. Available at
http://archive.ics.uci.edu/ml.
[17] X. Wang, N. Fan, P. M. Pardalos, Robust chance-constrained support vector machines with second-order
moment information. Annals of Operations Research, (2015), 10.1007/s10479-015-2039-6
Fuzzy Systems and Data Mining II 131
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-131

Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics
Eunsuk YANG 1
Department of Philosophy, Chonbuk National University, Jeonju, KOREA

Abstract. This paper deals with non-algebraic binary relational semantics, called
here set-theoretic Kripke-style semantics, for monoidal t-norm (based) logics. For
this, we first introduce the system MTL (Monoidal t-norm logic) and some of its
prominent axiomatic extensions, and then their corresponding Kripke-style seman-
tics. Next, we provide set-theoretic completeness results for them.
Keywords. relational semantics, (set-theoretic) Kripke-style semantics, substructural
logic, fuzzy logic, t-norm (based) logics

1. Algebraic Kripke-style Semantics

After algebraic semantics for t-norm (based) logics were introduced, corresponding Kripke-style semantics were developed. For instance, after Esteva and Godo introduced algebraic semantics for monoidal t-norm (based) logics in [4], corresponding Kripke-style semantics were introduced by Montagna and Ono [6], Montagna and Sacchetti [7], and Diaconescu and Georgescu [3]. These semantics share one important feature:
• While such semantics are called Kripke-style semantics in the sense that they are given via forcing relations, they are still algebraic in the sense that their completeness results are obtained using the fact that these semantics are equivalent to algebraic semantics.
Because of this fact, Yang [8,9,10] called these semantics algebraic Kripke-style semantics. Although non-algebraic Kripke-style semantics, where "non-algebraic" means that completeness is proved without using the above fact, have been provided for some particular systems (see e.g. [9]), such semantics have not yet been established for basic fuzzy logics in general.
The aim of this paper is to provide set-theoretic Kripke-style semantics for basic core
fuzzy logics2 . As its starting point, we investigate set-theoretic Kripke-style semantics
for the logic system MTL (Monoidal t-norm logic) and its most prominent axiomatic extensions.
1 Corresponding Author: Eunsuk Yang, Department of Philosophy & Institute of Critical Thinking and Writing, Chonbuk National University, Rm 307, College of Humanities Blvd. (14-1), Jeonju, 54896, KOREA. Email: eunsyang@jbnu.ac.kr.
2 Here, fuzzy logics are logics complete with respect to (w.r.t.) linearly ordered algebras, and core fuzzy logics are logics complete w.r.t. the real unit interval [0, 1] (see [1,2]).
For this, first, in Section 2, we discuss monoidal t-norm (based) logics and
their corresponding Kripke-style semantics. Next, in Section 3, we provide set-theoretic
completeness results for them.
For convenience, we adopt notations and terminology similar to those in [1,7,8,9,10]
and assume reader familiarity with them (together with results found therein).

2. Monoidal T-norm Logics and Kripke-style Semantics

Monoidal t-norm (based) logics are based on a countable propositional language with the set of formulas FOR built inductively from a set of propositional variables VAR, propositional constants ⊤, ⊥, and binary connectives →, &, ∧, and ∨. Further connectives are defined as follows:

df1. ϕ ↔ ψ := (ϕ → ψ) ∧ (ψ → ϕ), and
df2. ¬ϕ := ϕ → ⊥.

The constant ⊤ may be defined as ⊥ → ⊥. Henceforth, the customary notations and terminology are followed, and the axiom systems are used to provide a consequence relation.
We first list the axioms and rules of MTL, the most basic monoidal t-norm logic.

Definition 1. The logic MTL is axiomatized as follows:

A1. (ϕ → ψ) → ((ψ → χ) → (ϕ → χ)) (suffixing, SF)
A2. ϕ → ϕ (reflexivity, R)
A3. (ϕ ∧ ψ) → ϕ, (ϕ ∧ ψ) → ψ (∧-elimination, ∧-E)
A4. ((ϕ → ψ) ∧ (ϕ → χ)) → (ϕ → (ψ ∧ χ)) (∧-introduction, ∧-I)
A5. ϕ → (ϕ ∨ ψ), ψ → (ϕ ∨ ψ) (∨-introduction, ∨-I)
A6. ((ϕ → χ) ∧ (ψ → χ)) → ((ϕ ∨ ψ) → χ) (∨-elimination, ∨-E)
A7. ⊥ → ϕ (ex falsum quodlibet, EF)
A8. ϕ → ⊤ (verum ex quodlibet, VE)
A9. (ϕ → (ψ → χ)) ↔ (ψ → (ϕ → χ)) (permutation, PM)
A10. (ϕ → (ψ → χ)) ↔ ((ϕ&ψ) → χ) (residuation, RES)
A11. (ϕ&ψ) → ϕ (weakening, W)
A12. (ϕ → ψ) ∨ (ψ → ϕ) (prelinearity, PL)
ϕ → ψ, ϕ ⊢ ψ (modus ponens, mp)
ϕ, ψ ⊢ ϕ ∧ ψ (adjunction, adj)

Well-known monoidal t-norm logics are axiomatic extensions (extensions for short)
of MTL. We introduce some prominent examples.

Definition 2. The following are famous monoidal t-norm logics extending MTL:

• Basic fuzzy logic BL is MTL plus (DIV) (ϕ ∧ ψ) → (ϕ&(ϕ → ψ)).


• Łukasiewicz logic Ł is BL plus (DNE) ¬¬ϕ → ϕ.
• Gödel logic G is BL plus (CTR) ϕ → (ϕ&ϕ).
• Product logic Π is BL plus (CAN) (ϕ → ⊥) ∨ ((ϕ → (ϕ&ψ)) → ψ).

For easy reference, we let Ls be a set of the monoidal t-norm logics defined previ-
ously.

Definition 3. Ls = {MTL, BL, Ł, G, Π}.


For convenience, “,”, “⊥,” “¬,” “→,” “∧,” and “∨” are used ambiguously as propo-
sitional constants and connectives and as top and bottom elements and frame operators,
but context should clarify their meanings.
Now we provide Kripke-style semantics for Ls. First, Kripke frames are defined as
follows.
Definition 4. (Cf. [7,8,10])
(i) (Kripke frame) A Kripke frame is a structure X = (X, ⊤, ⊥, ≤, ∗), where (X, ⊤, ≤, ∗) is an integral linearly ordered commutative monoid such that ∗ is residuated, i.e., for every a, b ∈ X, the set {c : c ∗ a ≤ b} has a supremum, denoted here a → b. The elements of X are called nodes.
(ii) (L frame) An MTL frame is a Kripke frame where ∗ is conjunctive (i.e., ⊥ ∗ ⊤ = ⊥) and left-continuous (i.e., if sup{c_i : i ∈ I} exists, then sup{c_i ∗ a : i ∈ I} = sup{c_i : i ∈ I} ∗ a). Consider the following conditions: for all a, b ∈ X,
• (DIV^F) min{a, b} ≤ a ∗ (a → b).
• (DNE^F) ¬¬a ≤ a.
• (CTR^F) a ≤ a ∗ a.
• (CAN^F) ⊤ = a → ⊥ or ⊤ = (a → (a ∗ b)) → b.
BL frames are MTL frames satisfying (DIV^F); Ł frames are BL frames satisfying (DNE^F); G frames are BL frames satisfying (CTR^F); Π frames are BL frames satisfying (CAN^F). We call all these frames (including MTL frames) L frames.
An evaluation on a Kripke frame is a forcing relation ⊩ between nodes and propositional variables, constants, and arbitrary formulas satisfying the following conditions. For every propositional variable p,
(Atomic hereditary condition, AHC) if a ⊩ p and b ≤ a, then b ⊩ p;
(min) ⊥ ⊩ p;
for the propositional constant ⊥,
(⊥) a ⊩ ⊥ iff a = ⊥;
and for arbitrary formulas,
(∧) a ⊩ ϕ ∧ ψ iff a ⊩ ϕ and a ⊩ ψ;
(∨) a ⊩ ϕ ∨ ψ iff either a ⊩ ϕ or a ⊩ ψ;
(&) a ⊩ ϕ&ψ iff there exist b, c ∈ X such that a ≤ b ∗ c, b ⊩ ϕ, and c ⊩ ψ;
(→) a ⊩ ϕ → ψ iff for each b ∈ X, if b ⊩ ϕ, then a ∗ b ⊩ ψ.
Definition 5. (i) (Kripke model) A Kripke model is a pair (X, ⊩), where X is a Kripke frame and ⊩ is a forcing relation on X.
(ii) (L model) An L model is a pair (X, ⊩), where X is an L frame and ⊩ is a forcing relation on X.

Definition 6. Given a Kripke model (X, ⊩), a node a of X and a formula ϕ, we say that a forces ϕ when a ⊩ ϕ. We say that ϕ is true in (X, ⊩) if ⊤ ⊩ ϕ, and that ϕ is valid in the frame X (expressed by X |= ϕ) if ϕ is true in (X, ⊩) for each forcing relation ⊩ on X.
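To illustrate Definitions 4-6, the following is a minimal Python sketch (not part of the paper) of an MTL frame built from the Gödel t-norm min on a finite chain, with the forcing relation evaluated by brute force; the formula encoding, the chosen chain and the function names are illustrative assumptions.

```python
# A finite MTL frame (chain with * = min) and the forcing relation of Definition 4.
from itertools import product

CHAIN = [0.0, 0.25, 0.5, 0.75, 1.0]          # nodes; 1.0 plays the role of top, 0.0 of bottom
star = min                                    # Goedel t-norm: conjunctive and left-continuous

def residuum(a, b):
    """a -> b = max{c : c * a <= b}; on a finite chain the supremum is attained."""
    return max(c for c in CHAIN if star(c, a) <= b)

def forces(a, phi, v):
    """a ||- phi, where v maps propositional variables to values in the chain."""
    op = phi[0]
    if op == 'var':
        return a <= v[phi[1]]                 # atomic case, as in Proposition 1 below
    if op == 'bot':
        return a == 0.0
    if op == 'and':
        return forces(a, phi[1], v) and forces(a, phi[2], v)
    if op == 'or':
        return forces(a, phi[1], v) or forces(a, phi[2], v)
    if op == 'fuse':                          # strong conjunction &
        return any(a <= star(b, c) and forces(b, phi[1], v) and forces(c, phi[2], v)
                   for b, c in product(CHAIN, CHAIN))
    if op == 'imp':
        return all(forces(star(a, b), phi[2], v)
                   for b in CHAIN if forces(b, phi[1], v))
    raise ValueError(op)

# residuation check: residuum(a, b) * a <= b for all a, b
assert all(star(residuum(a, b), a) <= b for a, b in product(CHAIN, CHAIN))

p, q = ('var', 'p'), ('var', 'q')
prelinearity = ('or', ('imp', p, q), ('imp', q, p))       # axiom A12
assert all(forces(1.0, prelinearity, {'p': vp, 'q': vq})
           for vp, vq in product(CHAIN, CHAIN))
print("prelinearity (A12) is forced by the top node under every valuation on this chain")
```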

3. Soundness and Completeness for Ls

We first introduce two lemmas, which can be easily proved:

Lemma 1. (Hereditary condition, HC) Let X be a Kripke frame. For every formula ϕ and any two nodes a, b ∈ X, if a ⊩ ϕ and b ≤ a, then b ⊩ ϕ.

Lemma 2. ⊤ ⊩ ϕ → ψ iff for every a ∈ X, if a ⊩ ϕ, then a ⊩ ψ.

We then provide soundness and completeness results for Ls.

Proposition 1. ([6]) Let X = (X, ⊤, ⊥, ≤, ∗) be an L frame and v an evaluation in X. For each atomic formula p and each a ∈ X, let a ⊩ p iff a ≤ v(p). Then (X, ⊩) is an L model, and for each formula ϕ and each a ∈ X, we have that a ⊩ ϕ iff a ≤ v(ϕ).

Proposition 2. (Soundness) For any formula ϕ, if ⊢_L ϕ, then ϕ is valid in every L frame.

Proof. Here we consider the formulas (PL), (DIV), (DNE), (CTR) and (CAN) as examples.
(PL): By the condition (∨), it is sufficient to prove that ⊤ ⊩ ϕ → ψ or ⊤ ⊩ ψ → ϕ. By Proposition 1, we can instead show that ⊤ ≤ v(ϕ → ψ) or ⊤ ≤ v(ψ → ϕ). Proposition 1 also ensures v(ϕ → ψ) = v(ϕ) → v(ψ) for all formulas ϕ and ψ. If v(ϕ) ≤ v(ψ), then ⊤ ∗ v(ϕ) ≤ v(ψ) and thus ⊤ ≤ v(ϕ → ψ). If v(ψ) ≤ v(ϕ), then ⊤ ∗ v(ψ) ≤ v(ϕ) and thus ⊤ ≤ v(ψ → ϕ).
(DIV): Lemma 2 ensures that, in order to prove ⊤ ⊩ (ϕ ∧ ψ) → (ϕ&(ϕ → ψ)), it is sufficient to show that for each node a ∈ X, if a ⊩ ϕ ∧ ψ, then a ⊩ ϕ&(ϕ → ψ). By Proposition 1, we can instead assume a ≤ v(ϕ ∧ ψ) and show a ≤ v(ϕ&(ϕ → ψ)). Note that Proposition 1 also ensures v(ϕ ∧ ψ) = min{v(ϕ), v(ψ)} and v(ϕ&ψ) = v(ϕ) ∗ v(ψ) for all formulas ϕ and ψ. Then, since min{v(ϕ), v(ψ)} ≤ v(ϕ) ∗ (v(ϕ) → v(ψ)) by (DIV^F), we have a ≤ v(ϕ&(ϕ → ψ)).
(DNE): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ¬¬ϕ, then a ⊩ ϕ. By Proposition 1, we instead assume a ≤ v(¬¬ϕ) and show a ≤ v(ϕ). Note that v(¬ϕ) = v(ϕ → ⊥) = ¬v(ϕ). Then, since a ≤ v(¬¬ϕ) = ¬¬v(ϕ) and ¬¬v(ϕ) ≤ v(ϕ) by (DNE^F), we have a ≤ v(ϕ).
(CTR): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ϕ, then a ⊩ ϕ&ϕ. Let a ⊩ ϕ. By Proposition 1, we have a ≤ v(ϕ). Then, using the monotonicity of ∗ and (CTR^F), we also have a ≤ a ∗ a ≤ v(ϕ) ∗ v(ϕ). Hence, by the condition (&) and Proposition 1, we obtain a ⊩ ϕ&ϕ.
(CAN): We need to show that either ⊤ ⊩ ϕ → ⊥ or ⊤ ⊩ (ϕ → (ϕ&ψ)) → ψ. Obviously, v(ϕ) = ⊥ ensures ⊤ ≤ v(ϕ → ⊥), since then v(ϕ → ⊥) = v(⊥) → v(⊥) = ⊤. Thus, by Proposition 1, we have ⊤ ⊩ ϕ → ⊥ in case v(ϕ) = ⊥. Let v(ϕ) ≠ ⊥. In order to prove ⊤ ⊩ (ϕ → (ϕ&ψ)) → ψ, we assume a ⊩ ϕ → (ϕ&ψ) and show a ⊩ ψ. By Proposition 1, we instead assume a ≤ v(ϕ → (ϕ&ψ)) and show a ≤ v(ψ). Then, since a ≤ v(ϕ → (ϕ&ψ)) = v(ϕ) → v(ϕ&ψ) = v(ϕ) → (v(ϕ) ∗ v(ψ)) and ⊤ = (v(ϕ) → (v(ϕ) ∗ v(ψ))) → v(ψ) by (CAN^F), we have a ≤ v(ψ).
We leave the proofs for the other cases to the interested reader.

Now, we provide completeness results for Ls. A theory T is said to be linear if, for each pair ϕ, ψ of formulas, we have T ⊢ ϕ → ψ or T ⊢ ψ → ϕ. By an L-theory, we mean a theory T closed under the rules of L. By a regular L-theory, we mean an L-theory containing all of the theorems of L. Since we have no use for irregular theories, by an L-theory we henceforth mean an L-theory containing all of the theorems of L.
Let T be a linear L-theory. We define the canonical L frame determined by T as a structure X = (X_can, ⊤_can, ⊥_can, ≤_can, ∗_can), where ⊤_can = T, ⊥_can = {ϕ : T ⊢_L ⊥ → ϕ}, X_can is the set of linear L-theories extending ⊤_can, ≤_can is ⊇ restricted to X_can, i.e., a ≤_can b iff {ϕ : a ⊢_L ϕ} ⊇ {ϕ : b ⊢_L ϕ}, and ∗_can is defined as a ∗_can b := {ϕ&ψ : ϕ ∈ a, ψ ∈ b}, satisfying the integral commutative monoid properties corresponding to L frames on (X_can, ⊤_can, ≤_can). Notice that we construct the base ⊤_can as a linear L-theory that excludes nontheorems of L, i.e., excludes any formula ϕ such that ⊬_L ϕ. The linear order of the canonical L frame depends on ≤_can restricted to X_can.
First, we can easily show the following.

Proposition 3. A canonical L frame is connected and thus linearly ordered.

Proof. It is easy to show that a canonical L frame is partially ordered. We show that this frame is connected and so linearly ordered. Suppose toward a contradiction that neither a ≤_can b nor b ≤_can a. Then there are ϕ, ψ such that ϕ ∈ b but ϕ ∉ a, and ψ ∈ a but ψ ∉ b. Note that, since ⊤_can is a linear theory, ϕ → ψ ∈ ⊤_can or ψ → ϕ ∈ ⊤_can. Let ϕ → ψ ∈ ⊤_can and thus ϕ → ψ ∈ b. Then, by (mp), we have ψ ∈ b, a contradiction. The case where ψ → ϕ ∈ ⊤_can is analogous.

Next, let v_can be a canonical evaluation function from formulas to sets of formulas, i.e., v_can(ϕ) = {ϕ}. We define the canonical forcing relation as follows:

(a) a ⊩_can ϕ iff ϕ ∈ a.

This definition allows us to state the following lemmas.

Lemma 3. ⊤_can ⊩_can ϕ → ψ iff for each a ∈ X_can, if a ⊩_can ϕ, then a ⊩_can ψ.

Proof. By (a), we need to show that ϕ → ψ ∈ ⊤_can iff, for all a ∈ X_can, if ϕ ∈ a then ψ ∈ a. For the left-to-right direction, we assume ϕ → ψ ∈ ⊤_can and ϕ ∈ a, and show ψ ∈ a. The definition of ∗_can ensures (ϕ → ψ)&ϕ ∈ ⊤_can ∗_can a = a. Since L proves ((ϕ → ψ)&ϕ) → ψ, we have ((ϕ → ψ)&ϕ) → ψ ∈ ⊤_can and thus ((ϕ → ψ)&ϕ) → ψ ∈ a. Therefore, we obtain ψ ∈ a by (mp). We prove the other direction contrapositively. Suppose ϕ → ψ ∉ ⊤_can. We set a_0 = {Z : there exists X ∈ ⊤_can such that ⊤_can ⊢ (X&ϕ) → Z}. Clearly, a_0 ⊇ ⊤_can and ϕ ∈ a_0, but ψ ∉ a_0. (Otherwise, ⊤_can ⊢ (X&ϕ) → ψ and thus ⊤_can ⊢ X → (ϕ → ψ); therefore, since ⊤_can ⊢ X, by (mp) we would have ⊤_can ⊢ ϕ → ψ, a contradiction.) Then, by the Linear Extension Property of Theorem 12.9 in [2], we have a linear theory a ⊇ a_0 with ψ ∉ a; therefore, ϕ ∈ a but ψ ∉ a.

Lemma 4. (Canonical evaluation lemma) The canonical forcing relation ⊩_can is an evaluation.

Proof. We first consider the conditions for propositional variables.
For (AHC), we must show that, for each propositional variable p, if a ⊩_can p and b ≤_can a, then b ⊩_can p. Let a ⊩_can p and b ≤_can a. By (a), we have p ∈ a and a ⊆ b, and thus p ∈ b. Hence, by (a), we have b ⊩_can p.
For (min), we must show that, for each propositional variable p, ⊥_can ⊩_can p. By (a), we need to show that p ∈ ⊥_can. Since ⊥_can = {ϕ : T ⊢_L ⊥ → ϕ}, we have p ∈ ⊥_can.
We next consider the condition for the propositional constant ⊥.
For (⊥), we must show that a ⊩_can ⊥ iff a =_can ⊥_can. By (a), we need to show that ⊥ ∈ a iff a =_can ⊥_can. This is obvious since ⊥_can = {ϕ : T ⊢_L ⊥ → ϕ}.
Now we consider the conditions for arbitrary formulas.
For (∧), we must show that a ⊩_can ϕ ∧ ψ iff a ⊩_can ϕ and a ⊩_can ψ. By (a), we need to show that ϕ ∧ ψ ∈ a iff ϕ ∈ a and ψ ∈ a. The left-to-right direction follows from the axiom (∧-E) and the rule (mp); the right-to-left direction follows from the rule (adj).
For (∨), we must show that a ⊩_can ϕ ∨ ψ iff either a ⊩_can ϕ or a ⊩_can ψ. By (a), we need to show that ϕ ∨ ψ ∈ a iff either ϕ ∈ a or ψ ∈ a. The left-to-right direction follows from the fact that a is linear and linear theories are also prime theories; the right-to-left direction follows from the axiom (∨-I) and the rule (mp).
For (&), we must show that a ⊩_can ϕ&ψ iff there exist b, c ∈ X_can such that b ⊩_can ϕ, c ⊩_can ψ, and a ≤_can b ∗_can c. By (a), we need to show that ϕ&ψ ∈ a iff there exist b, c ∈ X_can such that ϕ ∈ b, ψ ∈ c, and a ≤_can b ∗_can c. For the right-to-left direction, we assume that there exist b, c ∈ X_can such that ϕ ∈ b, ψ ∈ c, and a ≤_can b ∗_can c. Then, using the definition of ∗_can, we obtain ϕ&ψ ∈ a. For the left-to-right direction, we argue contrapositively: we assume that, for all b, c ∈ X_can with ϕ ∈ b and ψ ∈ c, a ≤_can b ∗_can c does not hold, and we show ϕ&ψ ∉ a. Let ϕ ∈ b and ψ ∈ c. Since a ≤_can b ∗_can c does not hold, we obtain ϕ&ψ ∉ a.
For (→), we must show that a ⊩_can ϕ → ψ iff for each b ∈ X_can, if b ⊩_can ϕ, then a ∗_can b ⊩_can ψ. By (a), we need to show that ϕ → ψ ∈ a iff, for each b ∈ X_can, if ϕ ∈ b then ψ ∈ a ∗_can b. For the left-to-right direction, we assume ϕ → ψ ∈ a and ϕ ∈ b, and show ψ ∈ a ∗_can b. The definition of ∗_can ensures (ϕ → ψ)&ϕ ∈ a ∗_can b. Then, since L proves ((ϕ → ψ)&ϕ) → ψ, using Lemma 3 we obtain ψ ∈ a ∗_can b. We prove the right-to-left direction contrapositively. Suppose ϕ → ψ ∉ a. We need to construct a linear theory b such that ϕ ∈ b and ψ ∉ a ∗_can b. Let b_0 be the smallest regular L-theory extending ⊤_can with {ϕ} and satisfying a ∗_can b_0 = {Z : there is X ∈ a such that ⊤_can ⊢ (X&ϕ) → Z}. Clearly, ϕ ∈ b_0, but ψ ∉ a ∗_can b_0. (Otherwise, ⊤_can ⊢ (X&ϕ) → ψ and thus ⊤_can ⊢ X → (ϕ → ψ) for some X ∈ a; therefore, ϕ → ψ ∈ a, a contradiction.) Then, by the Linear Extension Property, we can obtain a linear theory b such that b_0 ⊆ b and a ∗_can b = {Z : there is X ∈ a such that ⊤_can ⊢ (X&ϕ) → Z}; therefore, ϕ ∈ b but ψ ∉ a ∗_can b.
We call a model for L an L model. Using Lemma 4, we can show that the canoni-
cally defined (Xcan, ⊩can) is an L model. Then, by construction, can excludes our chosen
nontheorem ϕ, and the canonical definition of ⊨ agrees with membership. Therefore, we
can say that, for each nontheorem ϕ of L, there exists an L model in which ϕ is not
valid, i.e., ⊭can ϕ. This gives us the following weak completeness of L.

Theorem 1. (Weak completeness) For any formula ϕ, if ϕ is valid in every L frame, then
⊢L ϕ.

Furthermore, using Lemma 4 and the Linear Extension Property, we can show the
strong completeness of L as follows.

Theorem 2. (Strong completeness) L is strongly complete w.r.t. the class of all L frames.

4. Concluding Remarks

We investigated set-theoretic Kripke-style semantics for some prominent t-norm (based)


logics. But we have not yet considered such semantics for fuzzy logics based on more
general structures. We will investigate this in a subsequent paper.

Acknowledgments: This work was supported by the Ministry of Education of the Repub-
lic of Korea and the National Research Foundation of Korea (NRF-2016S1A5A8018255).

References

[1] P. Cintula, R. Horčı́k and C. Noguera, Non-associative substructural logics and their semilinear exten-
sions: axiomatization and completeness properties, Review of Symbolic Logic 6 (2013), 394-423.
[2] P. Cintula, R. Horčı́k and C. Noguera, The quest for the basic fuzzy logic, in: Petr Hájek on Mathematical
Fuzzy Logic, F. Montagna, ed., Springer, Dordrecht, 2015, pp. 245-290.
[3] D. Diaconescu and G. Georgescu, On the forcing semantics for monoidal t-norm-based logic, Journal
of Universal Computer Science 13 (2007), 1550-1572.
[4] F. Esteva and L. Godo, Monoidal t-norm based logic: towards a logic for left-continuous t-norms, Fuzzy
Sets and Systems 124 (2001), 271-288.
[5] P. Hájek, Metamathematics of Fuzzy Logic, Kluwer, Amsterdam, 1998.
[6] F. Montagna and H. Ono, Kripke semantics, undecidability and standard completeness for Esteva and
Godo’s Logic MTL∀, Studia Logica 71 (2002), 227-245.
[7] F. Montagna and L. Sacchetti, Kripke-style semantics for many-valued logics, Mathematical Logic
Quarterly 49 (2003), 629-641.
[8] E. Yang, Algebraic Kripke-style semantics for relevance logics, Journal of Philosophical Logic 43
(2014), 803-826.
[9] E. Yang, Two kinds of (binary) Kripke-style semantics for three-valued logic, Logique et Analyse 231
(2015), 377-394.
[10] E. Yang, Algebraic Kripke-style semantics for substructural fuzzy logics, Korean Journal of Logic 19
(2016), 295-322.
Data Mining
Fuzzy Systems and Data Mining II 141
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-141

Dynamic Itemset Mining Under Multiple


Support Thresholds
Nourhan ABUZAYED and Belgin ERGENÇ1
Computer Engineering Department, Izmir Institute of Technology, Urla, Izmir, Turkey

Abstract. Handling the dynamic aspect of databases and the multiple support threshold
requirements of items are two important challenges for frequent itemset mining
algorithms. Existing dynamic itemset mining algorithms are devised for a single
support threshold, whereas multiple support threshold algorithms assume that the
databases are static. This paper focuses on the dynamic update problem of frequent
itemsets under MIS (Multiple Item Support) thresholds and introduces the Dynamic
MIS algorithm. It is i) tree based and scans the database once, ii) considers
multiple support thresholds, and iii) handles increments of additions, additions
with new items and deletions. The proposed algorithm is compared to CFP-Growth++
and the findings are that, on a dynamic database, 1) Dynamic MIS performs better than
CFP-Growth++ since it runs only on the increments and 2) Dynamic MIS can achieve
a speed-up of up to 56 times over CFP-Growth++.

Keywords. Association rule mining, itemset mining, dynamic itemset mining,


multiple support thresholds

Introduction

Recently, intensive research has focused on association rule mining, which is one of the
main functions of data mining [1]. Association rules were first introduced by Agrawal et
al. [2]; a rule states that X% of the customers who buy item A also buy item B, and is
denoted as A→B. Association rules are meant to find the impact of a set of items on
another set of items. The frequency of an itemset (items that co-occur in a transaction) is
referred to as its support count, which is the number of transactions that contain the
itemset. An itemset is frequent if its support count satisfies the minimum support
(minsup) threshold [3]. The confidence of an association rule X→Y is the ratio of the number of
transactions that contain X ∪ Y to the number of transactions that contain X [2 and 4].
Association rule mining has two main steps: 1) finding frequent itemsets/patterns and
2) generating association rules [5]. The first step is more expensive, and several
algorithms have been proposed to find the frequent itemsets from huge databases. The
most classical one is the Apriori algorithm that uses candidate generation and testing
approach. Other subsequent algorithms using Apriori-like technique were introduced in
[6-12]. FP-Growth [4] and Matrix Apriori [13 and 14] are more recent algorithms that
try to overcome the drawback of candidate generation and multiple scans of the
database.

1
Corresponding Author: Belgin ERGENÇ, Computer Engineering Department, Izmir Institute of
Technology, Urla, Izmir, Turkey; Email: belginergenc@iyte.edu.tr.

The major disadvantages of these itemset mining algorithms are 1) their
dependence on a single user-given minsup and 2) their inapplicability to dynamic
databases. A single support threshold is not enough to represent the characteristics of the items and
causes the rare item problem [15]. Some algorithms like MSapriori [16], CFP-Growth [17],
CFP-Growth++ [18] and MISFP-Growth [19] are introduced to find frequent patterns
under multiple support thresholds. For the second disadvantage, several algorithms
have been introduced in [20-27]. These algorithms perform faster and use less system
resources since they update frequent association rules by considering only the updates
instead of repeating all mining process from beginning.
All mentioned works handle either dynamic itemset mining with a single support
threshold or static itemset mining with multiple support thresholds. In this paper, the
Dynamic MIS (Multiple Item Support) algorithm, which provides a solution to the
problem of dynamic itemset mining under multiple support thresholds, is introduced. This
algorithm is tree based, scans the database only once, avoids the candidate generation
problem, and handles increments of additions, additions with new items, and deletions
by using a dynamic MIS-tree. The closest work to ours is the incremental support
tuning mechanism introduced in [28]. The proposed algorithm, Dynamic MIS, is compared to CFP-
Growth++ using four datasets. The findings reveal that, on dynamic databases, Dynamic
MIS performs better than CFP-Growth++ since it runs only on the increments, and the speed-up
gained by Dynamic MIS can reach up to 56 times with large sparse datasets.
The organization of this paper is as follows: Section 1 introduces the Dynamic MIS
algorithm with its builder and increment handling parts. Section 2 shows the performance
evaluation. Section 3 is dedicated to the concluding remarks.

1. Dynamic MIS Algorithm

Dynamic MIS algorithm provides a solution to the dynamic itemset mining under
multiple support thresholds problem by maintaining dynamic MIS-tree and two header
tables that keep the support counts of all items of the database. Frequent pattern
generation from the tree is done by related module of CFP-Growth++ algorithm [18].
Throughout this section, we use the following running example. Table 1 presents
a sample database D, and Table 2 illustrates the user-given multiple item support (MIS)
for each item in decreasing order, together with the items' actual support in the database D. In the
rightmost column of Table 1, the items of each transaction are ordered by the MIS values
given in Table 2.
Table 1. Transaction database D [17].
TID   Item bought      Item bought (ordered)
100   D, C, A, F       A, C, D, F
200   G, C, A, F, E    A, C, E, F, G
300   B, A, C, F, H    A, B, C, F, H
400   G, B, F          B, F, G
500   B, C             B, C

Table 2. MIS and actual support of each item in D [17].
Item         A   B   C   D   E   F   G   H
MIS (%)      80  80  80  60  60  40  40  40
Support (%)  60  60  80  20  20  80  40  20

1.1. Building MIS-tree

To build the MIS-tree, the MIS-tree builder algorithm illustrated in Figure 1 is used.
First, the MISsorted list is created from the MIS values in Table 2 and ordered in
decreasing order (Line 1); then the root node of the tree is created (Line 2). Primary and
secondary header tables are created (Line 3) as shown in Figure 2.
INPUT: Database D, Minimum item supports MIS
OUTPUT: MISsorted, MIS-tree

BEGIN
1 Build MISsorted list (in decreasing order)
2 Create the root of MIS-tree as null
3 Create primary and secondary header tables
4 Insert items into primary table (count=0)
5 Scan D
6 FOR each transaction T in D do:
7 Sort all items in T (as MISsorted)
8 Add T to the tree
9 END FOR
10 Calculate the support of items in D
11 Update the supports in the tables
12 Relocate items between header tables
END

Figure 1. MIS-tree builder algorithm. Figure 2. MIS-tree by MIS tree builder.


After that, the items are ordered as in MISsorted and inserted into the primary header table
with an item count of 0 (Line 4). Database D is scanned, and the transactions are added to
the tree (Lines 5-9). First, the items in each transaction are sorted in decreasing order
according to the MISsorted list, as in the rightmost column of Table 1. Then the transaction is
added to the tree; if the transaction shares a prefix with previous transactions, the counts of
these prefix nodes are incremented by one; otherwise, new nodes are created starting from
the root node with an item count equal to one. Each item's count in the node and in the header table
is incremented. Nodes of the same item are linked all through the tree and to the header
table. The supports of all items in D are calculated and then updated in the header tables.
Eventually, the items are placed in the two header tables: items with support
greater than the MIN MIS value (40%) are inserted into the primary header table, and the rest
are inserted into the secondary header table. Likewise, the node links are arranged.
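As a rough illustration of the insertion step (Lines 5-9), the following Python sketch shows how one MIS-sorted transaction could be added to a prefix tree while the counts and node links of a header table are maintained. The class, function, and dictionary names (Node, add_transaction, header) are our own illustrative choices, not part of the paper, and the sketch omits the secondary header table and the support recalculation.

class Node:
    """One node of the MIS-tree: an item, a count, children and a node link."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}      # item -> child Node
        self.next_link = None   # next node holding the same item (node link)

def add_transaction(root, transaction, header):
    """Insert one transaction already sorted by descending MIS value.

    header maps each item to {"count": int, "link": Node or None}.
    Shared prefixes are only counted up; new suffixes create new nodes.
    """
    node = root
    for item in transaction:
        child = node.children.get(item)
        if child is None:
            child = Node(item, node)
            node.children[item] = child
            # thread the new node into the item's node-link chain
            child.next_link = header[item]["link"]
            header[item]["link"] = child
        child.count += 1
        header[item]["count"] += 1
        node = child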
Table 3. The incremental database d.
TID   Items      Items (ordered)
1     C, B, H    B, C, H
2     G, B, F    B, F, G
3     C, D, H    C, D, H
INPUT: MIS-tree, MISsorted, increment d
OUTPUT:Dynamic MIS-tree

BEGIN
1 Scan d
2 FOR each transaction T in d do:
3 Sort items in T (like MISsorted )
4 Add T to the tree
5 END FOR
6 Calculate the support of items
7 Update the supports in the tables
8 Relocate items between header tables
END

Figure 3. Update process in Dynamic MIS for additions. Figure 4. Dynamic MIS-tree after adding d.

1.2. Adding Increments of Additions

The pseudo code of the update process for additions is given in Figure 3. When new
transactions (Table 3) arrive, they are scanned and added to the tree (Lines 2-5). First,
the items in each new transaction are sorted in descending order of the MISsorted list; then the
transactions are added to the tree one by one, as seen in Figure 4. Each item's count in
the transaction is incremented in the primary table. Then, the nodes of the same item are
linked all through the tree and to the header tables of the same figure. The supports of the
items are calculated and then updated in the header tables (Lines 6-7). Lastly, items are
relocated between the header tables by comparing each item's support with the MIN MIS value
(Line 8). The items A and G are transferred from the primary to the secondary header table
because their supports become less than the MIN MIS value (40%).

1.3. Adding Increments of Additions with New Items

The pseudo code of the update process for additions with new items is given in Figure
5. Let us explain this process using the MIS-tree shown in Figure 2, the incremental
database (with new items J, K, L) given in Table 5, and the MIS values of the new items given
in Table 4. The first step is combining the new MIS values in Table 4 with the MIS values
of the old items in Table 2 to obtain Table 6.
Table 4. MIS values for new items in d.
Item       J    K    L
MIS value  70%  35%  30%

Table 5. The incremental database d with new items J, K, L.
TID   Item bought          Item bought (ordered)
1     C, B, K, J, H, L     B, C, J, H, K, L
2     K, H                 H, K
3     K, B, C              B, C, K

Table 6. MIS values of all items.
Item     A   B   C   J   D   E   F   G   H   K   L
MIS (%)  80  80  80  70  60  60  40  40  40  35  30

When new items appear, MISsorted is updated by adding the new MIS values in
descending order (Line 1). After that, the new items in MISnew are appended to the
primary header table with an item count of 0 (Line 2). These two lines are the main
difference between additions and additions with new items.

INPUT: MIS-tree, MISsorted,increment d, MISnew


OUTPUT: MISsorted, Dynamic MIS-tree

BEGIN
1 Build MISsorted (MISsorted + MISnew)
2 Insert new items into primary header
table (count=0)
3 Scan d
4 FOR each transaction T in d do:
5 Sort items in T (like MISsorted )
6 Add T to the tree
7 END FOR
8 Calculate the support of all items
9 Update the supports in the tables
10 Relocate items between header tables
END

Figure 5. Dynamic MIS for additions with new items. Figure 6. Dynamic MIS-tree after adding d.
At the end, some items are transferred between the two header tables. Here, item
G is transferred from primary to secondary because its new support (25%) is less than
the new MIN MIS value (30%), and item H is transferred from secondary to primary
because its new support (37%) exceeds it. Figure 6 presents the MIS-tree after adding the
three new transactions.

1.4. Adding Increments of Deletions

Let us explain the pseudo code of the update process for deletions, which is shown in
Figure 7, by using the increment of deletions shown in Table 7. This example is applied
to the tree of Figure 2. The transactions in d are scanned and then deleted from
the tree, as seen in Figure 8. Some items' counts are decremented. The corresponding
supports are recalculated and updated in the header tables. According to the new
supports, some items are relocated between the header tables. In this example, the support
of item G becomes 33.3%, which is less than the MIN MIS value (40%), so it is moved into
the secondary header table.

Table 7. Transactional database d with deletions.
TID   Item bought   Item bought (ordered)
100   D, C, A, F    A, C, D, F
400   B, F, G       B, F, G

INPUT: MIS-tree, MISsorted, increment d


OUTPUT: Dynamic MIS-tree

BEGIN
1 Scan d
2 FOR each transaction T in d do:
3 Sort items in T (like MISsorted)
4 Delete T from the tree
5 END FOR
6 Calculate the support of all items
7 Update the supports
8 Relocate items between header tables
END

Figure 7. Update process in Dynamic MIS for deletions. Figure 8. Dynamic MIS-tree after deletions.

The nodes whose count is decremented to zero are deleted from the tree, but their
records are kept in the corresponding header table. The resulting Dynamic MIS-tree is illustrated in
Figure 8.

2. Performance Evaluation

Dynamic MIS is compared with the popular tree-based algorithm CFP-Growth++ [18].
Several experiments are executed on 4 datasets with different properties (T: average
size of the transactions, D: number of transactions, N: number of items), as shown in
Table 8. D1 and D4 are real datasets; D2 and D3 are synthetic. The density of a dataset
indicates the similarity of its transactions. D3 is generated to be used only in the
experiment on additions with new items.
Table 8. Properties of datasets.
Dataset            Type       T     D       N      Density (%)
D1 (Retail)        Real       10.3  88162   16470  0.06
D2 (T40I1D100K)    Synthetic  40    100K    942    4.25
D3                 Synthetic  1.1   100K    5356   0.02
D4 (Kosarak)       Real       8.1   990002  41270  0.02

2 Density (%) = (Average Transaction Length / # of Distinct Items) × 100



All experiments are executed on an Intel(R) Core i7-5500U CPU @ 2.40 GHz
with 8 GB main memory and the Microsoft Windows 10 operating system. All programs
are implemented in the C# environment.
For our experiments, we use two formulas [12] to assign MIS values to items in
the datasets:

M(i) = β f(i)   and   MIS(i) = M(i) if M(i) > LS, and MIS(i) = LS otherwise.

Here f(i) is the actual frequency of item i in the data, LS is the user-specified lowest
minimum item support allowed, and β (0 ≤ β ≤ 1) is a parameter that controls how the
MIS values of items are related to their frequencies. If β = 0, we have only one
minimum support, LS, which is the same as traditional association rule mining. If β
= 1 and f(i) ≥ LS, then f(i) is the MIS value for i [16]. This formula is used to generate MIS
values for algorithms that use multiple support thresholds, as in [16, 17, 18 and 28].
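A minimal Python sketch of this MIS assignment is given below; the function name and the example values are illustrative only (the LS and β values actually used in the experiments are reported in the following subsections).

def assign_mis(frequencies, beta, ls):
    """MIS(i) = beta * f(i) if that value exceeds LS, otherwise LS."""
    return {item: max(beta * f, ls) for item, f in frequencies.items()}

# With beta = 0.5 and LS = 0.3, and the actual supports of items A, C and D from
# Table 2 taken as fractions, this yields MIS(A) = 0.3, MIS(C) = 0.4, MIS(D) = 0.3.
example = assign_mis({"A": 0.6, "C": 0.8, "D": 0.2}, beta=0.5, ls=0.3)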

2.1. Complexity analysis of algorithms

The computational complexity of building the initial tree is the same for both algorithms: it is
O(T × V), where T is the number of transactions and V the average transaction length.
The complexity of the pruning procedure in CFP-Growth++ is O(N × C), where N is
the number of nodes holding the items to be pruned and C is the number of their children.
The merging procedure in CFP-Growth++ is O(N² × K), where N is the number of nodes
in the tree and K is the number of node links. In Dynamic MIS, however, the pruning and merging
procedures are replaced by the procedure that relocates items between the header tables, which has
a linear complexity of O(N), where N is the number of items to be transferred. The
complexity of adding increments to the tree is O(T × V), where T is the number of
incremental transactions and V the average transaction length.
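The relocation step that replaces pruning and merging can be pictured with the short sketch below; the dictionaries and the function name are hypothetical, and rearranging the node links, which the algorithm also performs, is omitted.

def relocate_items(primary, secondary, support, min_mis):
    """Move header entries between the primary and secondary tables in O(N)."""
    for item in list(primary):
        if support[item] < min_mis:       # dropped below MIN MIS: demote
            secondary[item] = primary.pop(item)
    for item in list(secondary):
        if support[item] >= min_mis:      # reached MIN MIS: promote
            primary[item] = secondary.pop(item)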

2.2. Execution time with additions

The execution time of the Dynamic MIS and CFP-Growth++ algorithms on increments of
additions is measured by dividing the dataset into two parts. The part with D = (100 -
x)% forms the initial dataset, and the remaining part with d = x% of the transactions
forms the increments. MIS values are kept constant. D1 has thirteen splits of 1% - 13%,
D2 has ten splits of 5% to 50% and D4 has eighteen splits of 5% - 90%.

Figure 9. Speed-up on Retail with additions. Figure 10. Speed-up with additions
The speed-up by running Dynamic MIS instead of re-running CFP-Growth++
when the database is updated is shown in Figure 9 and Figure 10. Speed-up of Dynamic
MIS is from 22.21 to 55.94 on D1 (Figure 9), from 1.56 to 1.35 on D2, and from 37.67
to 5.69 on D4, respectively, as seen in Figure 10. The reasons behind these speed-ups are
1) Dynamic MIS runs only on the increment whereas CFP-Growth++ runs from the
beginning, 2) Dynamic MIS generates frequent patterns from the items of primary
header table only whereas CFP-Growth++ requires pruning and merging of MIS-tree.

2.3. Execution time with additions of new items

Execution time performance of increments of additions with new items is measured on


D3 that is generated by IBM_Quest_data_generator [29] to be able to control the
existence of new items that do not exist in the original dataset. Eighteen split sizes in
the range of 5% - 90% are used. LS and β are constant as 0.01 and 0.5 respectively.
The number of new items in each split is constant and equal to 100. As shown in
Figure 11, the speed-up decreases from 5.76 to 1.72 as the split size increases.

Figure 11. Speed-up with additions with new items. Figure 12. Speed-up with deletions.

2.4. Execution time with deletions

The last comparison determines how the size of deletions affects the performance
of the algorithm. Each split contains 20% of the transactions of the original dataset. MIS
values are kept constant. The speed-up obtained by running Dynamic MIS instead of re-running
CFP-Growth++ when the database is updated with deletions can be seen in Figure 12.
The speed-up increases from 2.26 to 44.88 on D1, from 2.06 to 40.16 on D4, and from
1.12 to 1.25 on D2 as the split size decreases.

3. Conclusion

The single support threshold constraint and the dynamic aspect of databases bring additional
challenges to frequent itemset mining algorithms. The Dynamic MIS algorithm is proposed as a
solution to the dynamic update problem of frequent itemset mining under multiple support
thresholds. It is tree based, handles increments of additions, additions with new
items, and deletions, and is faster especially with large sparse databases.

Acknowledgements

This work is partially supported by the Scientific and Technological Research Council
of Turkey (TUBITAK) under ARDEB 3501 Project No: 114E779

References

[1] M. Chen, J. Han, P. S. Yu, Data mining: An overview from a database perspective. IEEE Transaction
on knowledge and Data Engineering, 8(1996), 866–883.

[2] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases,
In: ACM SIGMOD International conference on Management of data, USA (1993), 207–216.
[3] J. Han, M. Kamber, J. Pei, Data mining concepts and techniques, Morgan Kaufmann Publishers,
(2006), 157–218.
[4] J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation, In: ACM SIGMOD
International Conference on Management of Data, ACM New York, USA (2000), 1–12.
[5] R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases, In: The 20th
International Conference on Very Large Data Bases, San Francisco, CA, USA (1994), 487–499.
[6] H. Mannila, H. Toivonen, A.I. Verkamo, Efficient algorithms for discovering association rules, In:
AAAI Workshop on KDD, Seattle, WA, USA (1994), 181–192.
[7] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, A.I. Verkamo, Fast discovery of association rules, In
Advances in KDD. MIT Press, 12(1996), 307–328.
[8] A. Savasere, E. Omiecinski, S.B. Navathe, An efficient algorithm for mining association rules in large
databases, In: The 21st VLDB Conference, Zurich, Switzerland (1995), 432–443.
[9] J.S. Park, M. Chen, P.S. Yu, An effective hash-based algorithm for mining association rules, In: ACM
SIGMOD International Conference on Management of Data, San Jose, CA, USA (1995), 175–186.
[10] R. Srikant, Q. Vu, R. Agrawal, Mining association rules with item constraints, In: ACM KDD
International Conference, Newport Beach, CA, USA (1997), 67–73.
[11] R.T. Ng, L.V.S. Lakshmanan, J. Han, A. Pang, Exploratory mining and pruning optimizations of
constrained associations rules, In: ACM-SIGMOD Management of Data, USA (1998), 13–24.
[12] G. Grahne, L. Lakshmanan, X. Wang, Efficient mining of constrained correlated sets, In: The 16th
International Conference on Data Engineering, San Diego, CA, USA (2000), 512–521.
[13] J. Pavon, S. Viana, S. Gomez, Matrix Apriori: Speeding up the search for frequent patterns, In: The
24th IASTED International Conference on Database and Applications, Austria (2006), 75–82.
[14] B. Yıldız, B. Ergenç, Comparison of two association rule mining algorithms without candidate
generation, In: The 10th IASTED International Conference on Artificial Intelligence and Applications,
Innsbruck, Austria (2010), 450–457.
[15] H. Mannila, Database methods for data mining, Tutorial for the 4th ACM SIGKDD International
Conference on KDD, New York, USA (1998).
[16] B. Liu, W. Hsu, Y. Ma, Mining association rules with multiple minimum supports, In: The 5th ACM
SIGKDD International Conference on KDD, San Diego, CA, USA (1999), 337–341.
[17] Y. Hu, Y. Chen, Mining association rules with multiple minimum supports: a new mining algorithm
and a support tuning mechanism, Decision Support Systems, 42(2006), 1–24.
[18] R.U. Kiran, P.K. Reddy, Novel techniques to reduce search space in multiple minimum supports-based
frequent pattern mining algorithms, In: The 14th International Conference on Extending Database
Technology, ACM, New York, USA, (2011), 11–20.
[19] S. Darrab, B. Ergenç, Frequent pattern mining under multiple support thresholds, In: The 16th Applied
Computer Science Conference, WSEAS Transactions on Computer Research, Turkey, 4(2016), 1–10.
[20] D.W. Cheung, J. Han, V.T. Ng, C.Y. Wong, Maintenance of discovered association rules in large
databases, An incremental updating technique, In: The 12th IEEE International Conference on Data
Engineering, New Orleans, Louisiana, USA, (1996), 106–114.
[21] D.W. Cheung, S.D. Lee, B. Kao, A general incremental technique for maintaining discovered
association rules, In: The 5th International Conference on Database Systems for Advanced
Applications, Melbourne, Australia, (1997), 185–194.
[22] D. Oğuz, B. Ergenç, Incremental itemset mining based on Matrix Apriori, DEXA-DaWaK, Vienna,
Austria, (2012), 192–204.
[23] D. Oğuz, B. Yıldız, B. Ergenç, Matrix based dynamic itemset mining algorithm, International Journal
of Data Warehousing and Mining, 9(2013), 62–75.
[24] Y. Aumann, R. Feldman, O. Lipshtat, H. Manilla, Borders: An efficient algorithm for association
generation in dynamic databases, Journal of Intelligent Information System, 12(1999), 61–73.
[25] S. Shan, X. Wang, M. Sui, Mining Association Rules: A continuous incremental updating technique,
In: International Conference on WISM, IEEE Computer Society, Sanya, China (2010), 62–66.
[26] B. Dai, P. Lin, iTM: An efficient algorithm for frequent pattern mining in the incremental database
without rescanning, In: The 22nd International Conference on Industrial, Engineering and Other
Applications of Applied Intelligent Systems, Tainan, Taiwan (2009), 757–766.
[27] W. Cheung, O.R. Zaiane, Incremental mining of frequent patterns without candidate generation or
support constraint, In: IDEAS, Hong Kong, China (2003), 111–116.
[28] F.A. Hoque, M. Debnath, N. Easmin, K. Rashad, Frequent pattern mining for multiple minimum
supports with support tuning and tree maintenance on incremental database, Research Journal of
Information Technology, 3(2011), 79–90.
[29] Frequent Itemset Mining Implementations Repository, http://fimi.ua.ac.be/data/
Fuzzy Systems and Data Mining II 149
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-149

Deep Learning with Large Scale Dataset


for Credit Card Data Analysis
Ayahiko NIIMI 1
Faculty of Systems Information Science, Future University Hakodate,
2-116 Kamedanakano, Hakodate,
Hokkaido 041-8655, Japan

Abstract. In this study, two major applications are introduced to develop advanced
deep learning methods for credit card data analysis. Credit card information is con-
tained in two datasets: a credit approval dataset and a card transaction dataset. These
datasets pose two problems. One problem is that, when using the credit approval
dataset, it is necessary to combine multiple models, each referring to a different
clustered group of users. The other problem is that, when using the card transaction dataset,
the actual unauthorized credit card use is very small, so imprecise solutions do
not allow the appropriate detection of fraud. To address these problems, we propose
deep learning algorithms applied to the credit card datasets. The proposed methods are val-
idated using benchmark experiments against other machine learning methods. To evaluate our
proposed method, we use two credit card datasets: the credit approval dataset from the UCI
Machine Learning Repository and a randomly constructed credit transaction dataset.
The experiments confirm that deep learning exhibits accuracy comparable to the
Gaussian kernel support vector machine (SVM). The proposed methods are also
validated using a large-scale transaction dataset. Moreover, we apply our proposed
method to a time-series benchmark dataset. Deep learning parameter adjustment
is difficult; by optimizing the parameters, it is possible to increase the learning
accuracy.
Keywords. Data Mining, Deep Learning, Credit Approval Dataset, Card Transaction
Dataset

Introduction

Deep learning is a state-of-the-art research topic in the machine learning field with ap-
plications for solving various problems [1, 2]. This paper investigates the application of
deep learning in credit card data analysis.
Credit card data are mainly used in user and transaction judgments. User judgment
determines whether a credit card should be issued to the user satisfying particular criteria.
On the other hand, transaction judgment refers to whether the validity of a transaction is
correct [3]. We determined the deep learning processes required for solving each of these
problems, and we proposed appropriate methods for deep learning [4, 5].
To verify our proposed methods, we use benchmark experiments with other machine
learning methods, which confirm that the accuracy of the deep learning methods is similar to that of
1 Corresponding Author: Ayahiko Niimi, Faculty of Systems Information Science, Future University

Hakodate, 2-116 Kamedanakano, Hakodate, Hokkaido 041-8655, Japan; E-mail: niimi@fun.ac.jp.



the Gaussian Kernel SVM. In the final section of this paper, we provide suggestions for
future deep learning experiments.
We only used a small-scale transaction dataset for the evaluation experiment, and did
not use a large-scale dataset [6]. In this paper, the proposed methods are also validated
using a large-scale transaction dataset. Moreover, we apply our proposed method to a
time-series benchmark dataset.
First, in Section 1, we introduce the characteristics of the credit card datasets.
Then, in Section 2, we introduce deep learning. In Section 3, we discuss
the data processing infrastructure suitable for the analysis of credit card data. In Sec-
tion 4, we describe the experiments, and the results are shown in Section 5. We discuss
the results in Section 6. Finally, in Section 7, we describe the conclusions and future
work.

1. Credit Card Data Set

The datasets of credit card are as follows:


1. credit approval dataset
2. card transaction dataset

1.1. Credit approval dataset

For each user submitting a credit card creation application, there is a record of the deci-
sion to issue the card or to reject the application. This is based on the user’s attributes, in
accordance with the general usage-trend models.
However, to reach this decision, it is necessary to combine multiple models, each
referring to a different clustered group of users.

1.2. Credit card transaction data

In actual credit card transactions, the data is complex, constantly changing, and continuously
arrives online, as follows:

(i) Approximately one million transactions arrive per day.
(ii) Each transaction takes less than one second to complete.
(iii) Approximately one hundred transactions arrive per second during peak time.
(iv) Transaction data arrive continuously.

Therefore, credit card transaction data can be precisely called a data stream. How-
ever, even if we use data mining for such data, an operator can monitor around only 2,000
transactions per day. Therefore, we have to detect suspicious transaction data effectively
by analyzing less than 0.02% of the total number of transactions. In addition, the rate of
fraud detection from analyzing massive amounts of transaction data is extremely low, because
real fraud occurs at an extremely low rate, i.e., within 0.02% to 0.05% of all of the
transaction data.
In a previous paper, transaction data in CSV format were described as attributes in
time order [3]. Credit card transaction data have 124 attributes; 84 of them are called transactional
data, including an attribute used to discriminate whether the data refers to fraud; the
others are called behavioral data, and they refer to the credit card usage. The inflow file
size is approximately 700 MB per month.
Mining the credit card transaction data stream presents inherent difficulties, since it
requires performing efficient calculations on an unlimited data stream with limited com-
puting resources. Therefore, many stream mining methods seek an approximate or prob-
abilistic solution instead of an exact one. However, since the actual unauthorized credit
card use is very small, these imprecise solutions do not allow the appropriate detection
of fraud.

2. Deep Learning

Deep learning is a new technology that has recently attracted much attention in the field of
machine learning. It significantly improves the accuracy of abstract representations by
reconstructing deep structures such as the neural circuitry of the human brain. Deep learn-
ing algorithms have been honored in various competitions, such as those of the International
Conference on Representation Learning.
Deep learning is a generic term for multilayer neural networks, which have been re-
searched for a long time [1, 2, 7]. Multilayer neural networks decrease the overall calcu-
lation time by performing calculations on hidden layers. However, they were prone to exces-
sive overtraining, as an intermediate layer was often used for approximately every single
layer.
Technological advances have since suppressed overtraining, whereas GPU uti-
lization and parallel processing have increased the number of hidden layers.
A sigmoid or a tanh function was commonly used as an activation function (see
Equations 1 and 2), although recently a maxout function has also been used (Section 2.1).
The dropout technique is employed to prevent overtraining (Section 2.2).

h_i(x) = sigmoid(x^T W_{·i} + b_i)    (1)

h_i(x) = tanh(x^T W_{·i} + b_i)    (2)

2.1. Maxout

The maxout model is simply a feed-forward architecture such as a multilayer perceptron


or deep convolutional neural network that uses a new type of activation function, the
maxout unit [2].
In particular, given an input x ∈ R^d (x may be v, or it may be a hidden layer's
state), a maxout hidden layer implements the function

h_i(x) = max_{j∈[1,k]} z_{ij}    (3)

where z_{ij} = x^T W_{·ij} + b_{ij}, and W ∈ R^{d×m×k} and b ∈ R^{m×k} are learned parameters. In
a convolutional network, a maxout feature map can be constructed by taking the maxi-
mum across k affine feature maps (i.e., pool across channels, in addition to spatial loca-
tions). When training with dropout, we perform the element-wise multiplication with the
dropout mask immediately prior to the multiplication by weights, in all cases; inputs are
not dropped to the max operator. A single maxout unit can be interpreted as a piecewise
linear approximation of an arbitrary convex function. Maxout networks learn not just the
relationship between hidden units, but also the activation function of each hidden unit.
Maxout abandons many of the mainstays of traditional activation function design.
The representation it produces is not sparse at all, though the gradient is highly sparse,
and the dropout will artificially sparsify the effective representation during training. Al-
though maxout may learn to saturate on one side or another, this is a measure zero event
(so it is almost never bounded from above). Since a significant proportion of parame-
ter space corresponds to the function delimited from below, maxout learning is not con-
strained at all. Maxout is locally linear almost everywhere, whereas many popular acti-
vation functions have significant curvature. Given all of these deviations from standard
practice, it may seem surprising that maxout activation functions work at all, but we find
that they are very robust, easy to train with dropout, and achieve excellent performance.

h_i(x) = max_{j∈[1,k]} z_{ij}    (4)

z_{ij} = x^T W_{·ij} + b_{ij}    (5)
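As an illustration of Eqs. (3)-(5), the following NumPy sketch evaluates one maxout hidden layer; the array shapes follow the definitions above, while the function name and the random example values are ours, not taken from the paper.

import numpy as np

def maxout_layer(x, W, b):
    """h_i(x) = max_j (x^T W_{.ij} + b_{ij}) for an input x of shape (d,),
    weights W of shape (d, m, k) and biases b of shape (m, k)."""
    z = np.einsum("d,dmk->mk", x, W) + b   # affine pieces z_ij
    return z.max(axis=1)                   # maximum over the k pieces

# Example: d = 4 inputs, m = 3 maxout units, k = 2 linear pieces each.
rng = np.random.default_rng(0)
h = maxout_layer(rng.normal(size=4), rng.normal(size=(4, 3, 2)), np.zeros((3, 2)))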

2.2. Dropout

Dropout is a technique that can be applied to deterministic feedforward architectures that


predict an output y given an input vector v [2].
In particular, these architectures contain a series of hidden layers h= {h(1), . . . , h(L)}.
Dropout trains an ensemble of models consisting of a subset of the variables in both v
and h. The same set of parameters θ is used to parameterize a family of distributions
p(y|v; θ, μ), where μ ∈ M is a binary mask determining which variables to include
in the model. On each example, we train a different submodel by following the gra-
dient of log p(y|v; θ, μ) for a different randomly sampled μ. For many parameterizations
of p (usually for multilayer perceptrons) the instantiation of the different submodels
p(y|v; θ, μ) can be obtained by elementwise multiplication of v and h with the mask μ.
The functional form becomes important when the ensemble makes a prediction by
averaging together all the submodels’ predictions. Previous studies on bagging averages
used the arithmetic mean. However, this is not possible with the exponentially many
models trained by dropout. Fortunately, some models easily yield a geometric mean.
When p(y|v; θ) = softmax(v^T W + b), the predictive distribution defined by renormalizing
the geometric mean of p(y|v; θ, μ) over M is simply given by softmax(v^T W/2 + b).
In other words, the average exponential prediction for many submodels can be computed
simply by running the full model with the weights divided by two. This result holds
exactly in the case of a single layer softmax model. Previous work on dropout applies
the same scheme in deeper architectures, such as multilayer perceptrons, where the W/2
method is only an approximation of the geometric mean. This approximation was not
characterized mathematically, but performed well in practice.
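The weight-halving approximation described above can be sketched for a single-layer softmax model as follows; the function names and shapes are illustrative and are not taken from the paper.

import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def dropout_forward(v, W, b, keep_prob=0.5, rng=np.random.default_rng()):
    """One stochastic forward pass of a sub-model: mask the inputs at random."""
    mu = rng.random(v.shape) < keep_prob
    return softmax((v * mu) @ W + b)

def dropout_predict(v, W, b):
    """Approximate geometric-mean prediction: full input, weights divided by two."""
    return softmax(v @ (W / 2) + b)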

3. Data Analysis Platform

In this section, we consider the data processing infrastructure that is suitable for analysis
of credit card data, as well as the applications of deep learning to credit card data analysis.

3.1. R

R is a language and environment for statistical computing and graphics [8]. It is a GNU
project similar to the S language and environment which was developed at Bell Labora-
tories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R
can be considered as a different implementation of S. There are some important differ-
ences, but much code written for S runs unaltered under R. R is available as free soft-
ware and is widely used. It includes many useful libraries, such as those for multivariate analysis
and machine learning, and it is suitable for data mining.
However, R performs processing in memory; therefore, it is not suitable for processing large
amounts of data.

3.2. Google BigQuery, Amazon Redshift

Google BigQuery [9] and Amazon Redshift [10] are systems corresponding to the in-
quiries using large amounts of data. These cloud systems can easily store a large amount
of data and processing it at high speed. Therefore, we can use them to analyze data trends
interactively. However, data processing, such as machine learning, needs to be further
developed.

3.3. Apache Hadoop

Apache Hadoop is a platform for handling large amount of data as well [11]. Apache
Hadoop divides the process into mapping and reducing, which operate in parallel; the Map
processes data, whereas the Reduce summarizes the results. In combination, these pro-
cesses realize high-speed processing of large amounts of data. However, since process-
ing is performed in batches the Map/Reduce cycle can be completed before all data are
stored. It is difficult to apply separate algorithms for Map/Reduce different batches. In
particular, it is difficult to implement the algorithm repeatedly for the same data, as is
required in machine learning.

3.4. Apache Storm

Apache Storm is designed to process a data stream [12]. For incessantly flowing data,
data conversion is executed. The data source is called the Spout, and the part that per-
forms the conversion process is called the Bolt. Apache Storm is a model that performs
processing by combining Bolts fed from a Spout.

3.5. Apache Spark

Apache Spark is also a platform that processes large amounts of data [13]. Apache Spark
has generalized the Map/Reduce processing. It processes data by caching it in working memory,
and it is designed to execute efficient iterative algorithms by maintaining shared data,
which is used for repeated processing, in memory. In addition, a library of machine learning
and graph algorithms is provided, which makes it easy to build an environment for
stream data mining.
H2O is a deep learning library for Spark [14, 15].
SparkR is an R package that provides a lightweight frontend for Apache Spark
from R [16]. In Spark 1.5.0, SparkR provides a distributed data frame implementation that
supports operations such as selection, filtering and aggregation, similar to R data frames
and dplyr, but on large datasets. SparkR also supports distributed machine learning using
MLlib.
In the present paper, we perform credit card data analysis using R and Spark. This
makes it possible to use R's extensive libraries while gaining high performance from
Spark's parallel and distributed processing.

4. Experiments

We used the credit approval dataset from the UCI Machine Learning Repository to evaluate the
experimental results [4].
All attribute names and values were reassigned to meaningless symbols to protect
data confidentiality.
In addition, the original dataset contains missing values. In the experiment, we use
a pre-processed dataset [17], as presented in Table 1.

Table 1. UCI Dataset(Credit Approval Dataset)


Number of Instances: 690
Number of Attributes: 15 + class attribute
Class Distribution: +: 307 (44.5%), -: 383 (55.5%)
Number of Instances for Training: 590
Number of Instances for Test: 100

For deep learning, we use the H2O library for R [14, 15]. H2O is a library for Hadoop and
Spark, but it also has an R package.
For comparison, we also use five typical machine learning algorithms. In addition,
the deep learning parameters (activation functions and the dropout parameter) are changed
five times. In this experiment, the hidden layer neurons are set to (100, 100, 200) for deep
learning. The parameters used are shown in Table 2.
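The experiments use H2O's R interface; purely as an illustration of the kind of call involved, the sketch below uses H2O's Python estimator with the hidden-layer sizes mentioned above. The file name, the "class" target column, the split ratio, and the epoch count are placeholders and are not taken from the paper.

import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()
frame = h2o.import_file("credit_approval.csv")        # placeholder path
frame["class"] = frame["class"].asfactor()            # placeholder target column
train, test = frame.split_frame(ratios=[0.85], seed=42)

model = H2ODeepLearningEstimator(hidden=[100, 100, 200],
                                 activation="maxout_with_dropout",
                                 epochs=10)            # illustrative epoch count
model.train(x=frame.columns[:-1], y="class", training_frame=train)
print(model.model_performance(test))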
XGBoost is an optimized general purpose gradient boosting library [18]. The li-
brary is parallelized and provides an optimized distributed version. It implements ma-
chine learning algorithms under the gradient boosting framework, including a general-
ized linear model and gradient boosted decision trees. XGBoost can also be distributed
and scaled to Terascale data.
The activation functions used here are summarized in Table 3 [15].
Moreover, to ascertain whether there is a bias in the results of the training data
and the test data, we perform 10-fold cross-validation using the entire dataset. In this
experiment, the hidden layer neurons are set to (200, 200, 200).
In the experiment, we use the following environment.

Table 2. Comparison Algorithms


Deep learning Rectifier with Dropout
Rectifier
Tanh with Dropout
Tanh
Maxout with Dropout
Maxout
Logistic Regression
Support Vector Machine Gaussian Radial Basis Function
Linear SVM
Random Forest
XGBoost

Table 3. Activation functions


Function                 Formula
Tanh                     f(α) = (e^α − e^{−α}) / (e^α + e^{−α})
Rectified Linear         f(α) = max(0, α)
Maxout                   f(·) = max(w_i x_i + b), rescale if max f(·) ≥ 1
Tanh with Dropout        Tanh with dropout
Rectifier with Dropout   Rectified Linear with dropout
Maxout with Dropout      Maxout with dropout

• AWS EC2 t2.micro


• CPU Intel Xeon 3.3 GHz
• Memory 1GB
• Disk 8GB SSD
• R version 3.2.2
• H2O version 3.0.0.30

4.1. Large-Scale Dataset

In this paper, the proposed methods are also validated using a large-scale transaction
dataset. We made a dataset from the actual card transaction dataset that contains the
same number of attributes (130 attributes), with the value of each attribute generated at random
within the same range. The dataset has about 300,000 transactions, which include about
3,000 illegal usages. We made a dataset with six months of data for the experiment. Because
this dataset has random values, it cannot be used to evaluate accuracy. We used this dataset
to estimate machine specifications and calculation times.
The percentage of fraud is too low in the dataset. We used all illegal usages
(approximately 3,000) and sampled normal usages (approximately 3,000) in the exper-
iment.
We used an Amazon EC2 r3.8xlarge (32 cores, 244 GB memory) instance for the experiment. As
a preliminary experiment, deep learning parameters of hidden layer neurons (100, 100,
200) and epochs (200) were used, but the learning did not converge. Therefore, in the
experiment, deep learning parameters of hidden layer neurons (2048, 2048, 4096),
epochs (2000), and hidden dropout ratios (0.75, 0.75, 0.7) were used. "Maxout
with Dropout" is used as the activation function.
The experimental results are currently being analyzed.

4.2. Benchmark Dataset for Time Series Data

To further evaluate the proposed method, we use public time-series benchmark data.
We used the gas sensor dataset from the UCI Machine Learning Repository [4, 19, 20].
We are going to apply it in experiments, tune the parameters, and analyze the obtained
results.

5. Experimental Results

5.1. Comparison of Algorithms Using the UCI Dataset

Table 4 shows the experimental results. We run each algorithm five times, and Table
4 presents the average. Because the machine learning algorithms that we used have no
dependence on initial values, the results of those algorithms are the same in all five runs.

Table 4. Result of UCI Dataset


Algorithm Error Rate
Rectifier with Dropout (Deep Learning) 18.4
Rectifier (Deep Learning) 18.8
Tanh with Dropout (Deep Learning) 17.6
Tanh (Deep Learning) 22.8
Maxout with Dropout (Deep Learning) 12.4
Maxout (Deep Learning) 16.2
Logistic Regression 18.0
Gaussian Kernel SVM 11.0
Linear Kernel SVM 14.0
Random Forest 14.0
XGBoost 14.0

The deep learning results depend on the initial parameters. Deep learning with Maxout
with Dropout produces an accuracy close to that of the Gaussian kernel SVM.

5.2. Deep Learning: 10-Fold Cross-Validation

Table 5 shows the results of the 10-fold cross-validation. N and Y are class attributes.
Stable results are obtained regardless of the dataset split.

Table 5. Result of Deep Learning (10-fold cross-validation)


N Y Error Rate
Totals 332 287 0.138934 =86/619
Totals 337 280 0.132901 =82/617
Totals 333 277 0.136066 =83/610
Totals 318 302 0.135484 =84/620
Totals 325 295 0.156452 =97/620
Totals 306 316 0.180064 =112/622
Totals 338 296 0.140379 =89/634
Totals 336 281 0.136143 =84/617
Totals 327 301 0.141720 =89/628
Totals 318 305 0.146067 =91/623
Average of Error Rate 0.144421

6. Considerations

The experiments conducted here confirm that deep learning has the same accuracy
as the Gaussian kernel SVM.
In addition, the 10-fold cross-validation experiment indicates that deep learning
offers higher precision.
In this experiment, we used the H2O library for deep learning, and the deep learning
modules, written in Java, were activated each time. Therefore, we could not assess the
execution time.
Deep learning parameter adjustment is difficult. By optimizing the parameters, it is
possible to increase the learning accuracy.
There are some different approaches for time-series datasets [21, 22]. These ap-
proaches differ from the proposed method, but they are useful for improving our proposed
method.

7. Conclusion

In this paper, we consider the application of deep learning to credit card data analysis.
We introduce two major applications and propose methods for deep learning. To verify
our proposed methods, we use benchmark experiments with other machine learning methods.
Through these experiments, it is confirmed that deep learning has the same accuracy as
the Gaussian kernel SVM. The proposed methods are also validated using a large-scale
transaction dataset.
In the future, we will consider an evaluation experiment using real transaction
datasets.

Acknowledgment

The authors would like to thank Intelligent Wave Inc. for many comments on the credit card
transaction datasets.

References

[1] Y. Bengio. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning,
2(2009), 1-127.
[2] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. Maxout Networks. ArXiv
e-prints, Feb., 2013.
[3] T. Minegishi and A. Niimi. Detection of Fraud Use of Credit Card by Extended VFDT, in World
Congress on Internet Security (WorldCIS-2011), London, UK, Feb., (2011), 166–173.
[4] M. Lichman. UCI Machine Learning Repository. (2013), (Access Date: 15 September, 2015). [Online].
Available: http://archive.ics.uci.edu/ml
[5] T. J. OZAKI. Data scientist in ginza, tokyo. (2015), (Access Date: 15 September, 2015). [Online]. Avail-
able: http://tjo-en.hatenablog.com/
[6] A. Niimi. Deep Learning for Credit Card Data Analysis, in World Congress on Internet Security
(WorldCIS-2015), Dublin, Ireland, Oct., (2015), 73–77.
[7] Q. Le. Building High-Level Features using Large Scale Unsupervised Learning. in Acoustics, Speech
and Signal Processing (ICASSP), 2013 IEEE International Conference on, May, (2013), 8595–8598.
[8] R: The R project for statistical computing. (Access Date: 15 September, 2015). [Online]. Available:
https://www.r-project.org/
[9] Google cloud platform. what is BigQuery? - Google BigQuery. (Access Date: 15 September, 2015).
[Online]. Available: https://cloud.google.com/bigquery/what-is-bigquery
[10] AWS Amazon Redshift. Cloud Data Warehouse Solutions. (Access Date: 15 September, 2015). [Online].
Available: https://aws.amazon.com/redshift/
[11] Apache Hadoop. Welcome to Apache Hadoop! (Access Date: 15 September, 2015). [Online]. Available:
https://hadoop.apache.org/
[12] Apache Storm. Storm, distributed and fault-tolerant realtime computation. (Access Date: 15 September,
2015). [Online]. Available: https://storm.apache.org/
[13] Apache Spark. Lightning-Fast Cluster Computing. (Access Date: 15 September, 2015). [Online]. Avail-
able: https://spark.apache.org/
[14] 0xdata — H2O.ai — Fast Scalable Machine Learning. (Access Date: 15 September, 2015). [Online].
Available: http://h2o.ai/
[15] A. Candel and V. Parmar. Deep Learning with H2O. H2O, (2015), (Access Date: 15 September, 2015).
[Online]. Available: http://learnpub.com/deeplearning
[16] SparkR (R on Spark) — Spark 1.5.0 Documentation. (Access Date: 15 September, 2015). [Online].
Available: https://spark.apache.org/docs/latest/sparkr.html
[17] T. J. OZAKI. Credit Approval Data Set, modified. (2015), (Access Date: 15 September, 2015).
[Online]. Available: https://github.com/ozt-ca/tjo.hatenablog.samples/tree/
master/r_samples/public_lib/jp/exp_uci_datasets/card_approval
[18] dmlc XGBoost extreme Gradient Boosting. (Access Date: 15 September, 2015). [Online]. Available:
https://github.com/dmlc/xgboost
[19] A. Vergara, S. Vembu, T. Ayhan, M. Ryan, M. Homer, and R. Huerta. Chemical Gas Sensor Drift Com-
pensation using Classifier Ensembles. Sensors and Actuators B: Chemical, 166(1), (2012), 320–329.
[20] I. Rodriguez-Lujan, J. Fonollosa, A. Vergara, M. Homer, and R. Huerta. On the Calibration of Sensor Ar-
rays for Pattern Recognition using the Minimal Number of Experiments. Chemometrics and Intelligent
Laboratory Systems, 130, (2014), 123–134.
[21] S. Yin, X. Xie, J. Lam, K. C. Cheung, and H. Gao. An Improved Incremental Learning Approach for
KPI Prognosis of Dynamic Fuel Cell System. IEEE Transactions on Cybernetics, PP(99), (2015), 1–10.
[22] S. Yin, H. Gao, J. Qiu, and O. Kaynak. Fault Detection for Nonlinear Process with Deterministic Dis-
turbances: A Just-In-Time Learning Based Data Driven Method. IEEE Transactions on Cybernetics,
PP(99), (2016), 1–9.
Fuzzy Systems and Data Mining II 159
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-159

Probabilistic Frequent Itemset Mining


Algorithm over Uncertain Databases with
Sampling
Hai-Feng LI1, Ning ZHANG, Yue-Jin ZHANG and Yue WANG
School of Information, Central University of Finance and Economics, Beijing 100081,
China

Abstract. Uncertain data is data accompanied by probabilities, which makes
frequent itemset mining more challenging. Given the data size n,
computing the probabilistic support needs O(n(log n)^2) time complexity and O(n)
space complexity. This paper focuses on the problem of mining probabilistic
frequent itemsets over uncertain databases and proposes the PFIMSample algorithm.
We employ the Chebyshev inequality to estimate the frequency of the items,
which decreases certain computations from O(n(log n)^2) to O(n). In addition, we
propose a sampling technique to improve the performance. Our extensive
experimental results show that our algorithm can achieve significantly improved
runtime cost and memory cost with high accuracy.

Keywords. Uncertain database, probabilistic frequent itemset, data mining,


sampling

Introduction

Physical constraints, data preprocessing and data privacy protection methods all bring
uncertainty to data, which is significant for continuously arriving data [1]. By
introducing the probability of data occurrence, we can improve the robustness of
data mining methods and guarantee that the data analysis achieves exact and precise
knowledge, which is valuable for user decisions. Frequent itemset mining
algorithms over certain databases have achieved many good results [2-4]. Nevertheless,
the uncertainty of data [5, 6] brings new challenges.
According to the different definitions of frequent itemset over uncertain data, the
mining methods can be categorized into two types: one is based on the expected
support and another is based on the probabilistic support [7-22]. The methods based on
the expected support mainly used the expectation of the itemset support to evaluate
whether an itemset is frequent; the methods based on the probabilistic support
considered that an itemset is frequent when its support is larger than the minimum
support with a specified high probability. If the database size is n, then the former has
O(n) time complexity and O(1) space complexity, and the latter has O(n(log n)^2) time
complexity and O(n) space complexity [11]. Clearly, the former has a much higher

1
Corresponding Author: Hai-Feng LI, School of Information, Central University of Finance and
Economics, Beijing 100081, China; E-Mail: mydlhf@cufe.edu.cn.

performance. The latter, however, can represent the probabilistic characteristics of frequent
itemsets.
Since computing the probabilistic support is complicated, it is more challenging.
In this paper, we focus on this problem and propose a frequency estimation method
based on the Chebyshev inequality and a sampling method, to compute the
probabilistic support approximately, and we guarantee the accuracy by theoretical analysis.
We also verify this method experimentally.
This paper is organized as follows. Section 1 introduces the preliminaries of
frequent itemset mining. Section 2 presents our PFIMSample algorithm in detail.
Section 3 presents the experimental results. Section 4 concludes this paper.

1. Preliminaries

We use Γ = {i1, i2, ..., in} to denote the set of distinct items, in which |Γ| = n is the size of Γ. We call an itemset X of size k a k-itemset. Assuming each item xt (0 < t ≤ k) of X has a probability pt, then X is an uncertain itemset, denoted as X = {x1, p1; x2, p2; ...; xk, pk}. An uncertain dataset UD = {UT1, UT2, ..., UTv} is a collection of uncertain transactions; each UTi (i = 1, ..., v) is a transaction based on Γ, which has an id and a corresponding itemset X, denoted as (tid, X). Figure 1 is a simple uncertain dataset which, using the possible world model, can be converted into multiple certain datasets, each with a probability; each such certain dataset is called a possible world.
Definition 1 (Count Support): Given the uncertain database UD and an itemset X, the occurrence count of X is called the count support of X, denoted as Λ_UD(X), or Λ(X) for short.
Definition 2 (Possible World) [9]: Given the uncertain database UD, a generated possible world PW has |UD| transactions, denoted as PW = {T1, T2, ..., T|UD|}, in which each transaction Ti ⊆ UTi.
Provided the uncertain transactions are independent, the probability of a possible world, p(PW), can be computed as follows. If an item x exists in both Ti and UTi, we take its probability p(x); if x exists in UTi but not in Ti, we take the probability of its absence, p(x̄) = 1 − p(x). Multiplying all these probabilities over all transactions gives

p(PW) = Π_i ( Π_{x ∈ UTi, x ∉ Ti} p(x̄) ) ( Π_{x ∈ Ti} p(x) ).

Using Ψ to denote the set of possible worlds generated from UD, the size of Ψ increases exponentially w.r.t. the size of UD. That is, if UD has m transactions and transaction i has ni items, then Ψ has 2^(Σ_{i=1..m} ni) possible worlds.

Figure 1. Uncertain database vs. possible worlds

The left image of Figure 1 shows an uncertain dataset with 2 transactions, each containing two items. As can be seen in the right image of Figure 1, there are 2^(2·2) = 16 possible worlds, each of which has an occurrence probability. As an example, possible world PW6 has two transactions T1 and T2, which are both {A}. Then the probability of PW6 is

p(PW6) = p_T1(A) · p_T1(B̄) · p_T2(A) · p_T2(C̄) = 0.6 × 0.3 × 0.2 × 0.7 ≈ 0.025,

where p_Ti(x̄) denotes the probability that an item x of UTi is absent from Ti. As can be seen, the probabilities of all the possible worlds sum to 1.
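To make the possible world semantics concrete, the following short Python sketch (ours, for illustration only; the item probabilities are assumed values, not those of Figure 1) enumerates every possible world of a two-transaction uncertain dataset, computes each world's probability by the product rule above, and checks that the probabilities sum to 1.

from itertools import product

# Illustrative two-transaction uncertain dataset (assumed probabilities):
# each transaction maps item -> existence probability.
UD = [{'A': 0.6, 'B': 0.7},
      {'A': 0.2, 'C': 0.3}]

def possible_worlds(ud):
    """Yield (world, probability) pairs; a world is one certain itemset per transaction."""
    per_tx = []
    for ut in ud:
        items = sorted(ut)
        realizations = []
        for mask in product([0, 1], repeat=len(items)):
            chosen = frozenset(it for it, keep in zip(items, mask) if keep)
            prob = 1.0
            for it in items:                      # multiply p(x) or the absence probability
                prob *= ut[it] if it in chosen else 1.0 - ut[it]
            realizations.append((chosen, prob))
        per_tx.append(realizations)
    for combo in product(*per_tx):                # one realization per transaction
        world = tuple(t for t, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob

worlds = list(possible_worlds(UD))
print(len(worlds))                                # 2^(2+2) = 16 possible worlds
print(round(sum(p for _, p in worlds), 10))       # the probabilities sum to 1.0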

2. PFIMSample Algorithm

In the uncertain database UD, a frequent itemset is defined through the possible world model. If itemset X has support Λ_PW(X) in a possible world PW, then its probability p_PW(X) is the probability of that world, p(PW). We can use the 2-tuple <Λ_PW(X), p_PW(X)> to denote it. In UD, X has 2^(Σ_{i=1..m} ni) such tuples, which can be summarized with the probability vector Λ^P(X).
Definition 3 (Probabilistic Frequent Itemset) [10, 23]: Given the uncertain database UD, the minimum support λ and the minimum probabilistic confidence τ, an itemset X is a (λ, τ)-probabilistic frequent itemset if the probabilistic support Λ^P_τ(X) ≥ λ, in which Λ^P_τ(X) = Max{ i | P(Λ(X) ≥ i) > τ }.
For an uncertain database with size n, we can use a divide-and-conquer method
[11], which has time complexity O(n(log n)²) and space complexity O(n). As can be
seen, n is the key factor that determines the computing efficiency. If we can decrease n,
then the runtime cost will decrease linearly.

According to the law of large numbers, when n is large enough the data tend to fit a normal distribution, based on which we propose our sampling-based mining algorithm PFIMSample. The details are as follows.
1) Scan the database and obtain its statistical characteristics, that is, the average and the variance of the item probabilities.
2) Scan the database and compute the count support and the expected support, where the expected support is the sum of the probabilities.
3) For a given sampling parameter, apply random sampling over the database so that the sampled data fits the normal distribution. Since we assume the uncertain database already fits a normal distribution when the data is massive enough, we use simple systematic sampling, which preserves the mining efficiency with a similar distribution. On the other hand, sampling decreases the database size, which may reduce the mining accuracy; we therefore evaluate this with our experiments, and we find that the accuracy is not related to the sampling rate. For each item, scan the sampled database and compute its probabilistic support; if it is larger than the minimum support, the item is a frequent 1-itemset.
4) Combine the n-probabilistic frequent itemsets to generate the (n+1)-candidate itemsets, and compute their probabilistic supports to determine whether they are frequent.
5) Repeat phase 4) until no new probabilistic frequent itemsets are generated, then output the results.
In the PFIMSample algorithm, when the individual items are generated, we scan the full database rather than the sampled database to compute their probabilistic supports, in order to guarantee the accuracy. Consequently, this computing cost is O(n(log n)²). We use a heuristic-rule-based pruning strategy in phase 3).
According to the Chebyshev inequality, a variable X with expected support E(X) and standard deviation D(X) satisfies, for any constant ε > 0, P(|X − E(X)| ≥ ε) ≤ D(X)²/ε². That is, for an arbitrary dataset, the fraction of the probability mass lying within m·D(X) of the expected support is at least 1 − 1/m², where m is a positive number larger than 1. For example, if m = 5, then with probability at least 1 − 1/25 = 96% the support is larger than E(X) − 5D(X). Thus, before determining the frequency of a distinct X, we first compute its expected support and standard deviation; if E(X) − m·D(X) is larger than the minimum support, then X is a probabilistic frequent itemset with probability 1 − 1/m². Since computing the expected support is O(n), which is far less than the cost of computing the probabilistic support, we can prune the itemsets efficiently. This benefit grows as n becomes larger.
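As an illustration of this pruning rule, the sketch below (our code, not the authors') computes the expected support and standard deviation of an itemset from its per-transaction occurrence probabilities in O(n), and accepts the itemset as probabilistically frequent with confidence at least 1 − 1/m² when E(X) − m·D(X) already reaches the minimum support count, so the expensive exact computation can be skipped for it.

import math

def chebyshev_frequent(probs, min_support_count, m=5.0):
    """Chebyshev-based pruning check over independent per-transaction probabilities."""
    expected = sum(probs)                                   # E(X), computed in O(n)
    std_dev = math.sqrt(sum(p * (1.0 - p) for p in probs))  # D(X), computed in O(n)
    # With probability >= 1 - 1/m^2 the support exceeds E(X) - m * D(X).
    return expected - m * std_dev >= min_support_count

# Example: an itemset seen in 1000 transactions with probability 0.8 each,
# checked against a minimum support count of 700.
print(chebyshev_frequent([0.8] * 1000, 700))   # True: accepted without the exact computation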
To keep the memory cost low, we use a prefix tree to maintain the itemsets, together with their count supports, expected supports and probabilistic supports. Note that our algorithm does not store the probability density function of each itemset, because the space complexity of a probability density function is O(n) and many itemsets would result in massive memory usage. Since the probabilistic support is computed only once, the probability density function can be deleted as soon as the probabilistic support is obtained, which significantly improves the performance.

3. Experimental Results

We compared the performance and the accuracy when the minimum probabilistic confidence is set to 0.9. Our algorithm was implemented in Python 2.7 under Windows 7 and run on an i7-4790M 3.6GHz CPU with 4GB memory. Two uncertain datasets were used to evaluate our algorithm: one is GAZELLE, which contains a real e-commerce click stream; the other is the synthetic dataset T25I15D320K generated by the IBM generator. We assigned each item a probability generated from a Gaussian distribution, which is widely accepted in current research over uncertain data [16]. The characteristics of the two datasets are shown in Table 1. Our sampling method is a framework that can be applied on top of existing algorithms, so we employed the state-of-the-art method TODIS [11] as the benchmark algorithm. That is, TODIS is exactly our algorithm PFIMSample with sampling rate 1.
Table 1. The Characteristics of Uncertain Datasets

Dataset        Size     Trans Size   Items Count
GAZELLE        59602    3            497
T25I15D320K    320000   26           1000

3.1. Runtime Cost

We first ran the PFIMSample algorithm over the two datasets with different sampling rates. From Figures 2 and 3 we can see that, with the minimum support fixed, the mining time grows as the sampling rate increases. When the sampling rate is 0.01, the mining cost is very low; the runtime can be up to 100 times better. Furthermore, reducing the minimum support results in the same performance trend, which is more significant over the T25I15D320K dataset. The reason is that the T25I15D320K dataset is denser than the GAZELLE dataset.

Figure 2. Runtime VS Sampling rate (GAZELLE) Figure 3. Runtime VS Sampling rate (T25I15D320K)

3.2. Memory Usage

Figures 4 and 5 compare the memory usage over different sampling rates. We can see that the memory cost grew, but not significantly, as the sampling rate increased. On the other hand, the memory usage was not related to the minimum

support, which is because we used the relative minimum support. Moreover, the
memory usage is low when mining over the sparse dataset GAZELLE.

Figure 4. Memory cost vs. sampling rate (GAZELLE). Figure 5. Memory cost vs. sampling rate (T25I15D320K).
3.3. Precision and Recall

We used precision and recall to evaluate the accuracy of our algorithm. For the original mining result D and the result D' obtained under sampling, we define Precision = |D ∩ D'|/|D| and Recall = |D ∩ D'|/|D'|. The larger the precision and the recall, the more accurate our algorithm is. Table 2 shows the precision and recall for different sampling rates over the two datasets. As can be seen, when the minimum support is 0.08 our algorithm achieves 100% accuracy over GAZELLE; it also achieves more than 90% accuracy over T25I15D320K in most cases. In addition, the accuracy of our algorithm is not related to the sampling rate since we use random samples.
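The two measures can be computed directly from the two result sets; a small sketch using the definitions above, with toy itemsets:

def precision_recall(original, sampled):
    """Precision = |D ∩ D'| / |D| and Recall = |D ∩ D'| / |D'|, as defined above."""
    common = len(set(original) & set(sampled))
    return common / len(original), common / len(sampled)

D = {('a',), ('b',), ('c',), ('a', 'b')}        # results mined from the full database
D_prime = {('a',), ('c',), ('d',), ('a', 'b')}  # results mined from the sample
print(precision_recall(D, D_prime))             # (0.75, 0.75)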
Table 2. Precision and Recall

Dataset        Minimum support   Sampling rate   Precision   Recall
GAZELLE        0.08              0.01            100%        100%
                                 0.02            100%        100%
                                 0.03            100%        100%
                                 0.04            100%        100%
                                 0.05            100%        100%
                                 0.06            100%        100%
                                 0.07            100%        100%
                                 0.08            100%        100%
                                 0.09            100%        100%
T25I15D320K    0.08              0.01            87%         95%
                                 0.02            91%         95%
                                 0.03            95%         92%
                                 0.04            95%         92%
                                 0.05            91%         91%
                                 0.06            91%         91%
                                 0.07            95%         88%
                                 0.08            100%        96%
                                 0.09            91%         95%

4. Conclusions

This paper studied probabilistic frequent itemset mining over uncertain databases. The proposed algorithm PFIMSample employs the Chebyshev inequality to estimate the count of frequent items, and thus can partly reduce the computing cost from O(n(log n)²) to O(n). Moreover, we use a sampling method to improve the performance while keeping high accuracy. Our extensive experimental results over two datasets showed that our algorithm is effective and efficient.

Acknowledgement

This research is supported by the National Natural Science Foundation of China (61100112, 61309030), the National Social Science Foundation of China (13AXW010), and the Discipline Construction Foundation of Central University of Finance and Economics.

References

[1] B. Babcock, S. Babu, M. Datar, et al. Models and issues in data stream systems. Proceedings of PODS,
2002.
[2] J. Han, H. Cheng, D. Xin, et al. Frequent pattern mining: current status and future directions. Data
Mining & Knowledge Discovery. 15(2007):55-86.
[3] J. Chen, Y. Ke, W. Ng. A survey on algorithms for mining frequent itemsets over data streams.
Knowledge and Information System, 16(2008), 1-27.
[4] C. C. Aggarwal, P. S. Yu. A survey of uncertain data algorithms and applications. IEEE Transaction on
Knowledge and Data Engineering, 21(2009), 609-623.
[5] A. Y. Zhou, C. Q. Jin, G. R. Wang, et al. A survey on the management of uncertain data. Chinese
Journal of Computers, 31(2009).
[6] J. Z. Li, G. Yu, A. Y. Zhou. Challenge of uncertain data management. Chinese Computer
Communications, 5(2009).
[7] J. Xu, N. Li, X. J. Mao, et al. Efficient probabilistic frequent itemsets mining in big sparse uncertain data.
Proceedings of PRICAI, 2014.
[8] Y. Konzawa, T. Amagasa, H. Kitagawa. Probabilistic frequent itemset mining on a gpu cluster. IEICE
Transactions of Information and Systems, E97-D(2014) , 779-789.
[9] Q. Zhang, F. Li, K. Yi, Finding frequent items in probabilistic data. Proceedings of SIGMOD, 2008
[10] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent itemset mining in uncertain
databases. Proceedings of SIGKDD, 2009.
[11] L. Sun, R. Cheng, D. W. Cheung, et al, Mining uncertain data with probabilistic guarantees.
Proceedings of SIGKDD, 2010.
[12] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent pattern growth for itemset mining in
uncertain databases. Proceedings of SSDM, 2012.
[13] L. Wang, R. Cheng, S. D. Lee, et al, Accelerating probabilistic frequent itemset mining: a model-based
approach. Proceedings of CIKM, 2010.
[14] L. Wang, D. Cheung, R. Cheng, et al. Efficient mining of frequent item sets on large uncertain
databases. IEEE Transaction on Knowledge and Data Engineering, 24(2012), 2170-2183.

[15] T. Calders, C. Garboni, B. Goethals. Approximation of frequentness probability of itemsets in uncertain


data. Proceedings of ICDM, 2010.
[16] Y. Tong, L. Chen, Y. Cheng, et al. Mining frequent itemsets over uncertain databases. Proceedings of
VLDB, 2012.
[17] P. Tang, E. A. Peterson. Mining probabilistic frequent closed itemsets in uncertain databases.
Proceedings of ACMSE, 2011.
[18] E. A. Peterson, P. Tang. Fast approximation of probabilistic frequent closed itemsets. Proceedings of
ACMSE, 2012.
[19] Y. Tong, L. Chen, B. Ding, Discovering threshold-based frequent closed itemsets over probabilistic
data. Proceedings of ICDE, 2012.
[20] C. Liu, L. Chen, C. Zhang. Mining probabilistic representative frequent patterns from uncertain data.
Proceedings of SDM, 2013.
[21] C. Liu, L. Chen, C. Zhang. Summarizing probabilistic frequent patterns: a fast approach. Proceedings
of SIGKDD, 2013.
[22] B. Pei, S. Zhao, H. Chen, et al. FARP: Mining fuzzy association rules from a probabilistic quantitative
database. Information Sciences, 237(2013), 242-260.
[23] P. Y. Tang and E. A. Peterson. Mining probabilistic frequent closed itemsets in uncertain databases,
Proceedings of ASC, 2011.
Fuzzy Systems and Data Mining II 167
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-167

Priority Guaranteed and Energy Efficient Routing in Data Center Networks
Hu-Yin ZHANG a,b,1, Jing WANG b, Long QIAN b and Jin-Cai ZHOU b
a Shenzhen Research Institute of Wuhan University, Shenzhen, Guangdong, China
b School of Computer Science, Wuhan University, Wuhan, Hubei, China

Abstract. In data center networks, energy consumption accounts for a considerably large slice of operational expenses. Many energy saving strategies have been proposed; most of them design the energy saving model from the point of view of bandwidth or throughput. This paper provides a new perspective on energy saving in data center networks, whose basic idea is to ensure that higher priority traffic gets shorter routes. It combines bandwidth constraints with the aim of energy saving and keeps a balance between energy consumption and traffic priority demands. Simulations show that our routing algorithm can effectively reduce the transmission delay of higher priority traffic and reduce the power consumption of data center networks.

Keywords. data center networks, priority, energy saving, routing

Introduction

With the change of traffic models, large-scale data center networks (DCNs) are often deployed in the Fat-Tree architecture as non-blocking networks, which over-provisions network resources and leads to inefficient power usage. Thus, the goal of network power conservation is to make the power consumption of networking devices proportional to the traffic load [1]. Many researchers have investigated energy saving for DCNs from
different aspects. The article [2] proposed energy saving routing based on elastic tree.
In [3], the authors proposed a data center energy-efficient network-aware scheduling.
The article [4] presented an energy efficient routing algorithm with the network load
balancing and energy saving. In [5], the authors proposed a bandwidth guaranteed
energy efficient DCNs scheme from the perspective of routing and flow scheduling.
The article [6] aimed to reduce the power consumption of DCNs from the routing
perspective while meeting the throughput performance requirement.
In DCNs, the network delay is also an important parameter that reflects network performance [7]. Traffic with high priority usually has strict transmission delay requirements. In this paper, a new energy efficient routing algorithm is proposed that takes traffic priority into consideration. Its basic idea is to make sure that higher priority traffic gets shorter routes, to incorporate the bandwidth constraints, and to balance energy consumption against traffic priority demands.

1
Corresponding Author: Hu-Yin ZHANG, Shenzhen Research Institute of Wuhan University,
Shenzhen, China; School of Computer Science, Wuhan University, Wuhan, China; E-mail:
zhy2536@whu.edu.cn.

1. Network Model and Problem Statement

1.1. Network Model

Figure 1 shows the Fat-Tree architecture, which contains three tiers of switch modules: the core switches Ck, the aggregation switches Ak and the edge switches Sk, where k ranges from 1 to the number of switches in the corresponding tier. This is conventionally denoted as a v(c, a, s) network. In order to save energy, as few links as possible should be used, so that as many switches as possible can work in sleep mode. Eq. (1) describes the minimum link number, which is intended to use a minimum number of switches in the v(c, a, s) network.

Figure 1. Fat-Tree DCNs.

min L = R^w · v(c, a, s)                                    (1)

R^w = Σ_k R(Ck, Ak, Sk),  k = 1, 2, ⋯, n                    (2)
R^w in Eq. (1) is an array obtained from the array R by summing its nodes and taking a linear transform, as in Eq. (2); it expresses the number of active switches in each tier. In the array R, Ck, Ak and Sk represent the node names of the active switches in each tier, respectively. The problem is how to obtain the optimal array R for the priority guaranteed traffic while establishing the fewest links.

1.2. Problem Statement

In order to establish the fewest links, the bandwidth utilization of the used links needs to be maximized. In array R, the higher priority traffic chooses the shorter routing path. However, we may encounter the problem shown in Figure 2.
When traffic 1 (higher priority) uses the path A->B->E, traffics 2 and 3 have no path to use, and failure bandwidth (FB) occurs. If we analyze the traffic requirements globally, optimize the routing and move traffic 1 to the path A->D->E, then traffics 2 and 3 can both have their paths and the FB is 0. Although the higher priority traffic takes a new route, the number of forwarding hops does not increase, so we can regard this change as introducing no extra transmission delay.

Figure 2. Failure bandwidth.


The goal of our priority guaranteed and energy efficient routing (PER) algorithm is to eliminate FB while guaranteeing priorities, and to obtain the optimal array R that gives the DCN topology maximum bandwidth utilization with the minimum number of links; the idle switches are then turned into sleep mode for energy saving.

2. PER Algorithm

2.1. Network Model

The scheme computes transmission paths for all flows in the DCN topology while keeping the energy consumption of the switches in this topology as low as possible.


Figure 3. PER scheme.


As shown in Figure 3, the PER algorithm works in the following steps:
• Step 1: according to the priority level, the highest priority traffic obtains the shortest routing configuration.
• Step 2: update the priority parameter, and then configure the lower priority traffic.
• Step 3: check whether there is any failure bandwidth; if yes, jump to Step 4, otherwise jump to Step 6.
• Step 4: run the priority guaranteed optimization algorithm.
• Step 5: if the optimized new routing of the higher priority traffic is longer than the existing one, the higher priority traffic keeps the existing routing and the lower priority traffic takes the longer path; otherwise the optimized new routing is adopted. Then repeat Step 3.
• Step 6: judge whether all configurations are completed; if not, repeat Step 3; if yes, generate the energy efficient routing topology and turn the idle switches into sleep mode.

2.2. Priority Guaranteed Optimization Algorithm

This algorithm is designed to select route paths for flows with different priorities. Each selected path should eliminate failure bandwidth and make the link bandwidth utilization rate as high as possible. If there are many available paths for a flow, the problem can be converted to an undirected graph G = (S, E), where the weight of a link is the bandwidth left on it. We need to find the shortest path from the source node to the destination node and maximize the link bandwidth utilization, so the more bandwidth left, the bigger the link weight. We use the following path selection rules (a small illustrative sketch follows the list):
• Rule 1: set an accessorial vector SA; each of its components SA[i][j] represents the weight of the link from source node S_i to node A_j.
• Rule 2: the state of SA: if there is a link from node S_i to A_j, SA[i][j] represents the weight of this link; if there is no link, SA[i][j] = -1. We choose the pair (S_i, A_j) for which SA[i][j] = Min{SA[i][j] | A_j ∈ V}. If several links have the same weight, we choose the node with the minimum subscript.
• Rule 3: set another accessorial vector AC; each of its components AC[j][k] represents the weight of the link from node A_j to C_k.
• Rule 4: the state of AC: if there is a link from node A_j to C_k, AC[j][k] represents the weight of this link; if there is no link, AC[j][k] = -1. We choose the pair (A_j, C_k) for which AC[j][k] = Min{AC[j][k] | C_k ∈ V}. If several links have the same weight, we choose the node with the minimum subscript.
• Rule 5: we store the nodes selected by Rules 1 to 4.
• Rule 6: if there is any failure bandwidth, all flows on the links related to the failure bandwidth are reconfigured according to Rules 1 to 4. We first select the route path for the flow which caused the failure bandwidth, and then for the other flows in descending order of priority.
• Rule 7: all the nodes included in the routing generated by Rule 6 are stored in the accessorial array D; we then compare the arrays D and R.
• Rule 8: if the higher priority traffic uses more nodes in array D than in array R, the higher priority traffic keeps its status in array R and R is copied to D; otherwise the higher priority traffic adopts the status in array D and D is copied to R.
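A minimal sketch of how Rules 1-4 might be applied (our illustration, not the authors' code): SA holds the remaining bandwidth of edge-to-aggregation links with -1 marking a missing link, and "choose the Min" is read here as picking the feasible link with the least remaining bandwidth, breaking ties by the smallest subscript, so that flows are packed onto as few links as possible; the same helper would be reused for the AC matrix.

def pick_link(weights, demand):
    """Return (i, j) of the feasible link with the least remaining bandwidth.

    weights[i][j] is the remaining bandwidth of link i -> j, or -1 if there is no link.
    Returns None when no link can carry the flow, i.e. failure bandwidth occurs (Rule 6).
    """
    best = None
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            if w >= demand and (best is None or w < weights[best[0]][best[1]]):
                best = (i, j)        # strict '<' keeps the smallest subscript on ties
    return best

# Toy SA matrix: three edge switches, two aggregation switches, bandwidth left per link.
SA = [[4, -1],
      [6,  3],
      [-1, 5]]
print(pick_link(SA, demand=2))   # (1, 1): the S2 -> A2 link with 3 units left is chosen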

When the path selections for all traffic are completed, the higher priority flows are configured with fewer routing nodes, and the array R stores the switch nodes used in the links. Therefore, we can put the idle switches to sleep in order to save data center energy.

3. Evaluations

We evaluate our PER algorithm using Fat-Tree topologies on the Matlab 7.11 platform. We compare the results to random routing without priority guarantees. We use a simulation model with the network v(16, 32, 32), which includes eighty switch nodes. The available bandwidth of each link is randomly generated and does not exceed 10M. We select twelve traffics and set their priorities and flow capacities randomly. To simplify the simulation system, we assume that the data processing abilities of each layer are the same. We set the transmission delay from the current node to the next node randomly between 30 and 50 ms.

Figure 4. Transmission delay.


Figure 4 shows the transmission delay of the twelve traffics with different priorities; the values are averaged over three runs. The dotted line with blocks shows the random routing transmission delay, and the solid line with dots represents the PER algorithm transmission delay. From this figure we can see that the transmission delay of the PER algorithm is lower than that of random routing, and its fluctuation is small.
[Figure 5 plots the number of active switches (0-100) for the Fat-Tree, Random and PER topologies.]

Figure 5. Energy consumption.



Figure 5 shows the energy consumption of three kinds of topologies based on the network v(16, 32, 32). In the Fat-Tree topology, all eighty switches remain active even if some of them carry no traffic. With random routing, almost half of the switches in the same network are used, so nearly half of the switches can be turned into sleep mode. With PER, because of the increased utilization of link bandwidth, about 75% of the switches can be turned into sleep mode in this network, which greatly reduces the energy consumption.

4. Conclusion

In this paper, we address the power saving problem in DCNs from a routing
perspective. We establish the network model, and introduce the priority guaranteed and
energy efficient routing problem. Then we propose a routing algorithm to reduce the energy consumption of DCNs while guaranteeing traffic priorities. The evaluation results demonstrate that our algorithm can effectively reduce
the transmission delay of the higher priority traffics and the power consumption of
DCNs compared with the random routing.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under
Grant No. 61540059, and the Shenzhen science and technology projects under Grant
No. JCYJ20140603152449639.

References

[1] L.A. Barroso, U. Hlzle, The case for energy-proportional computing, Computer, 40(2010):33–37.
[2] B. Heller, S. Seetharaman, P. Mahadevan, et al.., Elastic Tree: Saving energy in datacenter networks.
Proc of the 7th USENIX Symp on Networked Systems Design and Implementation (NSDI 10). New
York: ACM, 2010:249–264.
[3] D Kliazovich, P Bouvry, S.U. Khan, DENS: Data Center Energy-Efficient Network-Aware Scheduling,
Cluster Computing, 16(2013):65–75.
[4] S Dong, R Li, X Li, Energy Efficient Routing Algorithm Based on Software Defined Data Center
Network, Journal of Computer Research and Development, 52(2015): 806–812.
[5] T Wang, B Qin, Z Su,Y Xia, M Hamdi, et al.., Towards bandwidth guaranteed energy efficient data
center networking, Journal of Cloud Computing, 4(2015):1–15.
[6] M Xu, Y Shang, D Li, X Wang, Greening data center networks with throughput-guaranteed power-aware
routing, Computer Networks, 57(2013):2880–2899.
[7] W. Lao, Z. Li, Y. Bai, Methodology and Realization of Measure on Network Performance Parameter,
Computer Applications & Software, 21(2004).
Fuzzy Systems and Data Mining II 173
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-173

Yield Rate Prediction of a Dynamic Random Access Memory Manufacturing Process Using Artificial Neural Network
Chun-Wei CHANG and Shin-Yeu LIN1
Department of Electrical Engineering, Chang Gung University, Taiwan

Abstract. To provide a reference for the fault detection of a dynamic random access memory (DRAM) manufacturing process, we propose a yield-rate predictor using an artificial neural network (ANN). The inputs to the ANN are the machining parameters in each step of the manufacturing process for a DRAM wafer, and the output is the yield rate of the corresponding wafer. In this study, a three-layer feed-forward back propagation ANN is used and trained by input-output pairs of data collected from a real manufacturing process. We have tested the proposed ANN for five cases, each with a different size of training data set. The test results show that the average of the absolute prediction errors in all five cases is very small, and as the size of the training data set increases, the prediction accuracy increases and the associated standard deviation decreases.

Keywords. data mining, DRAM, yield analysis, artificial neural network, fault detection.

Introduction

Due to the lengthy manufacturing process for a dynamic random access memory
(DRAM) chip [1-2], it would be beneficial if any manufacturing error can be detected
earlier before the whole process is completed. To do so, on-line machine-fault
detection should be performed to prevent further damage to the wafers in process [3-5].
In general, a yield rate that is much lower than average indicates a possible fault with
high probability. Therefore, an on-line prediction of the yield rate would be helpful to
the machine-fault detection.
For any integrated circuitry, the machining parameters in each step of the
manufacturing process are usually specified. However, no physical or mathematical
model exists to relate the machining parameters with the yield rate. To cope with this
modeless problem, data mining technique can be used to investigate this relationship by
extracting the information from the manufacturing data [6-7]. Therefore, in this paper,
we propose using an ANN to build up the functional relationship between the
machining parameters and the yield rate, and use the constructed ANN to serve as a
yield rate predictor [8-10]. The training algorithm for the proposed ANN will be
introduced, and the ANN will be trained by real manufacturing data. To investigate the

1
Corresponding Author: Chun-Wei CHANG, Department of Electrical Engineering, Chang Gung
University, Kwei-Shan, Tao-Yuan 333, Taiwan; E-mail: shinylin@mail.cgu.edu.tw.

effect of the size of the data set on training the ANN, the prediction accuracy of the
ANN trained by different size of data sets will be investigated in this paper.
This paper is organized in the following manner. Section 1 presents the proposed
ANN. Section 2 presents the test results of the proposed ANN. Section 3 draws a
conclusion.

1. Construction of ANN

There are two parts for constructing an ANN as a yield rate predictor. The first part is
to collect the data of machining parameters and the corresponding yield rate of DRAM
wafers to serve as a training data set. The second part is using the training data set to
train the ANN.

1.1. Data Collection

There are hundreds to thousands of processing steps for manufacturing a DRAM chip.
Each DRAM wafer may repeatedly visit the same machine but with different setup of
machining parameters. To train the ANN, a pair of input and output data for each wafer
is collected. The collected input data are the machining parameters, which consist of
the following types, average thickness of oxide coating, range of thickness of oxide
coating, average Nitride thickness, range of Nitride thickness, polish time of chemical
mechanical planarization, photo dose, photo focus, etc. The output data is the yield rate
of the wafer, e. g. 90%. Therefore, the input-output pair of data is formed by a multiple
input data and a single output data, and the collected input-output pairs of data will
serve as the training data set for the ANN.

1.2. Training ANN

Let x = [x1, ⋯, xN]^T, where x1, ⋯, xN represent the N machining parameters. Let y(x) represent the yield rate of the wafer, which is a function of the vector of machining parameters x. Let M denote the number of input-output pairs in the collected training data set. We employ a feed-forward back propagation ANN that consists of an input layer, one hidden layer and an output layer [11]. Figure 1 shows the three-layer ANN consisting of N input neurons, q hidden-layer neurons, and one output neuron, where ω_{i,j}, i = 1, …, q, j = 1, …, N, and β_k, k = 1, …, q, represent the arc weights.
The N neurons in the input layer correspond to x, and the single output neuron is for y(x). The input layer neurons directly distribute each component of x to the neurons of the hidden layer. The hyperbolic tangent sigmoid function shown in Eq. (1) is used as the activation function of the hidden layer neurons.

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))        (1)

The activation function of the output layer is a linear function, as depicted in Figure 1.

Figure 1. A three-layer feed-forward back propagation ANN

The procedure to train the ANN using the training data set, i.e. the M input-output pairs of collected data, can be stated as follows. For a given input x_i to the ANN presented in Figure 1, we let the corresponding output of the ANN be denoted by ŷ(x_i | ω, β), which can be calculated by the following formula:

ŷ(x_i | ω, β) = Σ_{ℓ=1}^{q} β_ℓ tanh( Σ_{j=1}^{N} ω_{ℓ,j} x_{ij} )        (2)

where β = [β_1, ⋯, β_q]^T and ω = [ω_{1,1}, ⋯, ω_{q,N}]^T are the vectors of arc weights of the ANN, and x_{ij} is the jth component of x_i. The training problem for the considered ANN is to find the vectors of arc weights ω and β that minimize the following mean square error (MSE):

min_{ω,β} (1/M) Σ_{i=1}^{M} { y(x_i) − ŷ(x_i | ω, β) }²        (3)

based on the collected M pairs (x_i, y(x_i)), i = 1, ..., M. We employed the Levenberg-Marquardt algorithm [12] as the iterative training algorithm to solve (3). Training stops when either of the following two conditions occurs: (i) the sum of the mean squared errors, i.e. the objective value of the MSE problem, is smaller than 0.01, or (ii) the number of epochs exceeds 1000.
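As a concrete reading of Eqs. (2) and (3), the following NumPy sketch (ours; N = 78 and q = 150 follow the test setup in Section 2, while the weights and data are random placeholders) evaluates the three-layer network and the MSE objective. It does not reproduce the Levenberg-Marquardt training itself.

import numpy as np

rng = np.random.default_rng(0)
N, q, M = 78, 150, 50                        # inputs, hidden neurons, training pairs

omega = rng.normal(scale=0.1, size=(q, N))   # hidden-layer arc weights, one row per hidden neuron
beta = rng.normal(scale=0.1, size=q)         # output-layer arc weights

def predict(X):
    """Eq. (2): yhat(x | omega, beta) = sum_l beta_l * tanh(sum_j omega_{l,j} * x_j)."""
    return np.tanh(X @ omega.T) @ beta

X = rng.random((M, N))                       # placeholder machining parameters
y = rng.uniform(0.7, 1.0, size=M)            # placeholder yield rates

mse = np.mean((y - predict(X)) ** 2)         # objective value of Eq. (3)
print(mse)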

2. Test Results

In this section, the prediction accuracy of the trained ANN is investigated. In addition,
the relationship between the size of training data set and the prediction accuracy is also
investigated. Therefore, five test cases with various sizes of training data set are set up.
In all test cases, the number of machining parameters is set to N =78. The employed
three-layer ANN consists of 78 input neurons, 150 (= q ) hidden layer neurons and one

output neuron. The number of epochs exceeding 1000 is chosen as the termination
criteria for training the ANN. The value of M , which is the size or the number of
input-output pairs of training data set, is set to M =50 for case 1, 150 for case 2, 400
for case 3, 700 for case 4 and 1000 for case 5. For each case, we collect 2M pairs of
input-output data from real manufacturing process and separate them into two sets, the
training and testing data sets. The training data set is used to train the employed
three-layer ANN, and the testing data set is utilized to test the prediction accuracy of
the trained ANN.
The prediction accuracy of the trained ANN is defined by the average percentage of the absolute error between the actual and the predicted yield rate, which is denoted by ē and can be calculated by the following equation:

ē = (1/M) Σ_{i=1}^{M} | ( y(x_i) − ŷ(x_i | ω, β) ) / y(x_i) | × 100%        (4)

The corresponding standard deviation σ_e is defined as

σ_e = sqrt( (1/M) Σ_{i=1}^{M} ( e_i − ē )² )        (5)

where e_i = | ( y(x_i) − ŷ(x_i | ω, β) ) / y(x_i) | × 100%.
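Given actual and predicted yield rates, Eqs. (4) and (5) reduce to a few lines; a short sketch of ours with made-up numbers:

import numpy as np

def accuracy_stats(y_true, y_pred):
    """Average absolute percentage error, Eq. (4), and its standard deviation, Eq. (5)."""
    e_i = np.abs((y_true - y_pred) / y_true) * 100.0
    return e_i.mean(), e_i.std()             # np.std uses the 1/M normalization of Eq. (5)

y_true = np.array([0.90, 0.85, 0.95, 0.80])  # actual yield rates (illustrative)
y_pred = np.array([0.88, 0.90, 0.93, 0.78])  # predicted yield rates (illustrative)
e_bar, sigma_e = accuracy_stats(y_true, y_pred)
print(e_bar, sigma_e)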
For each of the five cases, after the ANN is trained by the corresponding training data set, we test the trained ANN using the corresponding testing data set. The prediction accuracy ē of the trained ANN in all five cases is presented in Table 1, and the associated standard deviation σ_e is reported in Table 2. From Table 1, we see that ē = 4.72 for case 5, and as the size of the training data set increases, the prediction accuracy increases. From Table 2, we see that σ_e = 4.73 for case 5, and as the size of the training data set increases, σ_e decreases, which implies that the prediction accuracy becomes more stable. Therefore, from the results presented in Tables 1 and 2, we see that a larger training data set enhances the prediction accuracy of the ANN, before exceeding the size that causes overtraining. To give more insight into the prediction results of the trained ANN, a histogram of the number of tested input-output pairs with respect to the percentage of the prediction error, defined as ( y(x_i) − ŷ(x_i | ω, β) ) / y(x_i) × 100%, is presented in Figure 2.
Table 1. Prediction accuracy of the trained ANN for the five cases

case       1       2       3       4       5
Size, M    50      150     400     700     1000
ē          45.30   17.79   10.87   6.29    4.72



Table 2. Standard deviation of the prediction accuracy of the trained ANN for the five cases

case       1       2       3       4       5
Size, M    50      150     400     700     1000
σ_e        36.00   15.13   9.08    5.55    4.73

[Figure 2 plots the number of input-output pairs against the percentage of prediction error, from -30% to 30%.]

Figure 2. Histogram of the prediction error of case 5.

From Figure 2, we see that most of the tested input-output pairs of data have very small prediction errors, which confirms that the proposed ANN can serve as a good yield rate predictor for the DRAM manufacturing process.

3. Conclusion

In this paper, a three-layer feed-forward and back propagation ANN is presented and is
used to serve as a predictor for the yield rate of a DRAM manufacturing process. The
proposed ANN is trained and tested using real manufacturing data. The test results
reveal that the prediction errors are very small, and as the size of the training data set
increases, the prediction accuracy of the ANN increases and the associated standard
deviation decreases. Therefore, the presented ANN is qualified to serve as a yield rate
predictor for the future purpose of fault detection.

Acknowledgments

This research work is supported in part by Chang Gung Memorial Hospital under grant
BMRP29.

References

[1] K. Chandrasekar, S. Goossens, C. Weis, M. Koedam, B. Akesson, N. Wehn and K. Goossens, Exploiting
expendable process-margins in DRAMs for run-time performance optimization, Design, Automation &
Test in Europe Conference & Exhibition, 2014, 1-6.
[2] P. S. Huang, M. Y. Tsai, C. Y. Huang, P. C. Lin, L. Huang, M. Chang, S. Shih and J. P. Lin, Warpage,
stresses and KOZ of 3D TSV DRAM package during manufacturing process, 14th International
Conference on Electronic Materials and Packaging, 2012, 1-5.
[3] S. Hamdioui, M. Taouil and N. Z. Haron, Testing open defects in memristor-based memories, IEEE
Trans. on Computers, 64(2015), 247-259.
[4] R. Guldi, J. Watts, S. PapaRao, D. Catlett, J. Montgomery and T. Saeki, Analysis and modeling of
systematic and defect related yield issues during early development of a new technology, Advanced
Semiconductor Manufacturing Conference and Workshop, 4(1998), 7-12.
[5] L. Shen and B. F. Cockburn, An optimal march test for locating faults in DRAMs, Records of the 1993
IEEE International Workshop on Memory Testing, 1993, 61-66.
[6] A. Purwar and S. K. Singh, Issues in data mining: a comprehensive survey, IEEE International
Conference on Computational Intelligence and Computing Research, 2014, 1-6.
[7] J. Han and M. Kamber. Data mining concepts and techniques. 2nd ed. Morgan Kaufmann Publishers,
2006.
[8] B. Dengiz, C. Alabas-Uslu and O. Dengiz, Optimization of manufacturing systems using a neural
network metamodel with a new training approach, Journal of the Operational Research Society,
60(2009), 1191-1197.
[9] N. Alali, M. R. Pishvaie and V. Taghikhani, Neural network meta-modeling of steam assisted gravity
drainage oil recovery processes, Journal of Chemistry & Chemical Engineering, 29(2010), 109-122.
[10] T. Chen, H. Chen and R. Liu, Approximation capability in C(Rn) by multilayer feed-forward networks
and related problems, IEEE Transactions on Neural Networks, 6(1995), 25-30.
[11] J. A. Anderson. An introduction to neural network. MIT Press, Boston, USA, 1995.
[12] B. M. Wilamowski and H. Yu, Improved computation for Levenberg-Marquardt training, IEEE Trans.
On Neural Network, 21(2010), 930-937.
Fuzzy Systems and Data Mining II 179
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-179

Mining Probabilistic Frequent Itemsets with Exact Methods
Hai-Feng LI 1 and Yue WANG
School of Information, Central University of Finance and Economics,
Beijing, China, 100081

Abstract. Probabilistic frequent itemset mining over uncertain databases is a challenging problem. The state-of-the-art algorithm uses O(n log² n) time complexity to conduct the mining. We focus on this problem and design a framework which can discover the probabilistic frequent itemsets with traditional exact frequent itemset mining methods; thus, the time complexity can be reduced to O(n). In this framework, we supply a minimum confidence to convert the uncertain database to an exact database; furthermore, a sampling method is used to find a reasonable minimum confidence so that the accuracy is guaranteed. Our experiments show that our method can significantly outperform the existing algorithm.
Keywords. Uncertain Database; Exact Database; Probabilistic Frequent Itemset
Mining; Exact Frequent Itemset Mining; Data Mining

Introduction

Frequent itemset mining is one of the important techniques in data mining, which discovers patterns from databases to support commercial decisions. Recently, new applications have been developed in web sites, the Internet and wireless networks, which generate much uncertain data; that is, each data item is attached with a probability that indicates its existence [3]. Table 1 shows an example of an uncertain database with 4 items {a, b, c, d}. In such cases, the traditional exact frequent itemset mining algorithms studied in recent years [1] are no longer effective, since the new feature brings new challenges; thus, new methods need to be designed to handle this data environment. The existing uncertain frequent itemset mining methods can be split into two categories. One is based on the expected support [2], and the other discovers the probabilistic frequent itemsets according to the definition of probabilistic support [4]. The probabilistic frequent itemsets, in comparison to the expected frequent itemsets, can better represent the probability of the itemsets; thus, this mining problem has received more attention. Nevertheless, the mining is hard, since converting the uncertain database to exact databases is an NP-hard problem. Computing the probabilistic support of an itemset takes O(n log² n) time and O(n) space. Clearly, when the database size n is large, the mining cost is huge.

1 Corresponding Author: Hai-Feng Li, School of Information, Central University of Finance and Economics,

Beijing, China, 100081; E-mail:mydlhf@cufe.edu.cn.



Table 1. An Example of Uncertain Database


ID Uncertain Transaction
1 a:0.6 b:0.4 d:1
2 a:0.8 c:0.6
3 b:1 c:0.9
4 a:0.8 b:0.8 c:0.6 d:0.6
5 d:0.7

In this paper, we focus on this mining problem and present an approximate method to convert the uncertain database to an exact database so that the runtime can be reduced. The rest of the paper is organized as follows. Section 1 presents the preliminaries and the challenge of the problem. Section 2 introduces our method. Section 3 evaluates the performance with our experimental results. Finally, Section 4 concludes the paper.

1. Preliminaries and Problem Definition

1.1. Preliminaries

Given a set of distinct items Γ = {i1, i2, · · · , in}, we use |Γ| = n to denote the size of Γ. A subset X ⊆ Γ is an itemset. Suppose each item xt (0 < t ≤ |X|) in X is annotated with an existence probability p(xt); we call X an uncertain itemset, denoted as X = {x1, p(x1); x2, p(x2); · · · ; x|X|, p(x|X|)}, and the probability of X is p(X) = Π_{i=1}^{|X|} p(xi). An uncertain transaction UT is an uncertain itemset with an ID. An uncertain database UD is a collection of uncertain transactions UTs (0 < s ≤ |UD|). If X ∈ UTs, we use p(X, UTs) to denote the probability that X occurs in UTs. As a result, in UD, X occurs exactly t times with probability p_t(X, UD) = Σ_{|S|=t} Π_{UTs ∈ S} p(X, UTs) · Π_{UTs ∉ S} (1 − p(X, UTs)), where S ranges over sets of t uncertain transactions. The list {p_1(X, UD), p_2(X, UD), · · · , p_|UD|(X, UD)} is the probability density function. Given an itemset X, the number of times it occurs in an uncertain database is called the support of X, denoted Λ(X). Consequently, we use P(Λ(X) ≥ i) to denote the probability that X occurs at least i times, which is the sum of {p_i(X, UD), p_{i+1}(X, UD), · · · , p_|UD|(X, UD)}.

1.2. Problem Definition

Probabilistic Frequent Itemset [9]: Given a minimum support λ, a minimum probabilistic confidence τ and an uncertain database UD, an itemset X is a probabilistic frequent itemset iff the probabilistic support Λ^P_τ(X) ≥ λ, in which Λ^P_τ(X) is the maximal support of itemset X that has probabilistic confidence τ, that is,

Λ^P_τ(X) = Max{ i | P(Λ(X) ≥ i) > τ }        (1)

In this paper, we will discover all the probabilistic frequent itemsets from the uncertain databases for the given λ and minimum probabilistic confidence τ.
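For intuition, the probability mass function p_t(X, UD) and the probabilistic support of Eq. (1) can be computed with the classic O(n²) dynamic programming recurrence over the per-transaction probabilities; the sketch below is ours and illustrates the DP approach discussed later in this section, not the faster divide-and-conquer method.

def support_pmf(probs):
    """pmf[t] = probability that the itemset occurs in exactly t transactions,
    assuming independent per-transaction occurrence probabilities."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for t, q in enumerate(pmf):
            nxt[t] += q * (1.0 - p)       # itemset absent from this transaction
            nxt[t + 1] += q * p           # itemset present in this transaction
        pmf = nxt
    return pmf

def probabilistic_support(probs, tau):
    """Largest i with P(Lambda(X) >= i) > tau, as in Eq. (1)."""
    pmf = support_pmf(probs)
    tail, best = 0.0, 0
    for i in range(len(pmf) - 1, 0, -1):  # accumulate the tail probability from the top
        tail += pmf[i]
        if tail > tau:
            best = i
            break
    return best

# Item a of Table 1 occurs in transactions 1, 2 and 4 with probabilities 0.6, 0.8 and 0.8.
print(probabilistic_support([0.6, 0.8, 0.8], tau=0.5))   # 2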

Table 2. The Impact of Minimum Probabilistic Confidence τ

Dataset                                     Minimum Probabilistic Confidence
                                            0.009    0.09    0.9     0.1     0.01    0.001
GAZELLE (λ=0.08)     Runtime Cost (Sec)     34.5     35.4    34.8    35.2    35.6    34.7
                     Memory Cost (MB)       66.8     67.4    67.4    67.1    67.4    66.9
T25I15D320K (λ=0.1)  Runtime Cost (Sec)     1979     1931    1882    1894    1873    1989
                     Memory Cost (MB)       1275     1275    1275    1275    1275    1275

To address this problem, much research has been conducted. Zhang et al. first introduced the concept of probabilistic frequent items [4] and employed the dynamic programming (DP) technique to perform the mining, which was improved by Bernecker et al. [5] using the a priori rule for further pruning. With this method, the time complexity is O(n²) and the space complexity is O(n). Sun et al. improved the method by regarding the probability computation as the convolution of two vectors and used a divide-and-conquer method (DC) [6] to conduct the mining, in which the fast Fourier transform reduces the computing complexity from O(n²) to O(n log² n). The probabilistic frequent itemset and the expected frequent itemset were proved to be related in [7] based on the standard normal distribution. Tong et al. surveyed all the methods in [8].

2. Probabilistic Frequent Itemset Mining Method

As can be seen, the most efficient method for computing the probabilistic support still has a significantly high cost, which reduces the effectiveness of the mining method in real applications. We develop a novel method which does not try to improve the mining algorithm itself but designs a framework to mine probabilistic frequent itemsets with traditional exact frequent itemset mining methods. In this framework, we build a relationship between uncertain data and exact data with a supplied parameter, called the minimum confidence ε.
With the minimum confidence ε, we can convert the uncertain database to an exact one as follows. We scan the uncertain database; if the probability of an item is smaller than ε, we consider it as not existing in the exact database, otherwise as existing. The reasoning behind this is intuitive: an item with a small probability contributes little to the probability of a high number of occurrences. Once an exact database is generated, we can employ a traditional frequent itemset mining algorithm to discover the results. The pseudocode is shown in Algorithm 1. As an example, when we set ε = 0.5, the uncertain database in Table 1 is converted to the database in Table 3, in which all items with probability smaller than 0.5 are removed directly.
In this paper, we ignore τ for two reasons. On the one hand, in [8] Tong et al. showed experimentally that τ has little impact on the mining results; we also conducted experiments with the state-of-the-art algorithm TODIS, whose results are shown in Table 2. As can be seen, when we fix the minimum support, the runtime cost and the memory cost remain almost unchanged no matter how τ changes. On the other hand, we employ a novel framework to convert the uncertain database to an exact one, over which the traditional mining methods can be used, and thus τ is not needed and can be ignored accordingly.

Table 3. The Database Converted from the Uncertain Database when ε = 0.5
ID Transaction
1 ad
2 ac
3 bc
4 abcd
5 d

Algorithm 1 Probabilistic Frequent Itemset Mining Method
Require: UD: an initial uncertain database;
  D: the converted exact database;
  T: a transaction in D;
  ε: minimum confidence;
  λ: minimum support;
1: for each uncertain transaction UT_i in UD do
2:   for each uncertain item UI in UT_i do
3:     if UI.prob ≥ ε then
4:       add UI to transaction T_i;
5:   add T_i to D;
6: perform an exact frequent itemset mining algorithm with λ;
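A direct Python reading of Algorithm 1 (our sketch; the minimum confidence symbol is written as eps, and the final mining step of line 6 is left to any exact frequent itemset miner):

def convert(uncertain_db, eps):
    """Lines 1-5 of Algorithm 1: keep an item only when its probability is at least eps."""
    exact_db = []
    for ut in uncertain_db:                              # ut maps item -> probability
        exact_db.append([item for item, prob in ut.items() if prob >= eps])
    return exact_db

# The uncertain database of Table 1; with eps = 0.5 the result matches Table 3.
UD = [{'a': 0.6, 'b': 0.4, 'd': 1.0},
      {'a': 0.8, 'c': 0.6},
      {'b': 1.0, 'c': 0.9},
      {'a': 0.8, 'b': 0.8, 'c': 0.6, 'd': 0.6},
      {'d': 0.7}]
for t in convert(UD, 0.5):
    print(sorted(t))
# Line 6 of Algorithm 1: run an exact frequent itemset miner over the converted database with lambda.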

Analysis: Our method is linear in the database size n; that is, the conversion from uncertain data to exact data needs only O(n) time. A further advantage is that the conversion can be done while reading the data into memory, so its cost can almost be ignored. On the other hand, since the final mining is performed over the exact database, the mining speed is much improved; in comparison to mining the uncertain database directly, the time complexity is reduced to at least O(n) per itemset. Suppose the number of itemsets that need to be computed is m; then our method has time complexity O(mn), while the most effective method that directly discovers the probabilistic frequent itemsets needs O(mn log² n). Clearly, when the database size n is large, the mining speed is improved significantly.
Even though the performance is improved, the mining results are approximate. The minimum confidence ε is the key parameter that determines how approximate the mining results are; consequently, how to decide ε is the main remaining problem. Table 4 shows the precision and recall of our method when the minimum support is set to 0.1, 0.08 and 0.06 and the minimum confidence is set to 0.9, 0.8 and 0.7. As can be seen, the precision and the recall reach their highest values at a particular minimum confidence. That is to say, if we can find this particular minimum confidence, the accuracy will be high.
To address this problem, we employ a sampling method to find this parameter. Before converting the uncertain database, we take samples from the database as a sub-database, which is converted first and used with our method to determine the minimum confidence; the mining is then conducted over the entire database.

Table 4. Precision and Recall

Data          Minsup   Precision (min. conf. 0.9 / 0.8 / 0.7)   Recall (min. conf. 0.9 / 0.8 / 0.7)
GAZELLE       0.1      100% / 100% / 100%                       100% / 75% / 75%
              0.08     80% / 100% / 100%                        100% / 100% / 100%
              0.06     85% / 100% / 100%                        100% / 70% / 70%
T25I15D320K   0.1      11% / 88% / 100%                         100% / 72% / 56%
              0.08     25% / 95% / 100%                         100% / 74% / 64%
              0.06     24% / 98% / 100%                         100% / 81% / 70%

Table 5. Uncertain DataSet Characteristics

uncertain data data size avg. size min. size max. size item count mean variance item corr.
T25I15D320K 320,002 26 1 67 994 0.87 0.27 38
GAZELLE 59,602 3 2 268 497 0.94 0.08 166

(a) GAZELLE (b) T25I15D320K

Figure 1. Running Time Cost for Minimum Confidence

3. Experiments

We evaluate the performance of our framework in comparison to the state-of-the-art algorithm UMiner [6]. The minimum confidence is the main parameter in our evaluation. The method was implemented in Python 2.7 running on Microsoft Windows 7. The experimental computer has a 3.60GHz Intel Core i7-4790M CPU and 12GB memory. We employed 2 datasets for the evaluation: one is created with the IBM data generator and the other is a real-life dataset. The data characteristics are presented in Table 5. Given the item count u and the average transaction size v, we measure the approximate correlation among transactions with u/v.
We present the runtime cost of our method in comparison to the UMiner algorithm. As can be seen in Figure 1 (minsup = {0.1, 0.08, 0.06}), with the minimum confidence set from 0.1 to 0.9, the runtime cost was significantly lower than that of UMiner. The smaller the minimum confidence, the higher the mining cost; even with the smallest minimum confidence 0.1, and thus the highest mining cost, our method achieved a speedup of about one hundred over the GAZELLE dataset, and was about 30 times faster over the T25I15D320K dataset. This showed that our method was more efficient over the sparse dataset.

(a) GAZELLE (b) T25I15D320K

Figure 2. Memory Cost for Minimum Confidence

Moreover, we present the memory cost of our method. In Figure 2, the memory cost was not impacted by the minimum support. When the minimum confidence was small, the memory usage was high, but it was still smaller than that of the UMiner algorithm.

4. Conclusions

We focused on the probabilistic frequent itemset mining problem over uncertain databases and proposed a novel method. We did not directly improve the current mining algorithms but converted the uncertain databases to exact ones, in which a sampling method was used to find a reasonable parameter to guarantee the accuracy. With such a method, many traditional efficient algorithms over exact databases can be employed directly for probabilistic frequent itemset mining. Our experiments showed that our method is efficient.

Acknowledgement

This research is supported by the National Natural Science Foundation of China(61100112,


61309030), Beijing Higher Education Young Elite Teacher Project(YETP0987), Disci-
pline Construction Foundation of Central University of Finance and Economics, Key
project of National Social Science Foundation of China(13AXW010).

References

[1] J.Han, H.Cheng, D.Xin, and X.Yan, Frequent pattern mining: current status and future directions, Data
Mining and Knowledge Discovery,Vol.15(2007),55-86
[2] C.K.Chui, B.Kao, and E.Hung, Mining Frequent Itemsets from Uncertain Data, Proceedings of
PAKDD’2007
[3] C.C.Aggarwal, and P.S.Yu. A survey of uncertain data algorithms and applications. Transaction of
Knowledge and Data Mining, Vol.21(2009), 609-623
[4] Q.Zhang, F.Li, and K.Yi, Finding Frequent Items in Probabilistic Data, Proceedings of SIGMOD’2008

[5] T.Bernecker, H.P.Kriegel, M.Renz, F.Verhein, and A.Zuefle, Probabilistic Frequent Itemset Mining in
Uncertain Databases, Proceedings of SIGKDD’2009
[6] L.Sun, R.Cheng, D.W.Cheung, and J.Cheng, Mining Uncertain Data with Probabilistic Guarantees,
Proceedings of KDD’2010
[7] T.Calders, C.Garboni, and B.Goethals. Approximation of Frequentness Probability of Itemsets in Uncer-
tain Data, Proceedings of ICDM’2010
[8] Y.Tong, L.Chen, Y.Cheng, and P.S.Yu. Mining Frequent Itemsets over Uncertain Databases, Proceed-
ings of VLDB’2012
[9] P.Tang, and E.A.Peterson. Mining Probabilistic Frequent Closed itemsets in Uncertain Databases, Pro-
ceedings of ACMSE’2011
186 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-186

Performance Degradation Analysis Method Using Satellite Telemetry Big Data
Feng ZHOU a, De-Chang PI a,1, Xu KANG a and Hua-Dong TIAN b
a College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, China
b China Academy of Space Technology, Beijing, China

Abstract. Satellites have features of high control integration, various working modes, and complex telemetry big data, which make it difficult to evaluate their performance degradation. In this paper, a novel data mining analysis method is proposed to analyze a satellite's telemetry big data, in which sample entropy is calculated to characterize states and the support vector data description is utilized to analyze the satellite performance degradation process. The experimental results show that our proposed method can generally describe the performance degradation process of satellites. Meanwhile, it also provides an important approach for the ground station monitor to analyze the performance of satellites.

Keywords. performance degradation, telemetry big data, sample entropy, support vector data description

Introduction

With more and more satellites being sent into space in recent years, ground in-orbit
management has to handle challenges such as a satellite's high control precision, various
working modes, and high complexity. As advanced technologies and new materials
are utilized in satellites [1, 2], sudden failure is no longer the primary failure mode for
most satellites; it has been replaced by performance degradation. The theory of
analyzing satellite performance degradation focuses only on the overall performance of
equipment, regardless of failure modes, which is different from analyzing sudden failures.
In 2001, the University of Wisconsin and the University of Michigan, together
with 40 other industry partners, united to establish the Intelligent Maintenance
Systems (IMS) research center under the U.S. National Science Foundation. Since then,
many performance degradation assessment methods have been proposed, such as the
pattern discrimination model (PDM) based on a cerebellar model articulation controller
(CMAC) neural network [3], self-organizing map (SOM) and back-propagation neural
network methods [4], and the hidden Markov model (HMM) and hidden semi-Markov model
(HSMM) [5]. However, these methods are deficient in some respects. For example,
the results of the CMAC assessment method are greatly influenced by parameter settings,

1
Corresponding Author: De-Chang PI, College of Computer Science and Technology, Nanjing Univer-
sity of Aeronautics and Astronautics, 29 Yudao Street, Nanjing, Jiangsu, 210016, China. E-mail:
dc.pi@nuaa.edu.cn.

and the assessment results of the SOM, neural network and HMM methods cannot directly
reflect the degree of degradation. To accommodate the assessment characteristics of
different key components, the analysis theory of performance degradation has developed
from a single degradation variable toward more diverse and practical directions. Although
some new theories and methods have emerged, research on satellite performance degradation
is still limited. Tafazoli [6] studied in-orbit failures of more than 130 different
spacecraft and revealed that spacecraft are vulnerable to failures occurring in key
components. Ma et al. [7] analyzed the space radiation environment of thermal coatings and
proposed degradation models for their optical properties. However, these approaches
mainly focus on failure data and also require relevant domain experience.
The conventional analysis methods for satellite performance degradation have
shortcomings such as experimental difficulty and high cost. Satellite telemetry big data
contain monitoring information, abnormal states, the space environment and other
information, which reflect the operational status and payload of satellites. A novel
analysis method for satellite performance degradation based on telemetry big data is
proposed in this paper. The method uses data mining techniques and provides a
quantitative description of the satellite performance degradation process.
Most recently presented performance degradation methods are based on physical
rules or models [8, 9]; these methods require an understanding of the internal structure
of the satellite, which is difficult for an analyst. In contrast, our proposed method uses
the data sampled during satellite operation to analyze the performance degradation
process without needing to determine the relationships between pieces of equipment
precisely. Moreover, it studies the characteristics of historical data, summarizes the
regularities of change, and analyzes the performance degradation process automatically.
To the best of our knowledge, a similar approach to satellite performance degradation
has not appeared before. Furthermore, the method can also be extended to failure prediction.

1. Related Concepts

1.1. Sample Entropy

The sample entropy (SamEn) [10] is an improved version of the approximate entropy
(ApEn) algorithm proposed by Pincus [11]. The improved algorithm is able to quantify the
complexity of a nonlinear time series.
For a data series $X_N = \{x(1), x(2), \ldots, x(N)\}$, where $N$ is the length of the series, two
parameters are defined: $m$ is the embedded dimension of the vectors to be formed and
$r$ is the threshold that serves as a noise filter. The steps to calculate SamEn are as follows:
1) $N-m+1$ patterns (vectors) are generated, and each pattern has $m$ dimensions.
The patterns are represented as:

$X_m(i) = [x(i), x(i+1), \ldots, x(i+m-1)], \quad i = 1, \ldots, N-m+1$   (1)

2) The distance $d[X_m(i), X_m(j)]$ between each pair of patterns is computed using Eq. (2):

$d[X_m(i), X_m(j)] = \max_{k=0,\ldots,m-1} |x(i+k) - x(j+k)|, \quad j = 1, \ldots, N-m+1, \; j \neq i$   (2)

3) For each pattern $X_m(i)$, $C_r^m(i) = N^m(i)/(N-m)$ is the probability that other patterns
$X_m(j)$ match pattern $X_m(i)$, where the number of matching patterns $N^m(i)$ is the number of
patterns satisfying the condition $d[X_m(i), X_m(j)] \le r$. The matching probability of two
sequences with $m$ points is then obtained using Eq. (3):

$\Phi^m(r) = \dfrac{1}{N-m+1} \sum_{i=1}^{N-m+1} C_r^m(i)$   (3)

4) When the dimension is expanded to $m+1$, steps 1–3 are repeated to obtain $\Phi^{m+1}(r)$.
The theoretical value of the SamEn is defined as follows:

$SamEn(m, r) = \lim_{N \to \infty} \left\{ -\ln\left[ \Phi^{m+1}(r) / \Phi^m(r) \right] \right\}$   (4)

Experiments conducted by Pincus [8] indicate that a reasonable statistical characterization
can be achieved when $m = 2$ and $r = (0.1 \sim 0.25) \cdot std(X)$, where $std(X)$ denotes the standard
deviation of $X = \{x(1), x(2), \ldots, x(N)\}$.
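To make steps 1)–4) concrete, the following is a minimal Python sketch of the SamEn computation. The function name, the vectorized match counting and the toy series are our own illustrative choices and not part of the paper.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D series, following Eqs. (1)-(4).

    m        : embedding dimension (the paper uses m = 2)
    r_factor : tolerance as a fraction of std(x) (the paper suggests 0.1-0.25)
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def match_ratio(dim):
        # Build the N - dim + 1 overlapping patterns of length `dim` (Eq. 1).
        patterns = np.array([x[i:i + dim] for i in range(n - dim + 1)])
        count, total = 0, 0
        for i in range(len(patterns)):
            # Chebyshev distance from pattern i to every other pattern (Eq. 2).
            d = np.max(np.abs(patterns - patterns[i]), axis=1)
            d[i] = np.inf                       # exclude self-matches (j != i)
            count += np.sum(d <= r)
            total += len(patterns) - 1
        return count / total                    # average matching probability (Eq. 3)

    return -np.log(match_ratio(m + 1) / match_ratio(m))   # Eq. (4)

# Example: a noisy sine wave is more regular (lower SamEn) than pure noise.
t = np.linspace(0, 10, 500)
print(sample_entropy(np.sin(t) + 0.1 * np.random.randn(500)))
print(sample_entropy(np.random.randn(500)))
```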

1.2. Support Vector Data Description

Support Vector Data Description (SVDD) [12] is inspired by the Support Vector Classifier.
The method is robust against outliers in the training set and is capable of tightening the
description by using negative examples.
Let the training samples of the target class be $X = \{x_1, x_2, \ldots, x_n\}$. A hypersphere that
contains all or most of the samples is sought; it is characterized by its center $a$ and radius $R$.
If the hypersphere covers all training samples of the target class, the empirical error equals
zero, and the structural error is defined as $\varepsilon(a, R) = R^2$.
Since the distance from $x_i$ to the center $a$ should not be larger than the radius $R$ for all
samples of the target class $X$, the constraint of the minimization problem can be described
as $\|x_i - a\|^2 \le R^2$.
To account for the possibility of outliers in the training set, the distance between $x_i$
and the center $a$ is not required to be strictly smaller than $R$, but larger distances should be
penalized. Therefore, slack variables $\xi_i$ are introduced, and the minimization problem is
transformed into

$\min \; \varepsilon(R, a, \xi) = R^2 + C\sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \; \xi_i \ge 0, \; i = 1, 2, \ldots, N$   (5)

The penalty factor $C$ makes a trade-off between the volume and the errors. The minimization
problem in Eq. (5) can be solved through the Lagrangian in Eq. (6):

$L(R, a, \alpha_i, \xi_i) = R^2 + C\sum_i \xi_i - \sum_i \alpha_i \{R^2 + \xi_i - (\|x_i\|^2 - 2a \cdot x_i + \|a\|^2)\} - \sum_i \gamma_i \xi_i, \quad \alpha_i \ge 0, \; \gamma_i \ge 0$   (6)

In Eq. (6), $\alpha_i$ and $\gamma_i$ are the Lagrange multipliers. $L$ should be minimized with respect to
$R$, $a$ and $\xi_i$, and maximized with respect to $\alpha_i$ and $\gamma_i$. Setting the corresponding partial
derivatives to zero gives the constraints in Eq. (7):

$\sum_i \alpha_i = 1, \qquad a = \dfrac{\sum_i \alpha_i x_i}{\sum_i \alpha_i} = \sum_i \alpha_i x_i, \qquad C - \alpha_i - \gamma_i = 0$   (7)

Substituting (7) into (6), we obtain $\max L$:

$\max L = \sum_i \alpha_i (x_i \cdot x_i) - \sum_{i,j} \alpha_i \alpha_j (x_i \cdot x_j)$   (8)

Thus, the optimization problem can be further transformed into Eq. (9):

$\max L = 1 - \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j; \sigma) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad K_G(x, y; \sigma) = \exp\left(-\|x - y\|^2 / \sigma^2\right)$   (9)

Eq. (7) shows that the center of the hypersphere is a linear combination of the objects.
Only objects $x_i$ with $\alpha_i > 0$ are needed in the description; these objects are therefore called
the support vectors (SVs) of the description. To test an object $z$, the distance to the center
of the hypersphere and the radius $R$ are calculated by Eq. (10):

$d^2 = \|z - a\|^2 = K_G(z, z) - 2\sum_i \alpha_i K_G(z, x_i) + \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j)$
$R^2 = \|x_{sv} - a\|^2 = 1 - 2\sum_i \alpha_i K_G(x_i, x_{sv}) + \sum_{i,j} \alpha_i \alpha_j K_G(x_i, x_j)$   (10)

The test object $z$ is accepted when this distance is not greater than the radius, i.e. $d \le R$.
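As a concrete illustration, the sketch below solves the dual problem of Eq. (9) numerically and evaluates Eq. (10). The Gaussian kernel (for which $K_G(x, x) = 1$), the SLSQP solver and all function and variable names are our own assumptions for illustration; they are not the paper's implementation (the authors report a Java implementation).

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(A, B, sigma=1.0):
    """K_G(x, y) = exp(-||x - y||^2 / sigma^2)."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / sigma**2)

def svdd_fit(X, C=0.1, sigma=1.0):
    """Solve the dual of Eq. (9): max 1 - alpha^T K alpha, sum(alpha) = 1, 0 <= alpha <= C."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    res = minimize(lambda a: a @ K @ a,            # minimizing alpha^T K alpha maximizes L
                   x0=np.full(n, 1.0 / n),
                   jac=lambda a: 2 * K @ a,
                   bounds=[(0.0, C)] * n,
                   constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1.0}],
                   method="SLSQP")
    alpha = res.x
    # Radius from a boundary support vector (0 < alpha_i < C), Eq. (10).
    sv_idx = np.where((alpha > 1e-6) & (alpha < C - 1e-6))[0]
    sv = sv_idx[0] if len(sv_idx) else int(np.argmax(alpha))
    R2 = 1.0 - 2.0 * alpha @ K[:, sv] + alpha @ K @ alpha
    return {"X": X, "alpha": alpha, "R2": R2, "sigma": sigma}

def svdd_distance2(model, Z):
    """Squared distance of test objects to the hypersphere center, Eq. (10)."""
    Kz = gaussian_kernel(Z, model["X"], model["sigma"])
    KX = gaussian_kernel(model["X"], model["X"], model["sigma"])
    return 1.0 - 2.0 * Kz @ model["alpha"] + model["alpha"] @ KX @ model["alpha"]

# Toy usage: train on "healthy" eigenvectors, then test whether a new one is accepted.
healthy = np.random.randn(40, 7) * 0.1
model = svdd_fit(healthy, C=0.1, sigma=1.0)
print(svdd_distance2(model, np.zeros((1, 7))) <= model["R2"])
```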

2. Method to Analyze the Performance Degradation of Satellite

2.1. Definition Description

Definition 1 (Performance Eigenvector)



The SamEn of a time period is taken as its performance feature, and the vector
composed of the performance features of the parameters within the same time period is
called the performance eigenvector.
In this study, the parameters are not limited to those of the objective equipment;
they also include a number of parameters of closely related equipment. As parameter
choice depends on specialized knowledge, the selection is conducted based on domain
and expert knowledge.
Definition 2 (Health Model)
With the SVDD method, the model obtained by training on the performance eigenvectors
of the satellite in the healthy status is called the health model (model).
According to the theory of SVDD, the model described in Definition 2 is composed
of the support vectors of the healthy-state vectors (model.SV), the corresponding coefficients
(model.α), the number of support vectors (model.len), the center of the hypersphere
(model.a) and the radius (model.R).
Definition 3 (Performance Degradation Degree)
Here, dec denotes the distance between the performance eigenvector of the satellite
and the center of the hypersphere. The performance degradation degree deg, which reflects
the "health condition" [13], is defined as the difference between dec and the radius of the
hypersphere model.R, that is, deg = dec - model.R (see Figure 1).
A performance degradation process of the objective equipment may occur when the value
of deg is larger than 0; when the value increases monotonically, the speed of the degradation
process of the objective equipment increases accordingly. As the degree cannot be negative,
deg is set to 0 when dec - model.R < 0.
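A direct reading of Definition 3 in code (a small sketch; the function name is our own):

```python
import numpy as np

def degradation_degree(distances, radius):
    """Definition 3: deg = dec - model.R, truncated at zero since the degree cannot be negative."""
    return np.maximum(np.asarray(distances, dtype=float) - radius, 0.0)

# Example: distances of three eigenvectors to the hypersphere center with model.R = 1.0
print(degradation_degree([0.8, 1.1, 1.4], radius=1.0))   # -> [0.  0.1 0.4]
```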

a) Performance states and eigenvectors b) Performance degradation degree


Figure 1. Principle of the performance degradation degree
Figure 1 illustrates the principle of the performance degradation degree. However, because
the operating modes of a satellite are complex and the training sets for the healthy status of
each operating mode are limited, the model cannot contain all health-status features of the
satellite. Hence, even when deg is positive, the satellite may still be in a healthy status
under an operating mode not covered by the model.

2.2. Framework Description

Figure 2 shows the overall framework of the satellite performance degradation analysis
presented in this study, which has four main steps.
Step 1. Select parameters of the satellite according to expert knowledge. Then, a median
filter is used to reduce the noise in the satellite telemetry big data so as to generate a new,
clean dataset.
Step 2. Extract the performance features from the parameters selected in Step 1
according to Definition 1, and compose the final set of performance eigenvectors.
Step 3. Select the performance eigenvectors in the healthy status as the training set,
and build a health model with the SVDD method.
Step 4. To measure the degradation status of a new performance eigenvector, calculate
the performance degradation degree according to Definition 3 and the model obtained in
Step 3.
[Flow chart: satellite telemetry data; parameter selection (expert knowledge); telemetry data processing (median filter); sample entropy extraction of local performance eigenvectors; eigenvectors in healthy states and eigenvectors for analysis; support vector data description; health model; performance degradation degree.]
Figure 2. Framework of the satellite performance degradation analysis

3. Experimental Results and Analysis

The telemetry big data of one satellite, recorded from 2011-05-01 00:00:00.0 to
2011-12-29 18:16:59.987 and comprising 14 million data frames that contain several failures
and performance degradation information, are used as the experimental data. In our
experiments, seven important parameters in this dataset are selected by expert knowledge.
The telemetry big data are stored in Oracle 11g, and the algorithms are coded in Java.
The operating system is Windows Server 2008 R2 Standard, running on an Intel(R)
Xeon(R) eight-core E5606 processor with 8 GB of RAM.

3.1. Telemetry Big Data Processing

The experimental dataset is processed in the following steps:
(1) The outliers caused by decoding or other errors are removed according to the valid
ranges of the seven parameters, and a median filter over every 30 s is then used to reduce
the noise in the dataset, yielding a new, clean dataset.
(2) The values of each parameter's time series are normalized into the range [-1, 1], and
each time series is equally divided into 800 groups. The performance feature of each group
is extracted according to Definition 1. Finally, seven performance feature sequences of
length 800 are obtained, and the performance eigenvector is composed of the features of the
seven parameters in groups with the same number.
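The two processing steps can be sketched with pandas as follows. The file and column names, the valid ranges and the use of a 30-second resampled median as the "median filter" are assumptions for illustration; the commented last line reuses the sample_entropy sketch from Section 1.1.

```python
import numpy as np
import pandas as pd

def preprocess(frames, limits, group_count=800, window="30s"):
    """Step (1)-(2): range-based outlier removal, 30 s median filtering,
    [-1, 1] normalization and division into equally sized groups.

    frames : DataFrame indexed by timestamp, one column per telemetry parameter
    limits : dict mapping parameter name -> (lower, upper) valid range
    """
    clean = frames.copy()
    for col, (lo, hi) in limits.items():
        clean.loc[(clean[col] < lo) | (clean[col] > hi), col] = np.nan   # drop decoding outliers
    clean = clean.resample(window).median().interpolate()                # 30 s median filter

    # Normalize each parameter into [-1, 1].
    norm = 2 * (clean - clean.min()) / (clean.max() - clean.min()) - 1

    # Split every series into `group_count` equal groups of rows.
    return np.array_split(norm, group_count)

# Hypothetical usage with seven parameters p1..p7:
# groups = preprocess(telemetry_df, {f"p{i}": (-50, 50) for i in range(1, 8)})
# eigenvectors = np.array([[sample_entropy(g[c].values) for c in g.columns] for g in groups])
```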

3.2. Modeling and Degradation Analysis

(1) Performance eigenvectors under the healthy status are selected as the training data,
the SVDD method is applied with σ = 1 in this experiment, and the health model of the
satellite is established.
(2) The remaining dataset is used as test data to verify the obtained health model,
and the degradation degree is calculated according to Definition 3. Figure 3 shows the
final results.
The degradation degrees are unsteady, and the curve is not smooth but fluctuating.
This is mainly due to the recognition accuracy of SVDD and cyclical factors in the original
data, and it does not affect the overall picture of the satellite's degradation process. To
reduce the interference of these factors, a wavelet denoising algorithm is applied, and
the denoised sequence shown in Figure 3 is obtained. Overall, the average degradation
degree presents an increasing trend. Given the long period, accidental factors cannot
influence the degradation degree all the time. Therefore, based on Definition 3, we conclude
that the satellite has entered the performance degradation state.
[Plot: degradation degree sequence and wavelet-denoised sequence versus group number (1 to 800); y-axis: degradation degree, 0 to 0.35.]

Figure 3. Degradation degree
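The paper states only that a wavelet-denoised sequence of the degradation degree is obtained. The sketch below shows one common way to do this with PyWavelets; the wavelet family (db4), the decomposition level and the universal soft threshold are our assumptions, not the paper's settings.

```python
import numpy as np
import pywt

def wavelet_denoise(series, wavelet="db4", level=3):
    """Soft-threshold wavelet denoising of the degradation degree sequence."""
    coeffs = pywt.wavedec(series, wavelet, level=level)
    # Universal threshold estimated from the finest-scale detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(len(series)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(series)]

# Example: smooth a noisy, slowly increasing degradation degree curve of 800 groups.
deg = np.clip(np.linspace(0, 0.2, 800) + 0.05 * np.random.randn(800), 0, None)
smooth = wavelet_denoise(deg)
```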


Aerospace experts confirm that two major failures of the satellite did occur from late July
to late August (between the 246th group and the 370th group) for unknown reasons, and
these two failures correspond to the two nearby peaks. This supports the correctness of
our proposed definition, and in particular explains the degradation peak and the high
degradation degree level after the peak. In conclusion, the proposed method can
efficiently describe the performance degradation process of a satellite.
Moreover, as a data-driven approach to satellite performance degradation that, to our
knowledge, has not appeared before, the proposed method can also be used for failure
prediction in a variety of engineering applications, such as aircraft engines.

4. Conclusions

A method for satellite performance degradation analysis with telemetry big data is proposed
in this paper, while existing studies on this problem are limited. The experimental analysis
shows that the proposed method can extract effective state information from the parameters
and provide a quantitative description of satellite performance degradation. Moreover,
analyzing the performance degradation of satellites with telemetry big data is of significant
value for in-orbit research and the management of satellites.
In our study, the definitions may have some limitations; for example, the degradation
degree in the experiment is unstable and fluctuating. The sample entropy algorithm may also
take considerable time to trim redundant parameters in massive data, which will be improved
in our future work.

Acknowledgment

This paper is supported by the National Natural Science Foundation of China (Grant
No. U1433116).

References

[1] Z.Z. Zhong, D.C. Pi. Forecasting Satellite Attitude Volatility Using Support Vector Regression with
Particle Swarm Optimization. IAENG International Journal of Computer Science, 41(2014), 153-162.
[2] F. Zhou, D.C. Pi. Prediction Algorithm for Seasonal Satellite Parameters Based on Time Series Decom-
position. Computer Science, 43(2016), 9-12 (in Chinese).
[3] J. Lee. Measurement of machine performance degradation using a neural network model. Computers in
Industry, 30(1996), 193-209.
[4] R. Huang, L. Xi, et al. Residual life predictions for ball bearings based on self-organizing map and back
propagation neural network methods. Mechanical Systems and Signal Processing, 21(2007), 193-207.
[5] X.S. Si, W. Wang, C.H. Hu, et al. Remaining useful life estimation–A review on the statistical data driv-
en approaches. European Journal of Operational Research, 213(2011), 1-14.
[6] M. Tafazoli. A study of on-orbit spacecraft failures. Acta Astronautica, 64(2009), 195-205.
[7] W. Ma, Y. Xuan, Y. Han, et al. Degradation Performance of Long-life Satellite Thermal Coating and Its
Influence on Thermal Character. Journal of Astronautics, 2(2010), 43-45.
[8] G. Jin, D.E. Matthews, Z. Zhou. A Bayesian framework for on-line degradation assessment and residual
life prediction of secondary batteries in spacecraft. Reliability Engineering & System Safety, 113(2013),
7-20
[9] X. Hu, J. Jiang, D. Cao, et al. Battery Health Prognosis for Electric Vehicles Using Sample Entropy and
Sparse Bayesian Predictive Modeling. IEEE Transactions on Industrial Electronics, 63(2015), 2645-
2656.
[10] S.M. Pincus. Assessing serial irregularity and its implications for health. Annals of the New York Acad-
emy of Sciences, 954(2001), 245-267.
[11] D. Weinshall, A. Zweig, et al. Beyond novelty detection: Incongruent events, when general and specific
classifiers disagree. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2012), 1886-
1901.
[12] G. Yan, F. Sun, H. Li, et al. CoreRank: Redeeming “Sick Silicon” by Dynamically Quantifying Core-
Level Healthy Condition. IEEE Transactions on Computers, 65(2016), 716-729.
194 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-194

A Decision Tree Model for Meta-Investment Strategy of Stock Based on Sector Rotating
Li-Min HE a, Shao-Dong CHEN a, Zhen-Hua ZHANG b,1, Yong HU c and Hong-Yi JIANG a
a School of Finance, Guangdong University of Foreign Studies, China
b School of Economics and Trade, Guangdong University of Foreign Studies, Guangdong, China
c Institute of Big Data and Decision Making, Jinan University, Guangdong, China

Abstract. This study first proposes the Meta-Investment Strategy, derived from the
concepts of meta-search on the web and meta-cognition in psychology. We liken all
A shares in China to the enormous volume of web information, stock selection to the
process of searching for information, and equity funds to search engines. Based on
sector rotation theory and a decision tree model, and through the construction of an
indicator system and a statistical model, stock selection rules can be extracted from
fund information. After classifying the period from 2016.02 to 2016.04 as recovery,
we selected the finance industry. By importing 12 stock indicators of all the component
stocks in the finance industry as input variables and whether a stock is heavily held by
stock funds as the target variable, a decision tree model is constructed. Finally, by
entering data of the last quarter of 2015, the predictive classification results are
obtained. The results show that the Meta-Investment Strategy outperformed the CSI300
and the CSI300 Finance Sector index (000914) and obtained a significant excess return
from 2016.02.01 to 2016.04.30.

Keywords. Meta-Investment strategy, sector rotation theory, decision tree model,


data mining, stock selection model

Introduction

In each surge of stock market in China, there are always hot industries which lead the
upward trend periodically. If investors can seize these fleeting investment opportunities
of hot industries, their portfolios can acquire excess return. Sector rotation has become
one of the most important means in investment research of stock market.
Sector rotation refers to a phenomenon that in every phase of business cycle and
stock market cycle, different industries take turns to outperform the market. The
research on sector rotation theory abroad is more mature than domestic ones. It
originated from the famous “The Investment Clock” [1], which classified the business
cycle into four phases and concluded the performance of different industries. Sassetti
and Tani [2] outperformed market returns by using 3 market-timing techniques on 41

1
Corresponding Author: Zhen-Hua Zhang, School of Economics and Trade, Guangdong University of
Foreign Studies, Guangzhou 510006, China; E-mail: zhangzhenhua@gdufs.edu.cn.

funds of the Fidelity Select Sector family over the period 1998 to 2003. The domestic
research of sector rotation focuses on the phenomenon itself and its underlying causes,
including business cycle, monetary cycle, industrial cycle and behavioral finance.
However, only a few researches probe into sector rotation as an investment
strategy. Peng and Zhang [3] empirically analyzed the sector rotation effect in Chinese
stock market and proved the feasibility of sector investment strategy. By adopting
association rules algorithm, many strong association rules of stock market were mined
from a massive amount of data [4]. In this research, manufacturing and petrochemical
industry stock indexes (the core of association rules) are closely related to other sector
indexes (except for finance, real estate, food & beverage and media).
In addition, because the Chinese capital market is immature, irrational investment
contributes to stock market instability, which leads to a divergence between market
performance and economic fundamentals. In this situation, stock funds, as representatives of
professional investors, can often forecast the direction of the financial market. That is why
we put forward the concept of "meta-investment".
We first present the concept of Meta-Investment based on Meta-Search and Meta-
Cognition. And then, by fusing Meta-Investment and Sector Rotation Strategy, we
apply this concept to stock investment according to the investment results of some
funds and institutions. In order to get comprehensible rules, we adopt decision tree
model to construct the final investment strategy. Simulation results show the
advantages of the present method.

1. Sector Rotation Strategy

1.1. Interpretation Based on Business Cycle

Yang [5] proposed that the essence of sector rotation is an economic phenomenon.
Namely, factors influencing business cycle also induce sector rotation in capital market.
These factors include investment, monetary shock, external shock and consumption of
durable goods. In his study, by introducing phases of business cycle as dummy variable
to the classical CAPM model, the sector rotation strategy gains 0.2% excess Jensen
Alpha returns. Dai & Lin [6] put forward the innate logic of sector rotation and
business cycle. Business cycle is determined by external shock while industrial
structure decides internal forms of business cycle. The process is shown in Figure 1.

[Figure: industrial structure and external shocks, acting through the economic conduction route, shape the business cycle and the financial situation of different industries; these determine relative valuation levels and, finally, sector rotation.]

Figure 1. The Conduction Route



1.2. Interpretation Based on Monetary Shock

Monetary policy is an important contributory factor of stock market. In the long run,
the performance of stock market is based on real economy. However, the change in
liquidity resulting from the conversion of monetary policy can influence the stock
market in the short run. The interpretation of sector rotation based on monetary policy
is that different industries have different sensitivity to liquidity. Conover, Jenson,
Johnson and Mercer [7] used federal FED discount rate as an indicator of monetary
policy to build a sector rotation strategy based on monetary environment. After
classifying the monetary phases, sensitivity to liquidity of different industries was
tested. Subsequently, cyclical industries sensitive to liquidity were invested during
monetary easing while noncyclical industries were invested during monetary tightening.
This strategy gained excess return.

1.3. Interpretation Based on Lead-Lag Relationship

Lead-Lag relationship refers to the horizontal or vertical profit transmission


relationship among different industries. Therefore, investment logic is formed to gain
excess return—selling industries that outperformed the market early and buying
industries that outperformed the market lately. Chen used the DAG method to conduct
an empirical analysis of the relationship of price index among different industries. He
proposed three explanations for sector rotation, which included associations among
different industries formed by business cycle, upstream & downstream relationship and
investment characteristics [8].

1.4. Conclusion

If we use a top-down method to interpret sector rotation phenomena from the


perspective of real economy, sector rotation originates from the changes of business
cycle and monetary shock. In addition, industrial structure determines the expression
form of sector rotation. Namely, the difference of income elasticity of demand, cost
structure [9], sensitivity to liquidity and profit transmission relationship among
different industries decide the form of sector rotation. Moreover, sector rotation can be
interpreted from a perspective of behavioral finance, which views sector rotation as a
market speculation. The proportion of retail investors in the Chinese capital market is
relatively high, so there are many noise traders (according to Shiller, they pursue
fashion and fads and tend to overreact to changes in stock prices). Meanwhile,
informed traders among institutional investors band together to lure retail traders and
gain excess profit by manipulating the stock market [10].

2. Meta-Investment Strategy

Meta-Investment Strategy is an extension from the concept of Meta-Search and Meta-


Cognition. A Meta-search Engine is a search tool that uses other search engines' data to
produce their own results from the Internet [11]. Meta-search engines take input from a
user and simultaneously send out queries to third party search engines for results.

Therefore, sufficient data is gathered, formatted by their ranks and presented to the
users.
It is well known that Meta-Cognition is "cognition about cognition", "thinking
about thinking", or "knowing about knowing". The term Meta-Cognition literally
means cognition about cognition, or more informally, thinking about thinking defined
by American developmental psychologist Flavell [12]. Flavell defined Meta-Cognition
as knowledge about cognition and control of cognition. It comes from the root word
"meta", meaning beyond. It can take many forms; it includes knowledge about when
and how to use particular strategies for learning or for problem solving. There are
generally two components of Meta-Cognition: knowledge about cognition, and
regulation of cognition.
Meta-Memory, defined as knowing about memory and mnemonic strategies, is an
especially important form of Meta-Cognition. Differences in Meta-Cognitive
processing across cultures have not been widely studied, but could provide better
outcomes in cross-cultural learning between teachers and students.
Some evolutionary psychologists hypothesize that Meta-Cognition is used as a
survival tool, which would make Meta-Cognition the same across cultures. Writings on
Meta-Cognition can be traced back at least as far as On the Soul and the Parva
Naturalia of the Greek philosopher Aristotle.
As representatives of professional investors, stock funds can explore intrinsic
values of investment objectives before the market. Therefore, through the application
of the stock funds investment result, investing by “standing on the shoulders of giants”
can be a brand-new idea. Meta-Investment Strategy in this study is based on funds. It
compares enormous web information to all A shares, stock selection to search process
and equity funds to search engines. Through the construction of indicator system and
building of statistical modeling, the stock selection rules of stock funds can be
extracted for portfolio construction.

3. Application of Meta-Investment Strategy: Based on Sector Rotation theory and


Decision Tree C5.0 Algorithm

3.1. Model Selection in Data Mining Methods

An increasing number of data mining and machine learning methods have been applied to the
financial field. There have been many stock selection models, such as Neural Networks,
Random Forests, Support Vector Machines (SVM), Genetic Algorithms (GA), Rough Set Theory
and Concept Lattices.
The aim of this research is to probe into Meta-Investment Strategy (based on sector
rotation theory) by searching for proper data mining and machine learning algorithm.
To realize the goal, firstly the comprehensibility of investment strategies has to be
considered. Therefore, algorithm which can be used to extract understandable rules is
the main approach in this study.
However, Neural Networks and Random Forests are more suitable for large data samples,
and Neural Networks cannot be used for rule extraction. SVM is applicable to relatively
small samples, but it is also difficult to extract rules from. In summary, Neural Networks,
SVM and Genetic Algorithms are suited to prediction rather than rule extraction. Thus,
Decision Tree, Rough Set and Concept Lattice methods are more suitable for the purpose of
this research.

Some researchers had put forward the decision tree [13, 14, 15] and random forest
[16] used in the field of investment decisions. For example, Hu and Luo [14] applied
the decision tree model to sector selection, Sorensen and Miller et al. [15] utilized the
decision tree approach for stock selection. Liu et al. [16] proposed a random forest
model applied to bulk-holding stock forecast.
However, no available research directly applied stock funds’ investment result to
investment practice currently. In addition, although there were some related researches
on bulk-holding stock [16], they were not directly combined with investment practice.
Most importantly, because the Meta-Investment Strategy is first proposed in this
study, there are no specialized algorithms for it at present. After comparison, the C5.0
decision tree, from which understandable rules are easy to extract, is preferred.
Secondly, conditional attributes are continuous in data set. In terms of applying
Rough Set and Concept Lattices for extracting rules, discretization process is necessary,
which requires proper discretization model. C5.0 Decision Tree method, without the
discretization process, is comparatively easier to implement than the former.
Moreover, traditional extraction methods of comprehensible rules, which are used
to extract information from massive original data directly, are difficult for this research
because of several problems: (1) Massive data, large number of indicators and scattered
information make it difficult to extract rules; (2) Implicit rules of investment vary from
different periods because of various financial conditions and policies. Therefore, the
prediction accuracy is limited and rules are likely to contradict with each other; (3)
Operational speed is relatively slow when coping with massive data.
We aim to use the Meta-Investment Strategy and a rule extraction algorithm to solve the
aforesaid problems; relevant research in this field is limited. Because the strategy is built
on existing investment decisions, accuracy is improved, the data size and the amount of
conflicting information are relatively small, and the extracted rules are more reasonable.
Therefore, C5.0 is chosen for rule extraction.

3.2. Decision Tree Modeling and Preliminaries

In this study, the Meta-Investment Strategy is used for portfolio construction through
statistical modeling and rule extraction; the decision tree model is therefore used for rule
extraction.
For one thing, the decision tree model is a supervised learning method: each example is a
pair consisting of an input object and a target variable, and by analyzing the training data
a set of inference rules is produced for mapping new examples. For another, the goal of this
study, namely extracting stock funds' stock selection rules, matches the output of a decision
tree, which is an inference rule set.
This study uses the C5.0 decision tree model and SPSS Modeler for rule extraction.
The splitting criterion is the normalized information gain (difference in entropy); the
attribute with the highest normalized information gain is chosen at each decision node.
C5.0 also introduces a boosting algorithm to enhance accuracy [13, 14, 15].
Based on sector rotation theory and C5.0 Decision Tree Model, the modeling
process is shown by Figure 2.

[Flow chart: classification of business cycle; selection of industry; construction of training samples; identifying the target variable (whether it is heavily held by stock funds); identifying input variables (indicator system construction); C5.0 decision tree model; classification and portfolio construction; data back-testing.]

Figure 2. Flow Chart

3.3. Classification of Business Cycle

The method of classifying business cycle includes two-stage method, four-stage


method, and six-stage method and so on. Merrill Investment Clock divided the business
cycle into four phases by using OECD “output gap” estimates and CPI inflation data.
Zhang & Wang [17] proposed that because traditional “output gap” and CPI inflation
data were quarterly released, it’s difficult to identify the economic inflection point.
Therefore, monthly figures including Macroeconomic Prosperity Index and Consumer
Price Index (The same month last year=100), shown in Figure 3, can be used as the
main indicators for classification of business cycle.

Figure 3. Trend of Macroeconomic Prosperity Index and CPI (2009.03 to 2015.12)


Source: CSMAR
For these reasons, this study adopts the four-stage method with the Macroeconomic
Prosperity Index and the Consumer Price Index (the same month last year = 100) (see Table 1),
by which the business cycle is divided into four stages: recovery, overheat,
stagflation and recession. The classification results of the business cycle are shown in Table 2.
Table 1. Classification of the Four Stages in a Business Cycle

Indicator / Phase                                        Recession   Recovery   Overheat   Stagflation
Macroeconomic Prosperity Index                           ↓           ↑          ↑          ↓
Consumer Price Index (the same month last year = 100)    ↓           ↓          ↑          ↑

Table 2. Classification Results (Four-Stage Method)

Time Phase CSI300 Time Phase CSI300


2009.03-2009.07 Recovery 67.93% 2012.08-2012.10 Recovery -4.40%
2009.08-2010.02 Overheat -13.34% 2012.11-2013.07 Overheat 1.52%
2010.03-2010.08 Stagflation -12.67% 2013.08-2013.10 Stagflation 0.64%
2010.09-2010.12 Overheat 8.47% 2013.10-2015.01 Recession 44.00%
2011.01-2011.07 Stagflation -6.82% 2015.02-2016.01 Stagflation -12.16%
2011.08-2012.07 Recession -21.65% 2016.02-2016.04 Recovery 8.81%
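The two-indicator classification of Table 1 can be expressed directly in code. This is a small sketch; the function name is our own, and the mapping follows Table 1 (rising prosperity with falling CPI is recovery, and so on).

```python
def business_cycle_phase(prosperity_rising, cpi_rising):
    """Four-stage investment-clock classification from Table 1.

    prosperity_rising : True if the Macroeconomic Prosperity Index is trending up
    cpi_rising        : True if CPI (same month last year = 100) is trending up
    """
    if prosperity_rising and not cpi_rising:
        return "Recovery"
    if prosperity_rising and cpi_rising:
        return "Overheat"
    if not prosperity_rising and cpi_rising:
        return "Stagflation"
    return "Recession"

# Example: rising growth while inflation falls is classified as Recovery.
print(business_cycle_phase(True, False))
```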

3.4. Industry Selection

According to the investment clock research of mainstream securities firms, finance is one of
the most recommended industries to invest in during the recovery stage; it is strongly favored
by three of the four major firms (Guotai Junan Securities, Shenyin & Wanguo Securities, and
Orient Securities) (Table 3).
Table 3. Industry Selection of Different Securities

Guotai Junan Securities. Recovery: Energy, Finance, Consumer Discretionary. Overheat: Energy, Materials, Finance. Stagflation: Telecom, Consumer Goods, Health Care. Recession: Health Care, Utilities, Consumer Goods.
Shenyin & Wanguo Securities. Recovery: Nonferrous Metals, Real Estate, Finance, Information Technology. Overheat: Nonferrous Metals, Mining, Real Estate, Ferrous Metals. Stagflation: Agriculture & Fishing, Health Care, Network Equipment, Electrical Components. Recession: Utilities, Health Care, Finance, Transportation.
Orient Securities. Recovery: Food & Beverage, Nonferrous Metals, Real Estate, Restaurant & Food Services, Tourism, Finance. Overheat: Mining, Nonferrous Metals, Transportation, Ferrous Metals. Stagflation: Health Care, Food & Beverage, Machinery, Utilities, Construction Materials. Recession: Finance, Ferrous Metals, Chemicals, Real Estate, Food & Beverage.
Guoxin Securities. Recovery: Real Estate, Transportation, Mining, Restaurant & Food Services, Nonferrous Metals. Overheat: Agriculture, Home Appliances, Mining, Nonferrous Metals, Machinery, Trading and Retailing. Stagflation: Utilities, Transportation, Health Care, Food & Beverage. Recession: Real Estate, Transportation, Home Appliances, Electrical Components, Nonferrous Metals.

In addition, the finance industry is a cyclical business that is highly related to
economic fluctuations. Therefore, the finance industry is chosen as the sample for
investigating the meta-investment strategy.

3.5. Sample Determination

Since 2009, mainstream securities have studied the investment clock in China. Typical
investigations include Guotai Junan Securities (2009), Shenyin Wanguo Securities
(2009), Orient Securities (2009) and Guoxin Securities (2010). The methodologies
including classification of domestic business cycle and statistical processing to
different industries are similar despite the different industry classification benchmark
and time range.
The Chinese capital market is immature, and the Chinese financial market changes greatly
under different policies in different periods. Data for the period from 2016.02.01 to
2016.04.30 are relatively comprehensive and timely, so the extracted rules are more likely to
conform to the implicit rules of the Chinese capital market. In addition, there are 51 stocks
in the finance industry at present; if we chose data from before 2015, the data size would be
greatly reduced. For example, Guoxin Securities (002736) went public in December 2014, while
Orient Securities (600958), Guotai Junan Securities (601211), Dongxing Securities (601198)
and Shenwan Hongyuan Group (000166) went public in 2015. In conclusion, the chosen
timeframe is determined by three considerations: sector rotation theory, data size and timeliness.
It’s important to note that this study judges this period (2016.02.01-2016.04.30) to
be recovery. Subsequently, the training samples are confined to the first three quarters
in 2015. Financial and technical indicators are imported as input variables. Because of
the time lags of financial indicators, whether the stock in financial industry is heavily
held in the next quarter is set as target variable. Classification rules are produced
through C5.0 Decision Tree. Then the data of the last quarter in 2015 is imported for
classification and prediction and a portfolio is constructed with each chosen stock
weighted equally. Finally, performance of this portfolio is back tested from 2016.02.01
to 2016.04.30.
Table 3 summarizes the findings of the four studies mentioned above.
Since the research period in this study is recovery from 2016.02.01 to 2016.04.30,
finance industry is chosen according to the conclusion above.

3.6. Input Variables and Target Variable

All the input variables and the target variable are shown in Table 4.
We manually chose input variables from four dimensions (profitability, operating
capacity, technical factors and indicators per share) according to financial statement theory
and previous research [18-21].
In this study, the sample size for model construction is 149, divided into a training set
(70%) and a test set (30%). According to the industry classification of the China Securities
Regulatory Commission (CSRC), the number of stocks in China's financial industry is about 50.
Because the data used for model construction are confined to the first three quarters of
2015, after excluding invalid samples, 149 samples in total are available.
This study does not adopt a traditional stock selection model; instead, we combine the
sector rotation strategy and the Meta-Investment Strategy. Therefore, after classifying the
business cycle and choosing the finance industry and the research period (the first three
quarters of 2015 for model construction), the size of the data that needs to be processed and
the amount of noise data are greatly reduced.
Table 4. Description of Variables

Profitability Indicators (input variables, metric): ROA (TTM) [ROATTM]; EBIT Margin (TTM) [EBITMarginTTM]; Cash to Total Profit Ratio (TTM) [CPRTTM]
Indicators Per Share (input variables, metric): EPS (TTM) [EPSTTM]; EBIT-EPS (TTM) [EBITEPSTTM]; Net Cash Flow Per Share (TTM) [NCFPSTTM]; Net Cash Flow from Operating Activities (TTM) [NCFOATTM]; Net Cash Flow from Investing Activities (TTM) [NCFIATTM]
Operational Ability Indicators (input variables, metric): Price-Earnings Ratio (PE TTM) [PERTTM]; Price-Sales Ratio (PS TTM) [PSRTTM]; Price to Cash Flow (PCF TTM) [PCFTTM]
Technical Indicators (input variable, metric): Prior Three-Month Momentum [Momentum]
Whether it is Heavily Held by Equity Funds, i.e. > 0.02% of Net Value of Stock (target variable, nominal): [HH], "Yes" = 1, "No" = 0
In the training set, there are 12 input variables and 1 target variable; "Whether It Is
Heavily Held by Stock Funds" is the target variable, so 13 indicators in total are imported
during model construction. When the model is applied, the target variable is forecast from
the 12 input variables.
An additional note on the target variable: it indicates whether the market value of a
stock held by public stock funds is greater than 2% of the net asset value of all public
stock funds.
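A minimal sketch of the model-construction step follows. The paper uses C5.0 in SPSS Modeler; here an entropy-based CART tree from scikit-learn is used as a stand-in, and the CSV file name and the hyper-parameters are assumptions for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

INPUTS = ["ROATTM", "EBITMarginTTM", "CPRTTM", "EPSTTM", "EBITEPSTTM", "NCFPSTTM",
          "NCFOATTM", "NCFIATTM", "PERTTM", "PSRTTM", "PCFTTM", "Momentum"]

# Hypothetical file holding the 149 samples of Table 4 (12 inputs plus the HH target).
data = pd.read_csv("finance_industry_2015Q1-Q3.csv")
X_train, X_test, y_train, y_test = train_test_split(
    data[INPUTS], data["HH"], train_size=0.7, random_state=0)

# C5.0 itself is not available in scikit-learn; an entropy-based tree is used as a stand-in.
tree = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=5, random_state=0)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))
print(export_text(tree, feature_names=INPUTS))   # human-readable rule listing
```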

3.7. The Period Division of Training Sample

The research period of this study is confined to the first three quarters of 2015. Because
most of the input variables come from lagging financial statements, this study sets the rules
shown in Table 5.
Table 5. Usage of different Types of Report

Type of Report Correspondence Types of Sample


Seasonal Report of the 1st quarter Holdings of equity funds on 2015.06.30 Training Sample
Semi-annual Report Holdings of equity funds on 2015.09.30 Training Sample
Seasonal Report of the 3rd quarter Holdings of equity funds on 2015.12.31 Training Sample

3.8. Rule Extraction Based on C5.0 Decision Tree

By importing 12 stock indicators of all the component stocks in the finance industry in the
first three quarters of 2015 as input variables and whether the stock is heavily held by stock
funds in the next quarter as the target variable, a set of inference rules is generated as
follows. Detailed rules are shown in the appendix.
Rule 1 - estimated accuracy 89.22% [boost 96.1%]
NCFIATTM <= -4.903720 [Mode: 1] => 1.0
NCFIATTM > -4.903720 [Mode: 0]
    Momentum <= -0.0665 [Mode: 0] => 0.0
    Momentum > -0.0665 [Mode: 1]
        Momentum <= 0.3515 [Mode: 1]
            PCFTTM <= 3.331410 [Mode: 0]
                PCFTTM <= -178.095000 [Mode: 1] => 1.0
                PCFTTM > -178.095000 [Mode: 0] => 0.0
            PCFTTM > 3.331410 [Mode: 1] => 1.0
        Momentum > 0.3515 [Mode: 0] => 0.0…
We now extract and explain some of the rules.
For example, the first rule is: NCFIATTM <= -4.903720 [Mode: 1] => 1.0.
It means that a stock whose Net Cash Flow from Investing Activities (trailing twelve months)
is less than -4.903720 is chosen (1.0). In commercial bank management, banks have a fixed
demand for asset allocation, and in China the main investment activity of commercial banks is
purchasing treasury bonds. Because of the expansion of a bank's assets, the smaller the Net
Cash Flow of Investment Activities (NCFIA) is, the faster the expansion. For example, suppose
a bank has assets of RMB 100 yuan in 2015, of which 30% is allocated to one-year treasury
bonds, and assets of RMB 120 yuan in 2016, again with 30% in one-year treasury bonds, with an
annual rate of return of 3%. In its financial statements, the Net Cash Flow of Investment
Activities is then about -5 (an outflow of 120 × 0.3 for newly purchased bonds against an
inflow of 100 × 0.3 × 1.03 from maturing bonds). A minus sign means capital outflow, while a
positive sign means capital inflow.
The second rule, Momentum > -0.0665 [Mode: 1], means that a stock whose prior three-month
momentum is greater than -0.0665 is chosen (1.0). In short-term investment there is a
"momentum effect": the rate of return of a stock tends to follow its original trend.
The explanations of these two most important rules indicate that the extracted rules are
reasonable; the remaining rules, shown in the appendix, can be explained similarly.
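The rules printed above translate directly into a small scoring function (a sketch; the function name and the example values are our own):

```python
def predict_heavily_held(ncfia_ttm, momentum, pcf_ttm):
    """Direct translation of Rule 1 from the boosted C5.0 tree listed above."""
    if ncfia_ttm <= -4.903720:
        return 1
    if momentum <= -0.0665:
        return 0
    if momentum > 0.3515:
        return 0
    if pcf_ttm > 3.331410:
        return 1
    return 1 if pcf_ttm <= -178.095 else 0

# Example: a bank with fast-expanding investment outflows is selected.
print(predict_heavily_held(ncfia_ttm=-6.0, momentum=0.05, pcf_ttm=10.0))
```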

4. Results Analysis

4.1. Comparison Object: CSI300

CSI 300 is a capitalization-weighted stock market index, designed to replicate the


performance of 300 stocks traded in the Shanghai and Shenzhen stock exchanges.
Therefore, it can be used as performance benchmark.

4.2. Results Comparison

By importing the financial indicators for the last quarter of 2015 (up to 2015.12.31) and the
prior three-month momentum before 2016.02.01 of all the component stocks in the finance
industry, classification results are produced. From the stocks with a classification value of
"1", those with a confidence level higher than 90% are chosen. Subsequently, a portfolio is
constructed with each chosen stock weighted equally. Finally, the performance of this
portfolio is back-tested from 2016.02.01 to 2016.04.30. The classification results are shown
in Table 6, and the performance of the portfolio is reported below.
Table 6. Classification Results

Code     Stock Name                                      $C-HH (forecast: heavily held by equity funds)   Level of confidence
000001   Ping An Bank                                    1                                                1
000776   Guangfa Securities                              1                                                1
002142   Ningbo Bank                                     1                                                1
600000   Shanghai Pudong Development Bank (SPDB)         1                                                1
600016   Minsheng Bank (CMBC)                            1                                                1
600036   China Merchants Bank (CMB)                      1                                                1
601009   Bank of Nanjing                                 1                                                1
601166   Industrial Bank (CIB)                           1                                                1
601198   Dongxing Securities                             1                                                1
601318   Ping An Insurance (Group) Company of China      1                                                1
601328   Bank of Communications                          1                                                1
601818   China Everbright Bank Company                   1                                                1
The results below (Figure 4 and Table 7) show that the Meta-Investment Strategy
outperformed the CSI300 and the CSI300 Finance Sector index (000914) and yielded a
significant excess return, with a winning rate of 68.97%, from 2016.02.01 to 2016.04.30.

Figure 4. Performance

Table 7. Back-testing data from 2016.02.01 to 2016.04.30

Period 2016.02.01-2016.04.29
Cumulative Return of CSI300 (399300) 9.12%
Cumulative Return of CSI300 Finance Sector (000914) 9.45%
Cumulative Return of Portfolio Based on Meta-Investment Strategy 12.53%
Winning Rate (Ratio of Days outperforming CSI300 to Total Days) 68.97%
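A sketch of the back-test computation behind Table 7 is given below. The price inputs are hypothetical DataFrames, and whether "outperforming" is judged on daily or cumulative returns is our assumption (cumulative is used here).

```python
import numpy as np
import pandas as pd

def backtest_equal_weight(stock_prices, benchmark):
    """Cumulative return of an equally weighted portfolio and its winning rate vs. a benchmark.

    stock_prices : DataFrame of daily closes for the selected stocks (2016.02.01-2016.04.29)
    benchmark    : Series of daily closes for the CSI300
    """
    port_ret = stock_prices.pct_change().mean(axis=1).fillna(0)   # equal weights across stocks
    bench_ret = benchmark.pct_change().fillna(0)
    port_cum = (1 + port_ret).cumprod() - 1
    bench_cum = (1 + bench_ret).cumprod() - 1
    winning_rate = np.mean(port_cum > bench_cum)                  # share of days ahead of the CSI300
    return port_cum.iloc[-1], bench_cum.iloc[-1], winning_rate
```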

4.3. Description and Explanation of Results

In this study, the 149 samples are divided into a training set (70%) and a test set (30%).
The number of boosting trials is five; boosting is used to amplify the effective sample size
and enhance accuracy.
After selecting the training and test sets and running five iterations, the overall
accuracy reaches 96.1%. It should be noted that the samples differ somewhat between
iterations: the first model is built on equal-probability sampling of the training set, the
second model is mainly based on the samples incorrectly classified by the first model, the
third model focuses on the samples incorrectly classified by the second, and so forth.
Therefore, the estimated accuracy differs among rules.
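The five boosting trials described above can be approximated outside SPSS Modeler, for example with AdaBoost over entropy-based trees in scikit-learn. This is a stand-in sketch rather than the C5.0 boosting procedure itself; the tree depth is an assumption, and the commented lines reuse X_train/y_train from the earlier model-construction sketch.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Five boosting trials, each reweighting the samples the previous tree misclassified.
booster = AdaBoostClassifier(
    DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0),
    n_estimators=5,
    random_state=0)
# booster.fit(X_train, y_train)
# print(booster.score(X_test, y_test))
```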
It is also necessary to explain that the purpose of setting "Whether it is Heavily Held by
Equity Funds" as the target variable is not to forecast the bulk-holding stocks of stock
funds. Instead, our purpose is to apply the investment results of stock funds, extract
principles and rules, and invest by "standing on the shoulders of giants". Therefore, this
study compares the cumulative return of the portfolio with the cumulative returns of the
CSI300 Finance Sector index and the CSI300 to test the effects of the extracted rules and the
stock selection model.

5. Conclusions

Our research on Meta-Investment Strategy is combined with sector rotation theory. In


order to extract comprehensive rules and select proper investment strategies, this study
is based on investment results of stock funds.
There is mounting evidence in the literature that the sector rotation phenomenon is tied to
the economic cycle. Armed with this evidence, we investigate the nature of the sector
rotation strategy from three aspects (business cycle, monetary shock and lead-lag
relationship) and conclude that sector rotation originates from changes in the business cycle
and monetary shocks, while industrial structure determines the form in which sector rotation
is expressed.
Furthermore, this study firstly proposes Meta-Investment Strategy, which is an
extension from the concept of Meta-Cognition and Meta-Search Engine. Meta-
Investment Strategy is based on stock funds. To facilitate understanding, we compare
enormous web information to all A shares, process of searching information to stock
selection and search engines to equity funds. Through the construction of indicator
system and building of statistical modeling, the stock selection rules of funds can be
extracted for portfolio construction.
Finally, we combine sector rotation theory and the decision tree model. After classifying
the period from 2016.02 to 2016.04 as recovery, we selected the finance industry. By
importing 12 stock indicators of all the component stocks in the finance industry as input
variables and whether a stock is heavily held by stock funds as the target variable, the
decision tree model is constructed. Subsequently, by entering data of the last quarter of
2015, the predictive classification results are obtained. The results show that the
Meta-Investment Strategy outperformed the CSI300 and the CSI300 Finance Sector index
(000914) and obtained a significant excess return from 2016.02.01 to 2016.04.30.
However, due to limitations of time, energy and data resources, the back-testing does not
include the other three phases of the economy, namely overheat, stagflation and recession.
Follow-up studies will consider loosening the restrictions on the research period and on
industries. Moreover, the decision tree model in this study is static; a dynamic decision
tree model will be constructed in follow-up studies, by which the training samples can be
increased and the validity of the inference rules enhanced.
This study did not choose a traditional stock selection model, which usually selects stocks
from massive data and requires complex data processing operations because of noise data.
Instead, the stock selection model in this study can be seen as a secondary filter (its stock
screening process is based on stock funds' investment results). This has several advantages:
for example, it is easy to operate with a relatively small amount of computation, and stock
selection rules can be extracted directly.
This study applied public stock funds' investment results directly to investment practice for
the first time. By choosing "Whether It Is Heavily Held by Stock Funds" as the target
variable and building a stock selection model, a portfolio was constructed whose rate of
return outperformed the average market rate of return. In this way, our results prove that
stock funds' investment results can be used for portfolio construction and portfolio
optimization.

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (No.
71271061), the National Students Innovation Training Program of China (No.
201511846058), Student Science and Technology Innovation Cultivating Projects &
Climbing Plan Special Key Funds in Guangdong Province (No. pajh2016b0174),
Philosophy and Social Science Project (No. GD12XGL14) & the Natural Science
Foundations (No. 2014A030313575, 2016A030313688) & the Soft Science Project
(No. 2015A070704051) of Guangdong Province, Science and Technology Innovation
Project of Education Department of Guangdong Province (No. 2013KJCX0072),
Philosophy and Social Science Project of Guangzhou (No. 14G41), Special Innovative
Project (No. 15T21) & Major Education Foundation (No. GYJYZDA14002) & Higher
Education Research Project (No. 2016GDJYYJZD004) & Key Team (No. TD1605) of
Guangdong University of Foreign Studies.

References

[1] M. Lynch, M. Hartnett, The investment clock (Report), 2004


[2] P. Sassetti, M. Tani, Dynamic asset allocation using systematic sector rotation, Journal of Wealth
Management 8 (2006), 59-70.
[3] Y. Peng, W. Zhang, The research on strategy and application of sector rotation in Chinese stock market,
The Journal of Quantitative & Technical Economics 20 (2003), 148-151.

[4] Y. Ye, The cointegration analysis to stock market plate indexes based on association rules, Statistical
Education 9 (2008), 56-58.
[5] W. Yang, Research of sector rotation across the business cycle in the Chinese A share market, Wuhan:
Huazhong University of Science & Technology, 2011
[6] X. Lin, J. Dai, Quantitative and structural analysis of Guoxin investment clock (Report), Shenzhen China
(2012).
[7] C. M. Conover, G. R. Jensen, R. R. Johnson, et al., Is fed policy still relevant for investors? Financial
Analysts Journal 61 (2005), 70-79.
[8] H. Chen, Industry allocation in active portfolio management, Wuhan: Huazhong University of Science &
Technology (2011).
[9] M. Su, Y. Lu, Investigation on sector rotation phenomenon in Chinese A share market—from a
perspective of business cycle and monetary cycle, Study and Practice 27 (2011), 36-40.
[10] C. He, Analysis of sector rotation phenomenon in Chinese A share market, Economic research 47
(2001), 82-87.
[11] E. W. Glover, S. Lawrence, W. P. Birmingham, et al., Architecture of a metasearch engine that
supports user information needs, Conference on Information and Knowledge Management, 1999.
[12] J. H. Flavell, Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry,
American Psychologist 34 (1979), 906 – 911.
[13] W. Xue, H. Chen., SPSS modeler--the technology and methods of data mining, Beijing: Publishing
House of Electronics Industry (2014).
[14] H. Hu, J. Luo, Profitability and momentum are the key factors of selection of industries—the
exploration of the decision tree applied in sector selection (Report), Shenzhen China (2011).
[15] E. H. Sorensen, K. L. Miller, C. K. Ooi, The decision tree approach to stock selection, Journal of
Portfolio Management 27 (2000), 42-52.
[16] W. Liu, L. Luo, H. Wang, A forecast of bulk-holding stock based on random forest, Journal of Fuzhou
University (Natural Science Edition), 36 (2008), 134-139.
[17] L. Zhang, C. Wang, The investigation of Chinese business cycle and sector allocation on the
macroeconomic perspective (Report), Shenzhen China (2009).
[18] L. Zhang, Stock Selection Base on Multiple-Factor Quantitative Models, Shijiazhuang: Hebei
University of Economics and Business (2014).
[19] J. Zhao, Sector Rotation Multi-factor Stock Selection Model and Empirical Research on its
Performance, Dalian: Dongbei University of Finance and Economics (2015).
[20] P. Wang, J. Yu, Analysis of Financial Statements, Beijing: Tsinghua University Press (2004).
[21] H. Peng, X.Y. Liu, Sector Rotation Phenomenon Based on Association Rules, Journal of Beijing
University of Posts and Telecommunications, 18 (2016), 66-71.
208 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-208

Virtualized Security Defense System for Blurred Boundaries of Next Generation Computing Era
Hyun-A PARK 1
Department of Medical Health Sciences, KyungDong University,
815 Gyeonhwon-ro, MunMak-eub, WonJu-City, Kangwon-do, Korea

Abstract. This paper deals with the security problems facing next generation computing environments. As the method, a Virtualized Security Defense System (VSDS) is proposed on top of the web application 'Trello' for online patient networks. It deals with the following problems: (1) blurred security boundaries between attackers and protectors, (2) group key management, (3) secret collaborative work and sensitive information sharing for group members, (4) privacy preservation, and (5) rendering of a 3D image (member indicator, for a high level of security). Consequently, although the current IT paradigm is becoming more 'complicated', 'overlapped' and 'virtualized', VSDS makes it possible to share information securely through collaborative work.
Keywords. Blurred Security Boundaries, Virtualized Security Defense System, PatientsLikeMe, Trello, Group key, Reversed hash key chain, VR/AR, Member indicator, Pseudonym

Introduction
0.1. Computing Environments for Next Generation and Problem Identification

In today's quickly shifting computing societies, various kinds of information technologies have produced new types of IT-enabled product and service innovations in our daily lives. The important features of these innovative IT technologies are highly advanced wireless techniques such as the mobile internet, SNS, cloud, and big data technologies in networked collaborative computing environments.
Currently, the IT paradigm, which has shifted from wired to wireless and on to integrated information environments, has blurred the information boundaries between attackers and protectors. One of the most important problems here is that, although information sharing has greatly increased through collaborative work, virtualized IT resources and overlapping trust boundaries have created a security dilemma about 'what information boundaries' and 'what characteristic information' to protect [1].
1 Corresponding Author: Hyun-A Park, Department of Medical Health Sciences, KyungDong University,

815 Gyeonhwon-ro, MunMak-eub, WonJu, Kangwon-do, Korea; E-mail:kokokzi@naver.com



In security information systems and mobile application research, security used to have a clear objective in traditional IT environments: the parties were divided into two groups, attackers and protectors, and security specialists took responsibility for preventing attacks and threats from outsiders using their knowledge of the security architecture. At present, however, the changing IT paradigm has blurred the information boundaries between attackers and protectors.
The characteristics of the next generation computing era (IT paradigm) can be summarized as follows: (1) the increase of collaborative work through network connections, (2) the increase of information sharing in an information-oriented society, (3) blurred security boundaries to protect, caused by virtualized IT resources and migration policies, and (4) the increase of 3D data, such as in VR/AR.
Therefore, in this paper, to solve the problem of 'blurred security boundaries', the Virtualized Security Defense System (VSDS) is proposed on top of the web application 'Trello' to construct an online patient network very similar to 'PatientsLikeMe'.

0.2. Main Methods and Contributions


The solution to the problem is a security defense system for next generation computing environments. The application is the web application 'Trello'. With Trello, an online patient network is constructed very similar to 'PatientsLikeMe', because 'PatientsLikeMe' is very difficult to use for patients (users) in non-English speaking regions and for patients suffering from diseases it does not cover. Hence, by using 'Trello', VSDS is extended to persons (e.g. researchers) interested in the same diseases and to patients who use any language, including English, and who struggle against any disease. The main methods are as follows.
1. The proposed system VSDS (Virtualized Security Defense System) is a new concept of security solution. Its goal is to solve the problem of 'blurred security boundaries', and VSDS does so by constructing a 'virtualized' security solution for next generation computing environments. As the methods, it largely uses cryptographic techniques and a member indicator based on 3D video image technology for virtualized IT resources. In particular, the member indicator is a new security solution reflecting the characteristics of next generation computing.
2. VSDS should be a secure and efficient group key management system, because information sharing has been and will be highly increased even across blurred security boundaries. As the method, each member's group key is generated based on a reversed one-way hash chain. According to the one-wayness property of the hash function [2], VSDS can guarantee Forward Secrecy, Backward Accessibility and Group Key Secrecy [3], which are the security requirements of the group information-sharing system VSDS.
2-1. The security requirements of Forward Secrecy and Backward Accessibility should be satisfied. In VSDS, a leaving member cannot know the next group key (Forward Secrecy), but a joining member can know all the previous keys and information (Backward Accessibility), by the properties of the reversed hash key chain. Therefore, VSDS is suitable for secret collaborative work and sensitive information sharing for group members.
2-2. VSDS does not need to perform re-keying on membership changes. The principle of each member's group key generation is as follows: a fixed fundamental group key is assigned to each group, and random numbers are newly generated for each user and each of his sessions. After applying the hash function to the group key and the random numbers, respectively, reversedly and repeatedly, the hashed group key and random numbers are combined according to the given (developed) equation algorithm. Then, a total of five sub-group keys are produced as each member's group session key.
Hence, every member has different group keys for each session. Nevertheless, every result of authentication (including encryption/decryption) is the same as the result with the fundamental group key under the computation of the developed protocol (equation algorithm).
One of the most important points is that this identical result value between the fundamental group key and all other group keys removes the need for re-keying processes whenever membership changes.
3. VR/AR technique: a new concept of 3D video image mobile security technology solution is proposed. As a member indicator, the 3-dimensional realistic models decided at registration time must be rendered in the log-in process for the user to be authenticated as legitimate [4].
4. VSDS preserves privacy. (1) Anonymity and pseudonymity: every session uses a pseudonym; although perfect anonymity cannot be provided, pseudonymity can. (2) Unlinkability: every session, users log in with different pseudonyms (Pd) and use different encryption keys (each member's group key); consequently, VSDS achieves a level of security similar to 'one-time encryption'. (3) Unobservability: all information is encrypted, and the pseudonym is changed every session by the reversed hash chain [5].
5. Access Control by Cryptographic Techniques and VR/AR Technique
6. VSDS is scalable to other group project systems on the web. The application scenario here concerns patient networks on the web; however, VSDS is extendable to other secure group projects.

1. Related Works and Application

1.1. Related Works


Among the research related to the main methods - cryptographic techniques (especially group key management systems) and a member indicator as a VR/AR technique - only the research area of group key management systems is introduced and reviewed as related work, because the VR/AR technique is applied only to realize the new concept of the security solution.
The research areas concerning group keys are quite various, such as group key agreement, exchange, revocation, and multicast/broadcast, yet this work only focuses on group key application for multi-user settings. In particular, VSDS has a slightly different property from general group keys in that the security requirement of VSDS is not Backward Secrecy (a joining member cannot know any of the previous keys and information) but Backward Accessibility (a joining member can know all the previous keys and information). This is because the goal of the application environment is full information sharing with the present group members. That is why VSDS is related to the research area of search schemes for multi-user settings.
According to [6], Park et al.'s privacy preserving keyword-based retrieval protocols for dynamic groups [7] is the first work on the multi-user setting in secret search schemes. In [7], Park et al. generate each member's group session key based on a reversed hash key chain, and their scheme also satisfies backward accessibility. As for other research based on the reversed hash key chain, there are [3] and [8]. [3] proposed two practical approaches - efficiency and group search in a cloud datacenter - where the authors defined the group search secrecy requirements including backward accessibility. [8] suggested a protocol for designated message encryption for a designated decryptor, so that a server sees only the corresponding message in a cloud service system, based on onion modification and a reversed hash key chain.
As for multi-user setting research not based on the reversed hash key chain, there have been the following works. [6] proposed common secure indices that allow multiple users to securely obtain an encrypted group's documents without re-encrypting them, based on keyword fields, dynamic accumulators, Paillier's cryptosystem and blind signatures. They formally defined the common secure index for conjunctive keyword-based retrieval over encrypted data (CSI-CKR) and its security requirements. The next year, they proposed another scheme for keyword field-free conjunctive keyword searches on encrypted data in the dynamic group setting [9], whereby the authors solved the open problem posed by Golle et al. In [10], Kawai et al. showed a flaw of Yamashita and Tanaka's scheme SHSMG, and they suggested a new concept of Secret Handshake scheme, monotone condition Secret Handshake with Multiple Groups (mc-SHSMG), for members to authenticate each other under a monotone condition. [11] suggested a new effective fuzzy keyword search in a multi-user system over encrypted cloud data; this system supports differential searching privileges based on attribute-based encryption and edit distance, and achieves optimized storage and representation overheads.
In this paper, VSDS generates group session keys for each user, composed of five sub-keys, by using a reversed hash key chain and random numbers. According to the developed encryption/decryption algorithm, the group key removes the need for re-keying processes whenever membership changes happen.

1.2. Application

'PatientsLikeMe' is an online patient network; however, VSDS does not apply to the 'PatientsLikeMe' website directly. The substantial application for VSDS is the web application 'Trello'. The intention is that the proposed system VSDS, applied to Trello with cryptographic and security techniques, can accomplish the goal and functional roles of PatientsLikeMe. Hence, we need to know both websites.
Trello. Trello is a web-based project management application. Generally, the basic service is free, except for a Business Class service. Projects are represented by boards containing lists (corresponding to task lists). Lists contain cards that progress from one list to the next. Users and boards are grouped into organizations. Trello's website can be accessed from most mobile web browsers. Trello is used for various purposes such as real estate management, software project management, school bulletin boards, and so on [12].
PatientsLikeMe. This online patient network has the goal of connecting patients with one another, improving their outcomes, and enabling research. PatientsLikeMe started the first ALS (amyotrophic lateral sclerosis) online community in 2006. Thereafter, the company began adding other communities such as organ transplantation, multiple sclerosis (MS), HIV, Parkinson's disease and so on. Today the website covers more than 2,000 health conditions. The approach is scientific literature review and data sharing with patients to identify outcome measures, symptoms and treatments through answering questions [13].

Figure 1. System Configuration of VSDS

1.3. Application Scenario

Using the web project application 'Trello', the security defense system VSDS is constructed for patients with any disease all over the world, just like 'PatientsLikeMe'. The reasons are: (1) PatientsLikeMe does not cover all kinds of diseases; although the company began adding other communities such as MS and Parkinson's disease, many other patients want to access such a website and be helped more easily. (2) PatientsLikeMe is offered only in English, so patients in non-English speaking regions find it very difficult to sign up and use it. The system is for group members who want to get help through information sharing. The information scope is health conditions and the patient profile. Most of the sensitive information can be shared, but some secret personal data in the patient profile should not be revealed to anyone. One more important point is that the system is a virtualized SDS using 3D image rendering for next generation computing.
The details are as follows: a board is assigned to one group; a list containing cards is assigned to a user; each member uploads his/her conditions or information to a card, and then the information is shared.
VSDS has three parties: Users, SM (Security Manager), and the VSDS Server. SM (Security Manager) is a kind of client that is granted the special role of a security manager. SM is assumed to be a TTP (trusted third party) and is located in front of the VSDS server. SM controls the group key and key-related information, all sensitive information, and all other events, with powerful computational and storage abilities. Fig. 1 shows the system configuration of VSDS. Every user should first register at SM; thereafter they go through the authentication process every session and then start their actions. When some information is shared with other patients (meaning that a shared card is generated), the card is encrypted with the group's encryption key. Only legitimate users (who registered at SM and have stored on their devices the information given by SM for authentication) can pass the authentication processes and learn the shared information. In the last step of the authentication, a 3-dimensional image which is planned in advance is rendered. This image can be called a member indicator.

2. The Construction of VSDS

2.1. Notations
• $K_G$: the fundamental group key of group $G$
• $m$: the number of group $G$'s members; $j$: session number; $i$: each member of group $G$
• $km_i^j$: group session key of member $i$ in the $j$-th session
• $K_{i,1}^j, K_{i,2}^j, K_{i,3}^j, K_{i,4}^j, K_{i,5}^j$: the five subkeys of $i$'s group key $km_i^j$
• $\alpha_i^j$: random number of member $i$ in the $j$-th session
• $pd_i^j$: pseudonym of member $i$ in the $j$-th session
• $h(\cdot)$: hash function; $f(\cdot)$: pseudorandom function
• $C, E$: encryption function; $D$: decryption function
• $V_i^q$: video image information for member $i$ to render at the $q$-th session; $R_V$: a rendered image of $V$

2.2. The Generation Process of Group Key


2.2.1. Group member’s group keys.
We assume that there are $m$ members of the group $G$; the group key of group $G$ is $K_G$, and the group keys of each member $i$ are $km_i^j$ $(1 \le i \le m,\ 1 \le j \le q)$. Here, $j$ is a session number and $q$ is the last session. Each member $i$'s group key $km_i^j$ consists of a total of five subkeys: $K_{i,1}^j, K_{i,2}^j, K_{i,3}^j, K_{i,4}^j, K_{i,5}^j$. We generate a random number $\alpha_i^q$ for these subkeys. Therefore, the last-session group key $km_i^q$ of user $i$ is given by
$K_{i,1}^q = h(K_G)\,\alpha_i^q$,
$K_{i,2}^q = h(K_G)\,f_{K_G}(K_G)\,(1 - \alpha_i^q)$,
$K_{i,3}^q = g^{f_{K_G}(K_G)}$,
$K_{i,4}^q = -(h(K_G) + \alpha_i^q)$,
$K_{i,5}^q = f_{K_G}(K_G)\,\alpha_i^q$.
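The following is a minimal sketch of this five-subkey construction. Since the paper does not fix the primitives, it assumes SHA-256 for $h$, HMAC-SHA256 keyed with $K_G$ for $f_{K_G}$, and a toy multiplicative group modulo an illustrative prime; all concrete parameters and names below are ours, not the paper's.

```python
# Sketch of one member's five sub-group keys (Sec. 2.2.1) under assumed primitives.
import hashlib, hmac, secrets

p = 2**127 - 1          # illustrative prime modulus for the group generated by g
g = 3                   # illustrative generator
ORDER = p - 1           # exponents are reduced modulo p-1

def h(x: bytes) -> int:
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % ORDER

def f(key: bytes, x: bytes) -> int:           # stands in for f_{K_G}(.)
    return int.from_bytes(hmac.new(key, x, hashlib.sha256).digest(), "big") % ORDER

def subkeys(K_G: bytes, alpha: int):
    """Return (K1..K5) for one member and one session, following Sec. 2.2.1."""
    hK, fK = h(K_G), f(K_G, K_G)
    K1 = (hK * alpha) % ORDER                 # K1 = h(K_G) * alpha
    K2 = (hK * fK * (1 - alpha)) % ORDER      # K2 = h(K_G) * f_{K_G}(K_G) * (1 - alpha)
    K3 = pow(g, fK, p)                        # K3 = g^{f_{K_G}(K_G)}
    K4 = (-(hK + alpha)) % ORDER              # K4 = -(h(K_G) + alpha)
    K5 = (fK * alpha) % ORDER                 # K5 = f_{K_G}(K_G) * alpha
    return K1, K2, K3, K4, K5

K_G = b"fundamental-group-key"                # illustrative fundamental group key
alpha_q = secrets.randbelow(ORDER)            # last-session random number alpha_i^q
print(subkeys(K_G, alpha_q))
```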

2.2.2. Group session keys - Reversed hash key chain.


We assume the total number of sessions is $q$. For every member $i$, we generate a different random number $\alpha_i^q$ $(1 \le i \le m)$ for the last session. We then apply the hash function to $\alpha_i^q$ repeatedly, $(q-1)$ times, to generate the random numbers of all sessions as follows:
$\alpha_i^q$ (randomly generated),
$h(\alpha_i^q) = \alpha_i^{q-1}$,
$h(\alpha_i^{q-1}) = \alpha_i^{q-2} = h^2(\alpha_i^q)$,
$h(\alpha_i^{q-2}) = \alpha_i^{q-3} = h^3(\alpha_i^q)$,
.........
$h(\alpha_i^4) = \alpha_i^3 = h^{q-3}(\alpha_i^q)$,
$h(\alpha_i^3) = \alpha_i^2 = h^{q-2}(\alpha_i^q)$,
$h(\alpha_i^2) = \alpha_i^1 = h^{q-1}(\alpha_i^q)$.
Therefore, the first session's random number of member $i$ is $\alpha_i^1$, and the $s$-th session's random number of member $i$ is $\alpha_i^s$, with $h(\alpha_i^{s+1}) = \alpha_i^s = h^{q-s}(\alpha_i^q)$. To obtain member $i$'s group session keys, $\alpha_i^j$ is replaced by $\alpha_i^{j+1}$, $1 \le j \le q-1$, in the member's group key. With these different random numbers, we can produce different group keys for each member and each session, respectively.
The one-way hash function $h(\cdot)$ plays an important role in the group information-sharing system of VSDS. A one-way hash key chain is generated by randomly selecting the last value, which is then repeatedly applied to the one-way hash function $h(\cdot)$; the initially selected value is the last value of the key chain. The one-way hash chain has two properties: 1. Anyone can deduce an earlier value $k_i$ from a later value $k_j$ of the chain by computing $h^{j-i}(k_j) = k_i$. 2. An attacker cannot find a later value $k_j$ from the latest released value $k_i$, because $h^{j-i}(k_j) = k_i$ cannot be inverted. Therefore, these two properties make it possible that a leaving member cannot compute new keys after leaving the group, while any newly joining member can obtain all previous keys and information by applying the hash function $h(\cdot)$ to the current key repeatedly.
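A minimal sketch of this reversed one-way hash key chain, assuming SHA-256 as the hash function $h$; the chain length and byte sizes are illustrative choices, not taken from the paper.

```python
# Sketch of the reversed one-way hash key chain of Sec. 2.2.2 (SHA-256 assumed for h).
import hashlib, secrets

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def build_chain(alpha_q: bytes, q: int):
    """Return [alpha^1, ..., alpha^q], where alpha^{j} = h(alpha^{j+1})."""
    chain = [alpha_q]
    for _ in range(q - 1):
        chain.append(h(chain[-1]))
    chain.reverse()                      # index 0 now holds alpha^1
    return chain

q = 10
alpha_q = secrets.token_bytes(32)        # randomly generated last value
chain = build_chain(alpha_q, q)

# A member holding the session-j value can derive any earlier session value by
# hashing (property 1), but cannot compute a later one (property 2).
j, s = 7, 3
derived = chain[j - 1]
for _ in range(j - s):
    derived = h(derived)
assert derived == chain[s - 1]           # alpha^s = h^{j-s}(alpha^j)
```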

2.2.3. Group members’ pseudonym keys - Reversed hash key chain.


In this scheme, each member's pseudonyms are generated with the reversed hash key chain in the same way as the group session keys. Thus, each member also has $q$ pseudonyms, denoted $pd_i^j$ (for each member $i$, $1 \le j \le q$):
$pd_i^q$ (randomly generated),
$h(pd_i^q) = pd_i^{q-1}$,
$h(pd_i^{q-1}) = pd_i^{q-2} = h^2(pd_i^q)$,
$h(pd_i^{q-2}) = pd_i^{q-3} = h^3(pd_i^q)$,
.........
$h(pd_i^4) = pd_i^3 = h^{q-3}(pd_i^q)$,
$h(pd_i^3) = pd_i^2 = h^{q-2}(pd_i^q)$,
$h(pd_i^2) = pd_i^1 = h^{q-1}(pd_i^q)$.

2.3. Encryption and Decryption with group members’ group key

We assume that the encryption of a message $M$ with the group key $K_G$ is $C = g^{h(K_G) f(K_G)} M$. For simplicity, we write $K_{i,1}^q, K_{i,2}^q, K_{i,3}^q, K_{i,4}^q, K_{i,5}^q$ as $K_1, K_2, K_3, K_4, K_5$ and $f_{K_G}(K_G)$ as $f(K_G)$. Then the encryption with each member's group key $km_i^j$, for example in the last session (i.e. $j=q$, $km_i^q$), is as follows:
$C = E_{km_i^q}(M) = K_3^{K_1} g^{K_2} M = (g^{f(K_G)})^{h(K_G)\alpha_i^q}\, g^{h(K_G) f(K_G)(1-\alpha_i^q)} M = g^{h(K_G) f(K_G)} M$.
We can check that the result of encryption with the group key $K_G$ is the same as with each member's group key $km_i^j$, that is, with $K_1, K_2, K_3, K_4, K_5$.
The decryption with the group key $K_G$ is $D = C \cdot g^{-h(K_G) f(K_G)} = M$. Then the decryption with each member's group key $km_i^q$ in the last session is:
$D = C \cdot K_3^{K_4} g^{K_5} = g^{h(K_G) f(K_G)} M \cdot (g^{f(K_G)})^{-(h(K_G)+\alpha_i^q)} \cdot g^{f(K_G)\alpha_i^q} = g^{h(K_G) f(K_G) - f(K_G) h(K_G) - f(K_G)\alpha_i^q + f(K_G)\alpha_i^q} \cdot M = M$.
We can also check that the result of decryption with the group key $K_G$ is the same as with each member's group key $km_i^j$. Because of the properties of this developed encryption and decryption algorithm, VSDS needs no re-keying processes whenever membership changes happen.

In addition, the pseudorandom function $f_k(\cdot)$ is a simple cryptographic function used for encryption with the secret key $k$.
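The encryption/decryption identity above can be checked numerically. The sketch below instantiates the scheme in a toy group modulo an illustrative prime $p$ with generator $g$, and assumes SHA-256 for $h$ and HMAC-SHA256 for $f_{K_G}$ (the paper does not fix these primitives); it verifies that encrypting with a member's subkeys yields the same ciphertext as encrypting with the fundamental group key, and that decryption recovers $M$.

```python
# Numeric check of the encryption/decryption identity of Sec. 2.3 in a toy group Z_p^*.
import hashlib, hmac, secrets

p = 2**127 - 1
g = 3
ORDER = p - 1

def h(x): return int.from_bytes(hashlib.sha256(x).digest(), "big") % ORDER
def f(k, x): return int.from_bytes(hmac.new(k, x, hashlib.sha256).digest(), "big") % ORDER

K_G = b"fundamental-group-key"
hK, fK = h(K_G), f(K_G, K_G)
alpha = secrets.randbelow(ORDER)                      # a member's session random number

K1 = (hK * alpha) % ORDER
K2 = (hK * fK * (1 - alpha)) % ORDER
K3 = pow(g, fK, p)
K4 = (-(hK + alpha)) % ORDER
K5 = (fK * alpha) % ORDER

M = 1234567890123456789                               # toy message encoded as a group element
C_member = (pow(K3, K1, p) * pow(g, K2, p) * M) % p   # C = K3^K1 * g^K2 * M
C_group  = (pow(g, (hK * fK) % ORDER, p) * M) % p     # C = g^{h(K_G) f(K_G)} * M
assert C_member == C_group                            # same ciphertext as with K_G itself

D = (C_member * pow(K3, K4, p) * pow(g, K5, p)) % p   # D = C * K3^K4 * g^K5
assert D == M                                         # decryption recovers M
```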

2.4. Member Indicator

At the registration stage, SM assigns $j$ $(3 \le j \le t$, where the value of $t$ depends on the condition and policy of the system) 3-dimensional real models to each member $i$ of each group $A$, and the members keep the given $j$ 3D real models for later authentication. Every group has its own particular $j$ models (3D shaped things), respectively. $V_{A,i}^s$ denotes the video image information for the 3D model of member $i$ in group $A$ at the $s$-th session. Every session, SM selects one model from the group's 3D models and challenges the member of the group with $V_{A,i}^s$. Then, the member renders the 3D real model for $V_{A,i}^s$.

2.5. Whole Protocol

2.5.1. Registration.
As the first process, every user should register at the Security Manager (SM). In this registration stage, pseudonyms, group members' group keys, group session keys, and other information including the member indicators are generated for each user so that the system can be used safely.
Then, every user is given some information by SM. Users store it on their own devices, such as a smartphone or PC, and keep the $j$ 3D real models corresponding to $V_{A,i}^s$. The given information for each member $i$ is as follows: $h(E_{km_i^1}(pd_i^1 \| V_i^1))$, $pd_i^1$, $km_i^1$, and $\{h(E_{km_i^j}(pd_i^j)) : 1 \le j \le q\}$.
SM should also store some information for each member: $\alpha_i^q$ and the values of the pseudonym hash key chain $\{h(pd_i^j), pd_i^j : 1 \le j \le q\}$.
Fig. 2 shows the whole process of VSDS starting from registration.

2.6. The Detailed Protocol

[The First Session_Log-in Stage]
1. With the stored values $pd_i^1$ and $km_i^1$, a member $i$ computes $f_{pd_i^1}(km_i^1)$ and $h(pd_i^1)$, and then sends $\{1(s),\ h(pd_i^1),\ f_{pd_i^1}(km_i^1),\ h(E_{km_i^1}(pd_i^1 \| V_i^1))\}$ to SM (step 1 of Fig. 2), where $h(E_{km_i^1}(pd_i^1 \| V_i^1))$ is also a value stored at registration time. Because $km_i^1$ is member $i$'s group key in the first session, $E_{km_i^1}(pd_i^1 \| V_i^1)$ means
$C = E_{km_i^1}(pd_i^1 \| V_i^1) = (K_3^1)^{K_1^1} g^{K_2^1}(pd_i^1 \| V_i^1) = (g^{f(K_G)})^{h(K_G)\alpha_i^1} g^{h(K_G) f(K_G)(1-\alpha_i^1)}(pd_i^1 \| V_i^1) = g^{h(K_G) f(K_G)}(pd_i^1 \| V_i^1)$.
Here, for simplicity, $K_1^1, K_2^1, K_3^1$ denote member $i$'s subkeys of its group key $km_i^1$ in the first session; $K_4^1, K_5^1$ are also subkeys of $km_i^1$.
2. After receiving the information from member $i$, SM checks $1(s)$ and $h(pd_i^1)$ and finds the corresponding values $\alpha_i^q$ and $pd_i^1$ in its storage. Then, with $pd_i^1$, SM decrypts $D(f_{pd_i^1}(km_i^1))$ and obtains $km_i^1$. For the found value $\alpha_i^q$, SM applies the hash function to $\alpha_i^q$ repeatedly, $(q-1)$ times. Obtaining the result $\alpha_i^{1\prime}$, SM computes $km_i^{1\prime}$:
$K_1^{1\prime} = h(K_G)\alpha_i^{1\prime}$, $K_2^{1\prime} = h(K_G) f_{K_G}(K_G)(1-\alpha_i^{1\prime})$, $K_3^{1\prime} = g^{f_{K_G}(K_G)}$, $K_4^{1\prime} = -(h(K_G)+\alpha_i^{1\prime})$, $K_5^{1\prime} = f_{K_G}(K_G)\alpha_i^{1\prime}$.

Figure 2. The Whole Process of VSDS. (The figure gives the message flow of the protocol: Registration; the First Session, with a Log-in Stage of steps 1-6 between the User and SM and an Action Stage of steps 7-9 between the User, the VSDS Server and another member; and the 2nd Session, whose Log-in and Action Stages repeat the same steps with the session-2 pseudonyms, group session keys and video image information.)

 
Then, SM verifies whether $km_i^{1\prime} = km_i^1$. Again, SM computes $h(E_{km_i^{1\prime}}(pd_i^1 \| V_i^1))$ with $km_i^{1\prime}$ and checks whether it is the same as the received value $h(E_{km_i^1}(pd_i^1 \| V_i^1))$. Here, $E_{km_i^1}(pd_i^1 \| V_i^1)$ has the same meaning as in step 1 above.
3. SM computes $\alpha_i^2$ by applying the hash function to $\alpha_i^q$ $(q-2)$ times, and then computes $km_i^2$:
$K_1^2 = h(K_G)\alpha_i^2$, $K_2^2 = h(K_G) f_{K_G}(K_G)(1-\alpha_i^2)$, $K_3^2 = g^{f_{K_G}(K_G)}$, $K_4^2 = -(h(K_G)+\alpha_i^2)$, $K_5^2 = f_{K_G}(K_G)\alpha_i^2$.
SM computes and sends $f_{pd_i^1}(km_i^2, pd_i^2)$ and $f_{pd_i^2}(pd_i^1 \| V_i^1)$. Here, $pd_i^2$ is a stored value.
4. With the value $pd_i^1$, member $i$ decrypts the received value: $D(f_{pd_i^1}(km_i^2, pd_i^2)) = km_i^{2\prime}, pd_i^{2\prime}$. With the obtained values $km_i^{2\prime}, pd_i^{2\prime}$, group member $i$ computes $h(E_{km_i^{2\prime}}(pd_i^{2\prime}))$ and verifies whether this is the same as $h(E_{km_i^2}(pd_i^2))$. Because $km_i^2$ is member $i$'s group key, the encryption method is also the same as in step 1. Then, $i$ hashes the value $pd_i^{2\prime}$ and verifies $h(pd_i^{2\prime}) = pd_i^1$. If the verifications are successful, $km_i^{2\prime}$ and $pd_i^{2\prime}$ become $km_i^2$ and $pd_i^2$.
5. With this $pd_i^2$, group member $i$ also decrypts $D(f_{pd_i^2}(pd_i^1 \| V_i^1)) = pd_i^1 \| V_i^1$. With the decrypted $V_i^1$, $i$ renders $R(V_i^1)$ and then uploads the image of $R(V_i^1)$ to a card.
6. SM verifies whether the rendered card image $R(V_i^1)$ is the same as $R_{V_i^1}$ (the 3D real model). In the first session's verification, the member indicator's authentication is processed. If SM's verification is successful, member $i$ can begin to act (log-in allowed). Actions are uploading, reading (decryption) and downloading.
[The First Session_Action Stage]
7. A member $i$ encrypts a message $M$: $C_i^1 = E_{km_i^1}(M) = (K_{i,3}^1)^{K_{i,1}^1} \cdot g^{K_{i,2}^1} \cdot M = g^{h(K_G) f(K_G)} M$. Then, member $i$ uploads it to his card.
8. Another member $j$ downloads the encrypted message $C_i^1$ from a VSDS board (server).
9. Member $j$ decrypts $C_i^1$ with his first group session key: $D = C_i^1 \cdot (K_{j,3}^1)^{K_{j,4}^1} \cdot g^{K_{j,5}^1} = g^{h(K_G) f(K_G)} M \cdot (g^{f(K_G)})^{-(h(K_G)+\alpha_j^1)} \cdot g^{f(K_G)\alpha_j^1} = g^{h(K_G) f(K_G) - f(K_G) h(K_G) - f(K_G)\alpha_j^1 + f(K_G)\alpha_j^1} \cdot M = M$.
[The Second Session]
From the second session, most processes are similar to the first session. As the session changes, the corresponding pseudonym keys and group session keys also change. As for the video image information $V$ for the 3D real model, a member sends the information $V^1$ kept from the first session to SM, and then SM challenges the member with the newly selected information $V^2$ in the third step. Lastly, the member renders the 3D real model $R(V^2)$ at his card. The action stage is also similar to the first session.
From the third session, all processes go through the same paths as the second session.

3. Discussion

3.1. Efficiency
3.1.1. Strength.
In secure group information-sharing communication, 'group re-keying' is an important task when a user joins or leaves the group: the group keys need to be updated to maintain forward and backward secrecy [14]. However, in the proposed system VSDS, according to the computation of the developed protocol, every result of authentication is the same as the authentication result with the fundamental group key. Therefore, VSDS does not need to perform re-keying on membership changes.
3.1.2. Weakness.
In the last steps of the first session's authentication (5, 6), the 3-dimensional image $V_i^s$ is rendered. $R(V)$ plays the role of a member indicator, which is decided by SM in advance; its purpose is to improve security. If a 3-dimensional image is inefficient in the real world, a 2-dimensional image is recommended instead.
However, Google's project 'Tango' has recently been showcased as an indoor mapping and VR/AR platform [4]. 'Tango' technology makes it possible for a mobile device to measure the physical world. Tango-enabled devices (smartphones, tablets) are used to capture the dimensions of physical space to create 3D representations of the real world. 'Tango' gives the Android device platform a new ability for spatial perception. Therefore, it can be said that the proposal of VSDS is timely, keeping abreast of 'Tango's AR/VR technique for mobile devices.

3.2. Security

VSDS is a reversed hash key chain based group-key management system. Message confidentiality is one of the most important features of secure information sharing for group members. The usual group key security requirements are:
1. Group Key Secrecy: it should be computationally infeasible for a passive adversary to discover any secret group key.
2. Forward Secrecy: any passive adversary with a subset of old group keys cannot discover any subsequent (later) group key.
3. Backward Secrecy: any passive adversary with a subset of subsequent group keys cannot discover any preceding (earlier) group key.
4. Key Independence: any passive adversary with any subset of group keys cannot discover any other group key [3, 15].
However, a group-key based information sharing and service system does not follow all of these requirements, because a new joiner to the group should be able to search all of the previous information in order to be helped. Namely, backward secrecy is not eligible as a security requirement of VSDS. The system VSDS satisfies Group Information-sharing Secrecy as follows.
1. Forward Secrecy: for any group $G_T$ and a dishonest participant $p \in G_T^j$, the probability that participant $p$ can generate a valid group key and pseudonym for the $(j+1)$-th authentication is negligible when the participant knows the group key $km_i^j$ and pseudonym $pd_i^j$, where $p \notin G_T^{j+1}$ and $0 < j < q$. This means that members who leave a group can no longer access any of the group's subsequent information or documents.
2. Backward Accessibility: for any group $G_T$ and a dishonest participant $p \in G_T^j$, the probability that participant $p$ can generate a valid group key and pseudonym for the $(j-l)$-th authentication is $1 - \eta(n)$ ² when the participant knows the group key $km_i^j$ and pseudonym $pd_i^j$, where $p \notin G_T^{j-l}$ and $0 < l < j$. Namely, all members joining a group can access all of the previous information or documents of the group.
² The term negligible function refers to a function $\eta : \mathbb{N} \rightarrow \mathbb{R}$ such that for any $c \in \mathbb{N}$ there exists $n_c \in \mathbb{N}$ such that $\eta(n) < 1/n^c$ for all $n \ge n_c$ [16].

3. Group Key Secrecy: for any group $G_T$ and a dishonest participant $p$ who knows a set of initial knowledge - the group fundamental key $K_{G_T}$ and one member $i$'s group key $km_i^1$ - the probability that participant $p$ can correctly guess the encrypted information message $M$ of group $G_T$ at the $j$-th session is negligible. It must be computationally infeasible for a dishonest participant $p$ to know or correctly guess the contents of the encrypted message even if a leaving member or another member of the group reveals his group keys.

4. Conclusion
VSDS has been proposed for patients all over the world who want to get help and share information, as on the website 'PatientsLikeMe'. The system guarantees security and privacy, because most health and private information is sensitive. Moreover, VSDS is scalable to other groups' project applications with safety. Finally, it is firmly believed that the identified problems between next generation collaborative computing and security, together with the approaches to them, should be managed as an Integrated Security Management (ISM).

References

[1] H.A.Park, Secure Chip Based Encrypted Search Protocol In Mobile Office Environments, International
Journal of Advanced Computer Research, 6(24), 2016
[2] Y.Hu, A.Perrig, D.B.Johnson, Efficient security mechanisms for routing protocols, In the proceedings of
Network and Distributed System Security Symposium (2003), 57-73
[3] H.A.Park, J.H.Park, and D.H.Lee, PKIS: Practical Keyword Index Search on Cloud Datacenter,
EURASIP Journal on Wireless Communications and Networking, 2011(1), 84(2011), 1364-1372
[4] G.Sterling, Google to showcase Project Tango indoor mapping and VR/AR platform at Google I/O,
http://searchengineland.com/google-showcase-project-tango-indoor-mapping-vrar-platform-google-io-
249629, 2016
[5] H.A.Park, J.Zhan, D.H.Lee, PPSQL: Privacy Preserving SQL Queries, In the Proceedings of ISA(2008),
Taiwan, 549-554
[6] P.Wang, H.Wang, and J.Pieprzyk, Common Secure Index for Conjunctive Keyword-Based Retrieval over
Encrypted Data, SDM 2007 LNCS 4721(2007), 108-123
[7] H.A.Park, J.W.Byun, D.H.Lee, Secure Index Search for Groups, TrustBus 05 LNCS 3592(2005), 128-
140
[8] H.A.Park, J.H.Park, J.S.Kim, S.B.Lee, J.K.Kim, D.G.Kim, The Protocol for Secure Cloud-Service Sys-
tem. In the Proceedings of NISS(2012), 199-206
[9] P.Wang, H.Wang, and J.Pieprzyk, Keyword Field-Free Conjunctive Keyword Searches on Encrypted
Data and Extension for Dynamic Groups, CANS 2008 LNCS 5339(2008), 178-195
[10] Y.Kawai, S.Tanno, T.Kondo, K.Yoneyama, N.Kunihiro, K.Ohta, Extension of Secret Handshake Proto-
cols with Multiple Groups in Monotone Condition. WISA 2008 LNCS 5379(2009), 160-173
[11] J.Li, X.Chen, Efficient multi-user keyword search over encrypted data in cloud computing, Computing
and Informatics 32 (2013), 723-738
[12] http://lifehacker.com/how-to-use-trello-to-organize-your-entire-life-1683821040
[13] https://www.patientslikeme.com/
[14] R.V.Rao, K.Selvamani, R.Elakkiya, A secure key transfer protocol for group communication, Advanced
Computing: An International Journal, 3(2012), 83-90
[15] A.Gawanmeh, S.Tahar, Rank Theorems for Forward Secrecy in Group Key Management Protocols, In
the Proceedings of 21st AINAW(2007), 18-23
[16] D.Boneh, B.Waters, Conjunctive, Subset, and Range Queries on Encrypted Data, In the Proceedings of
4th TCC(2007), 535-554
220 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-220

Implicit Feature Identification in Chinese Reviews Based on Hybrid Rules
Yong WANG1, Ya-Zhi TAO, Xiao-Yi WAN and Hui-Ying CAO
Key laboratory of electronic commerce and logistics of Chongqing, Chongqing
University of Posts and Telecommunications, Chongqing 400065, China

Abstract. In most existing text-mining schemes for customer reviews, explicit features are usually considered while implicit features are ignored, which probably leads to incomplete or incorrect results. In fact, it is necessary to consider implicit features in customer review mining. Focusing on the identification of implicit features, a novel scheme based on hybrid rules is proposed, which mixes statistical rules, dependency parsing and conditional probability. Explicit product features are first extracted according to the FP-tree method and clustered. Then, association pairs are obtained based on the dependency parsing method and the product of frequency and PMI. Finally, implicit features are identified by considering the association pairs and the conditional probability of verbs, nouns and emotional words. The proposed scheme is tested on a public cellphone review corpus. The results show that our scheme can effectively find implicit features in customer reviews. Therefore, our research can obtain more accurate and comprehensive results from customer reviews.

Keywords. network reviews, implicit feature, comment mining, association pair extraction, conditional probability

Introduction

Today, e-commerce websites contain a large number of consumer reviews about products. On one hand, potential consumers can decide whether to buy a product after reading the product reviews; on the other hand, reviews are helpful for manufacturers to improve product design and quality. However, it is impossible for people to read all reviews by themselves, because the number of reviews is huge. Therefore, review mining has emerged as the times require and has become a significant application field. Feature identification, containing explicit feature identification and implicit feature identification, is a core step in review mining. If a feature appears in a review directly, it is defined as an explicit feature; similarly, if a feature does not appear in a review but is implied by other words, it is defined as an implicit feature [1]. A sentence which contains explicit features is defined as an explicit sentence, and a sentence which contains implicit features is defined as an implicit sentence. Wang et al. [2] counted the Chinese reviews they crawled and discovered that at least 30 percent of the sentences are implicit sentences. Thus, it can be seen that implicit features play a significant role in review mining.
1
Corresponding Author: Yong WANG, Chongqing University of Posts and Telecommunications, No.2
Chongwen Road, Nan’an District, Chongqing City, China; E-mail: wangyong1@cqupt.edu.cn.

In recent years, some scholars have been studying implicit feature extraction. In most proposals, implicit features are identified on the basis of emotional words. Qiu et al. [3] proposed a novel approach to mine implicit features based on the k-means clustering algorithm and F2 statistics. Hai et al. [4] identified implicit features via co-occurrence association rule (CoAR) mining. Zeng et al. [5] proposed a classification-based method for implicit feature identification. Zhang et al. [6] used an explicit multi-strategy property extraction algorithm and similarity to detect implicit features. Furthermore, Wang et al. [7] proposed a hybrid association rule mining method to detect implicit features.
To identify implicit features, we propose a novel scheme based on hybrid rules, which consists of three different methods. Compared with previous research results, the presented scheme has two advantages: (1) by considering the semantic association degree and the statistical association degree together, we can get more accurate <feature cluster, emotional word> association pairs; (2) in Chinese reviews, some emotional words, such as "good" or "bad", can qualify more than one feature, so it is not accurate to consider only the association between emotional words and features. To solve this problem, the association between verbs, nouns and features is also considered.

1. Scheme Design

Figure 1 depicts the framework of our scheme which is composed of several parts.

Figure 1. Scheme framework.

1.1. Explicit Features Extraction and Clustering

In this stage, explicit features are extracted. The detailed steps are as follows:
• Do word segmentation and POS (part-of-speech) tagging for the reviews via ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). Then, nouns and noun phrases from the annotated corpus of comments are stored in a transaction file.
• Frequent itemsets obtained by the FP-tree method are regarded as candidate explicit features I0.
• Candidate explicit features I1 are obtained after pruning all single words in I0.
• According to Chinese semantic and grammatical knowledge, a rule covering frequent words that are not features is established. This rule is used to filter I1 to obtain candidate explicit features I2. The rule covers:
appellation nouns, such as "friend", "classmate", etc.;
colloquial nouns, such as "machine", etc.;
product names, such as "cellphone", "computer", etc.;
abstract nouns, such as "reason", "condition", etc.;
collective nouns, such as "people", etc.
• The PMI algorithm [8] is used to measure the association between the product and each feature in I2. The final explicit features are obtained after filtering out the features whose PMI value is smaller than a threshold. The PMI value is calculated as follows:
$PMI(feature) = \log_2 \frac{hit(\text{product and } feature)}{hit(\text{product})\, hit(\text{feature})}$ (1)
where $hit(x)$ is the number of pages returned by the Baidu search engine when using $x$ as a keyword; the threshold is set to -3.77, which is determined from experimental sample data. (A minimal sketch of this filtering step follows this list.)
• The similarity between features is calculated using Tongyici Cilin [9]. Features are clustered into one group if their similarity value is 1. Once the explicit feature clusters are obtained, one feature is chosen as the representative feature of the cluster it is in.
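As a rough illustration of the PMI filter, the sketch below applies Eq. (1) with made-up hit counts standing in for the Baidu page counts; only the formula and the threshold of -3.77 come from the text.

```python
# Sketch of the PMI-based filtering of candidate explicit features (Sec. 1.1, Eq. (1)).
import math

THRESHOLD = -3.77                          # threshold from the paper

def pmi(joint_hits: float, product_hits: float, feature_hits: float) -> float:
    # PMI(feature) = log2( hit(product and feature) / (hit(product) * hit(feature)) )
    return math.log2(joint_hits / (product_hits * feature_hits))

product_hits = 10.0                        # illustrative count for the product name
candidates = {                             # feature -> (joint hits, feature hits), illustrative
    "screen":  (4.0, 5.0),
    "price":   (6.0, 8.0),
    "weather": (1.0, 20.0),                # a frequent word that is not a product feature
}

explicit_features = [f for f, (joint, feat) in candidates.items()
                     if pmi(joint, product_hits, feat) >= THRESHOLD]
print(explicit_features)                   # "weather" falls below the threshold and is removed
```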

1.2. Explicit association pairs <explicit feature cluster, emotional word> extraction

We use the dependency parsing method and the frequency*PMI method to judge whether feature clusters and emotional words can form association pairs, from the two aspects of semantics and statistics. The details are as follows:
• Extract emotional words in explicit sentences: adjectives whose POS tags are "/a" or "/an" in explicit sentences are extracted as emotional words.
• Calculate frequency*PMI between emotional words and explicit feature clusters. The frequency*PMI formula is as follows:
$frequency*PMI\langle f, w\rangle = P_{f\&w} \cdot \log_2 \frac{P_{f\&w}}{P_f P_w}$ (2)
where $w$ is the emotional word, $f$ is the feature cluster, and $P_f$ is the probability of the feature $f$ occurring in explicit sentences. The formulas for $P_f$ and $P_{f\&w}$ are as follows:
$P_f = \sum_{i=1}^{n} P_{f_i}$ (3)
$P_{f\&w} = \sum_{i=1}^{n} Co\_occurrence(f_i, w) / R$ (4)
where $n$ is the number of features in a feature cluster, $f_i$ is the $i$-th feature in the feature cluster $f$, $Co\_occurrence(f_i, w)$ is the number of co-occurrences of $f_i$ and $w$ in explicit sentences, and $R$ is the number of explicit sentences. (A minimal sketch of this score follows this list.)
• Use syntax analysis tools to obtain all dependency relationships in the sentences. If an "nsubj" relationship exists between a feature cluster and an emotional word, there is a modifying relation between them. If a feature in a feature cluster has a modifying relation with an emotional word, we consider that the feature cluster has a modifying relation to the emotional word.
• Set a threshold p. The association pairs with a frequency*PMI value larger than p, or with a frequency*PMI value smaller than p but an existing modifying relation, are chosen as the final association pairs. The value of p in this paper is -0.00009.
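The sketch below illustrates Eqs. (2)-(4) and the final pair-selection rule. The feature probabilities, co-occurrence counts and the dependency flag are illustrative placeholders; the threshold p = -0.00009 is the one stated above.

```python
# Sketch of the frequency*PMI association-pair score of Sec. 1.2 (Eqs. (2)-(4)).
import math

def freq_pmi(cooccur_per_feature, feature_probs, p_w, R):
    """frequency*PMI<f,w> = P_{f&w} * log2( P_{f&w} / (P_f * P_w) )."""
    p_f  = sum(feature_probs)                             # Eq. (3)
    p_fw = sum(cooccur_per_feature) / R                   # Eq. (4)
    return p_fw * math.log2(p_fw / (p_f * p_w))           # Eq. (2)

R = 1870                                                  # number of explicit sentences (Sec. 2.1)
score = freq_pmi(cooccur_per_feature=[30, 12],            # co-occurrences of f_1, f_2 with w
                 feature_probs=[60 / R, 25 / R],          # P_{f_1}, P_{f_2}
                 p_w=120 / R,                             # P_w of the emotional word
                 R=R)

P_THRESHOLD = -0.00009                                    # threshold p from the paper
has_nsubj_relation = True                                 # from dependency parsing (illustrative)
is_association_pair = score > P_THRESHOLD or has_nsubj_relation
print(round(score, 5), is_association_pair)
```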

1.3. Implicit Features Identification

At present, most research only considers emotional words in implicit sentences. Different from them, we identify implicit features by considering emotional words, verbs and nouns. The detailed steps are as follows:
• Analyze the elements of the implicit sentence and make two judgments. The first judgment is whether emotional words are in the implicit sentence. The second judgment is whether verbs or nouns are in the implicit sentence.
• There are four cases in terms of the two judgments: Y1 represents that association pairs are found via emotional words in the implicit sentence; N1 is the opposite of Y1. Y2 represents that verbs or nouns are in the implicit sentence; N2 is the opposite of Y2. (A minimal sketch of the scoring used in the first case follows this list.)
Y1Y2
Step 1: extract the emotional words in the implicit sentence; the candidate association pairs containing these emotional words are then obtained. The feature clusters in the candidate association pairs are treated as candidate feature clusters.
Step 2: the verbs and nouns in the implicit sentence are extracted and treated as the notional word set. Then, we calculate each candidate feature cluster's conditional probability under the condition of these words. We define the calculation formula as follows:
$P(f \mid word_j) = \frac{Co\_occurrence(f, word_j)}{count(word_j)}$ (5)
where $word_j$ is the $j$-th word in the notional word set and $f$ is a candidate feature cluster. Then we define $f$'s average conditional probability as follows:
$T(f) = \sum_{j=1}^{v} P(f \mid word_j) / v$ (6)
where $v$ is the size of the notional word set.
Step 3: we define the score of each candidate feature cluster as follows:
$Score\langle f, w\rangle = \alpha \cdot (frequency*PMI\langle f, w\rangle) + (1 - \alpha)\, T(f)$ (7)
where $\alpha$ is a weight coefficient, set to 0.7 after several experiments. Then the representative feature of the feature cluster which is in the association pair with the highest score is chosen as the implicit feature.
Y1N2
Step 1 is the same as the first step of case Y1Y2.
Step 2: the representative feature of the feature cluster which is in the candidate association pair with the highest frequency*PMI value is chosen as the implicit feature.
Y2N1
Step 1: the verbs and nouns in the implicit sentence are extracted and treated as the notional word set. Then, we use Eqs. (5) and (6) to calculate all explicit feature clusters' average conditional probabilities under the condition of these words.
Step 2: the representative feature of the feature cluster which is in the association pair set with the highest score is chosen as the implicit feature.
N1N2
The implicit feature cannot be identified.
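The sketch below illustrates the Y1Y2 scoring of Eqs. (5)-(7) on made-up candidate clusters; the counts and frequency*PMI values are placeholders, and only the weight 0.7 and the formulas follow the text.

```python
# Sketch of implicit-feature scoring when emotional words and verbs/nouns are both present.
ALPHA = 0.7                                    # weight coefficient from the paper

def avg_cond_prob(cooccur, counts):
    """T(f) = (1/v) * sum_j Co_occurrence(f, word_j) / count(word_j)   (Eqs. (5)-(6))."""
    probs = [c / n for c, n in zip(cooccur, counts)]
    return sum(probs) / len(probs)

def score(freq_pmi_fw, t_f):
    """Score<f,w> = alpha * frequency*PMI<f,w> + (1 - alpha) * T(f)    (Eq. (7))."""
    return ALPHA * freq_pmi_fw + (1 - ALPHA) * t_f

# Candidate feature clusters found via the emotional word in the implicit sentence:
# cluster -> (frequency*PMI with the emotional word,
#             co-occurrences with the sentence's verbs/nouns, counts of those verbs/nouns)
candidates = {
    "battery": (0.031, [18, 9], [40, 25]),
    "screen":  (0.045, [2, 1], [40, 25]),
}
best = max(candidates,
           key=lambda f: score(candidates[f][0],
                               avg_cond_prob(candidates[f][1], candidates[f][2])))
print(best)    # the representative feature of the best-scoring cluster is the implicit feature
```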

2. Experiment Evaluation

2.1. Data Set and Evaluation

Six hundred reviews about one kind of cell phone were downloaded from a public website called Datatang.com. In order to evaluate the performance of the scheme, the data set was manually annotated. In the data set, there are 1870 explicit sentences and 413 implicit sentences. Three traditional metrics, precision, recall and F-measure, are used to evaluate the performance of the scheme.
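For reference, a small helper for the three metrics is given below; the counts passed in are illustrative values chosen to roughly reproduce the explicit-feature figures reported in Section 2.2, not data taken from the corpus itself.

```python
# Precision, recall and F-measure as used for evaluation in Sec. 2.1.
def precision_recall_f1(true_positives: int, predicted: int, relevant: int):
    precision = true_positives / predicted          # correct extractions / all extractions
    recall    = true_positives / relevant           # correct extractions / all annotated items
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Illustrative counts: ~63 correct features out of 89 extracted and ~86 annotated.
print(precision_recall_f1(true_positives=63, predicted=89, relevant=86))
```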

2.2. Experimental Results and Comparison

89 explicit product features are obtained by the method described in Section 1.1. The top 5 features of greatest concern to customers are shown in Table 1. The precision of this method is 70.8%, the recall is 73.3% and the F-measure is 72%. 1285 association pairs are extracted from the explicit sentences by the approach described in Section 1.2. Five association pairs are shown in Table 2. As seen from the tables, the performance of the approach is good.
Table 1. Top 5 product features.
rank   feature        PMI        frequency
1      intelligence    0.0        14
2      software       -0.10005    42
3      number         -0.44418    30
4      screen         -0.6529    194
5      price          -0.79837    34
Table 2. Association pairs.
Implicit features are identified by the approach described in Section 1.3. Table 3 shows a partial result. Comparing with Ref. [4] on the same data, the results are given in Table 4.
Table 3. Partial results of implicit feature identification.
Implicit sentence                          implicit feature
900 Ma is difficult to meet the needs      battery
too expensive                              price
very slow and very troublesome             reaction
Very beautiful                             appearance
shape looks like hard                      appearance
Table 4. Comparative results.
Evaluation index   our scheme   Ref. [4]
precision          67.49%       41.55%
recall             65.86%       37.53%
F-measure          66.67%       39.44%

It can be seen from the above tables that the proposed algorithm is far superior to the algorithm in [4], so our scheme can better meet the needs of practical applications. The algorithm proposed in this paper takes both statistical analysis and semantic analysis into account, which can find more associations between emotional words and explicit feature clusters. The research in [4] only focused on mining product features from the statistical point of view. Therefore, our method has advantages in performance.

3. Conclusion

Implicit features in customer reviews have an important effect on text mining results and are an important factor for customers or enterprises in making wise decisions. In this paper, we proposed a scheme combining several rules to extract implicit features, from word segmentation through to identification. Compared with conventional methods, our scheme not only obtains the associations between emotional words and product features based on statistics and semantics, but also considers the effect of emotional words, verbs and nouns on the final results. Experimental results show that our scheme lays a good basis for the application of network review mining.

Acknowledgments

This work is supported by National Natural Science Foundation of China (61472464),


Natural Science foundation of CQ CSTC (cstc2015jcyjA40025), Social Science
Planning Foundation of Chongqing (2015SKZ09), and Social Science Foundation of
CQUPT (K2015-10).

References

[1] B. Liu, M. Hu, J. Cheng. Opinion observer: analyzing and comparing opinions on the web. In: Proceedings of the 14th International Conference on World Wide Web (WWW'05), ACM, New York, NY, USA, 2005, 342-351.
[2] H. Xu, F. Zhang, W. Wang. Implicit feature identification in Chinese reviews using explicit topic mining model. Knowledge-Based Systems, 76(2014):166-175.
[3] Y. F. Qiu, X. F. Ni, L. S. Shao. Research on extracting method of commodities implicit opinion targets. Computer Engineering and Applications, 51(2015):114-118.
[4] Z. Hai, K. Chang, J.-j. Kim. Implicit feature identification via co-occurrence association rule mining. In: Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science, 6608(2011), 393-404.
[5] L. Zeng, F. Li. A Classification-Based Approach for Implicit Feature Identification. In: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. Springer Berlin Heidelberg, 2013:190-202.
[6] L. Zhang, X. Xu. Implicit Feature Identification in Product Reviews. New Technology of Library and Information Service, 2015, (12):42-47.
[7] W. Wang, H. Xu, and W. Wan. Implicit feature identification via hybrid association rule mining. Expert Systems with Applications, 40(2013):3518-3531.
[8] K. W. Church, et al. Word association norms, mutual information and lexicography. In: Proceedings of the 27th Annual Conference of the Association of Computational Linguistics, New Brunswick, NJ: Association for Computational Linguistics, 1989: 76-83.
[9] J. L. Tian, W. Zhao. Words Similarity Algorithm Based on Tongyici Cilin in Semantic Web Adaptive Learning System. Journal of Jilin University (Information Science Edition), 28(2011):602-608.
226 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-226

Characteristics Analysis and Data Mining of Uncertain Influence Based on Power Law
Ke-Ming TANG a, Hao YANG a,b,1, Qin LIU c, Chang-Ke WANG c, and Xin QIU a
a School of Information Engineering, Yancheng Teachers University, Yancheng, Jiangsu, China
b School of Software and TNList, Tsinghua University, Beijing, China
c School of Foreign Language, Yancheng Teachers University, Yancheng, Jiangsu, China

Abstract. Research on traditional cascade events, such as avalanches and sandpile models, has only studied the power-law distribution over the whole time span. In fact, the speed of virus propagation differs in each time period. In this paper, through empirical observations we find that the number of infected people behaves as a power law for Guinea, Liberia and Sierra Leone, respectively, over different time periods. Therefore, the government could use the different power exponents of the number of infected people, i.e. the spread of the disease in different periods, when planning the speed of drug manufacturing.

Keywords. Ebola, avalanche, sandpile model

Introduction

A common phenomenon of ''avalanche'' [1-2] or ''cascade failure'' [3-7] has attracted much attention for a long time: the event undergoes a chain reaction and often gives rise to catastrophes or disasters. Typical examples include snow avalanches [8], landslide avalanches [9], and cascade failures in power grids [2, 6, 7].
The Ebola epidemic wreaking havoc in West Africa has had a global ripple effect. Because the characteristics of Ebola are not well understood, the disease has alarmed the global public health community and caused panic among some segments of the population. The ongoing Ebola epidemic in West Africa has affected the United States and other Western countries, and the phenomenon of ''avalanche'' would take place all over the world in the absence of effective methods of eradicating Ebola, which motivates analyzing its propagation characteristics.
Research on traditional cascade events, such as avalanches and sandpile models, has only studied the power-law distribution over the whole time span. This means that large-scale avalanches occur occasionally in the process of evolution, while various small-scale avalanches appear more and more often and their number satisfies a power-law distribution.

1
Corresponding Author: Hao YANG, School of Information Engineering, Yancheng Teachers
University, Yancheng, Jiangsu, China; School of Software and TNList, Tsinghua University, Beijing, China;
E-mail: classforyc@163.com.

In fact, the speed of virus propagation differs in each time period. For Guinea, Liberia and Sierra Leone, we make empirical observations about the spread of the disease over different time periods.

1. The measurement of the Spread of the Disease Based on Power Law Model

We suppose the number of sand grains increases with $n$. One possibility is that the number of grains per unit length of the cycles, $\lambda$, is constant. Put $k/2$ grains ($k = 2\pi\lambda \cot\theta$, an even number) on the 0-th phase and $k$ grains on the 1st. The number of grains on the $n$-th cycle should then be $nk$. Likewise, we assume a falling grain undergoes an inelastic collision with the resting grains, so that the grains slide together after the collision. We arrange the sliding grains so that $(n^2 - n + 1)$ grains evenly meet $2n$ resting grains on the $n$-th phase. There are $(n^2 - n + 1)k$ grains in the $n$-th generation ($b_n = (n^2 - n + 1)k$, $d_n = 0$), and then $N(t) \sim n(t)^2 \sim t^{2\times 2} \sim t^4$ [16].
We model the population with susceptible people (S), latent people (L), infected people (I) and dead people (D). The transformation among the four nodes is shown in Figure 1.

Figure 1. The transformation of the four nodes.


Now we can study the different equations which show the virus spread on the basic
of the related knowledge in this paper. According to sandpile model’s analysis, the
equation about epidemic trend is as follows (with the acceleration of the number of
infected people B, and the acceleration of the number of dead people …′. A is a constant
which related to B in the axis, ′ is a constant which related to …′ in the axis):
ˆ(:(‰)) ˆ(K(‰))
ˆ ‰
=… ˆ‰
= …′ (1)
‹  ‹ 
c(L) = L ∙ 10 9(L) = L ∙ 10 (2)
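A minimal sketch of the log-log fitting implied by Eqs. (1)-(2): it estimates A and B by least squares on log10-transformed data and reports the Pearson correlation R; the (t, I) pairs are synthetic placeholders, not the WHO/CDC case counts used in the paper.

```python
# Sketch of power-law parameter estimation on log-log scale: lg I(t) = B*lg t + A.
import math

def fit_power_law(ts, counts):
    xs = [math.log10(t) for t in ts]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    B = sxy / sxx                          # power exponent (slope on log-log scale)
    A = my - B * mx                        # intercept, so I(t) ~ 10^A * t^B
    R = sxy / math.sqrt(sxx * syy)         # Pearson correlation coefficient
    return A, B, R

ts     = [10, 30, 60, 100, 150, 200]
counts = [round(2.0 * t ** 1.3) for t in ts]      # synthetic data following I(t) ~ 2 * t^1.3
print(fit_power_law(ts, counts))                  # B should come out close to 1.3
```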

2. Model Evaluations

We collected data about the number of all the cases and the number of the people
infected from a website (http://www.cdc.gov/vhf/ebola/outbreaks/2014-west-africa/
whats-new.html). In our experiments, a linear function was fitted to the linear ranges of the
log-log plotted distributions to estimate the value of the power exponent. Figures 2-4 show
distributions of I for Guinea, Liberia and Sierra Leone respectively (with values of the
Pearson correlation coefficient R and the standard deviation SD). Our method considers the
number of infected people from February 4, 2014 to March 25, 2015.
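As a concrete illustration of this fitting procedure, the following is a minimal sketch (not the authors' code): it estimates the power exponent B, the intercept A, the Pearson correlation coefficient R and the standard deviation SD from one time window; the arrays t and infected are illustrative placeholders for a time window such as those in Figures 2-4.

import numpy as np

def fit_power_law(t, counts):
    """Fit log10(counts) = B*log10(t) + A on the linear range of a log-log plot."""
    log_t = np.log10(np.asarray(t, dtype=float))
    log_c = np.log10(np.asarray(counts, dtype=float))
    B, A = np.polyfit(log_t, log_c, 1)            # slope = power exponent B, intercept = A
    R = np.corrcoef(log_t, log_c)[0, 1]           # Pearson correlation coefficient
    SD = (log_c - (B * log_t + A)).std(ddof=1)    # standard deviation of the residuals
    return A, B, R, SD

# Synthetic data following I(t) = 10**0.4 * t**1.3 for 0 < t < 100
t = np.arange(1, 100)
infected = 10**0.4 * t**1.3
print(fit_power_law(t, infected))                 # B close to 1.3, R close to 1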
The Log-log plots of the number of the people infected and dead are demonstrated;
a) Log-log plot of the number of the people infected

Figure 2. Log-log plots of the number of the people infected for Guinea: (a) 0<t<100, (b) 100<t<229, (c) 229<t<318.

Figure 3. Log-log plots of the number of the people infected for Liberia: (a) 2<t<100, (b) 100<t<267, (c) 267<t<318.

Figure 4. Log-log plots of the number of the people infected for Sierra Leone: (a) 64<t<271, (b) 271<t<318.

b) Log-log plot of the Number of the People Dead

Figure 5. Log-log plots of the number of the people dead for Guinea: (a) 0<t<100, (b) 100<t<318.

Figure 6. Log-log plots of the number of the people dead for Liberia: (a) 0<t<100, (b) 100<t<318.

Figure 7. Log-log plot of the number of the people dead for Sierra Leone.

Table 1. Values of R(Pearson correlation coefficient), B(Power exponent) and SD(Standard deviation) for
log-log plot of the number of the people infected for three countries.
Guinea Liberia Sierra Leone
t 0-100 100-229 229-318 2-100 100-267 267-318 64-271 271-318
R 0.96003 0.98086 0.93949 0.88703 0.9855 0.98225 0.97084 0.98431
SD 0.06164 0.06282 0.01599 0.18816 0.10405 0.00326 0.15811 0.00396

Table 2. Values of R (Pearson correlation coefficient) and SD (standard deviation) for the log-log plot of the
number of the people dead for the three countries.
Guinea Liberia Sierra Leone
t 0-100 100-318 2-100 100-318 64-318
R 0.95263 0.9909 0.79757 0.94552 0.96776
SD 0.06817 0.03521 0.15197 0.17199 0.15719

Figures 2-4 show the distributions of the number of people infected for the three
countries and Figures 5-7 show the distributions of the number of people dead for
the three countries. We use R and SD to illustrate the feasibility of our model (the values
of the Pearson correlation coefficient R and the standard deviation SD are given in Tables 1-2).
If 0.95 is taken as a minimal reliable value, we can state a power law for the infected and
dead people.
Through the above analysis, we obtain power-law relations for how the number of people
infected or dead changes over time. The values of A and B are shown in Tables 3-4
respectively. Of course, the correlation between these parameters and the number of people
is not straightforward; we simply report the numerical results.

Table 3. Values of A and B for the people infected for the three countries.
       Guinea                        Liberia                       Sierra Leone
t   0-100     100-229   229-318   2-100     100-267   267-318   64-271    271-318
A   0.36666   3.1281    0.99661   0.82688   5.1708    0.73092   4.14169   1.01616
B   1.3384    -4.46411  0.53758   -0.18821  -8.61879  1.87354   -6.20484  1.336

Table 4. Values of A and B for the people dead for the three countries.
Guinea Liberia Sierra Leone
t 0-100 100-318 2-100 100-318 64-318
A 0.37032 1.92976 0.45961 3.71858 3.59267
B 1.61866 -1.52369 0.4246 -5.38156 -5.30488

3. Conclusion

According to our model, the transmission speed of the virus is slow at the beginning, but it
accelerates after a period, which should draw people's attention to the virus and prompt
relevant measures to prevent its spread; the speed then decreases relatively. Our power-law
model is shown to be reasonable by using a simplified sandpile model and analyzing the
empirical data. Data on latent people could not be collected, so we only analyze the data of
infected and dead people in the model when planning drug production. Our experiments
suggest it is a realistic, sensible, and useful model that can be applied to help eradicate Ebola.

Acknowledgements

This work is supported by the National High Technology Research and Development
Program (863 Program) of China (2015AA01A201), National Science Foundation of
China under Grant No. 61402394, 61379064, 61273106, National Science Foundation of
Jiangsu Province of China under Grant No. BK20140462, Natural Science Foundation of
the Higher Education Institutions of Jiangsu Province of China under Grant No.
14KJB520040, 15KJB520035, China Postdoctoral Science Foundation funded project
under Grant No. 2016M591922, Jiangsu Planned Projects for Postdoctoral Research
Funds under Grant No. 1601162B, JLCBE14008, and sponsored by Qing Lan Project.

References

[1] P. Bak, C. Tang, K. Wiesenfeld, Self-organized criticality, Physical review A, 38 (1988): 364.
[2] M. L. Sachtjen, B.A. Carreras, V.E. Lynch, Disturbances in a power transmission system, Physical Review
E, 61(2000): 4877.
[3] A. E. Motter, Cascade control and defense in complex networks, Physical Review Letters, 93(2004)
098701.
[4] J. Wang, L.-L. Rong, L. Zhang, Z. Zhang, Attack vulnerability of scale-free networks due to cascading
failures, Physical A, 387(2008): 6671.
[5] S.V. Buldyrev, R. Parshani, G. Paul, et al. Catastrophic cascade of failures in interdependent networks,
Nature, 464(2010): 1025.
[6] R. Parshani, S.V. Buldyrev, S. Havlin, Interdependent networks: reducing the coupling strength leads to a
change from a first to second order percolation transition, Physical Review Letters, 105(2010): 048701.
[7] T. Zhou, B.H. Wang, Maximal planar networks with large clustering coefficient and power-law degree
distribution, Chinese Physics Letters, 22 (2005): 1072.
[8] K. Lied, Avalanche studies and model validation in Europe, European research project SATSIE (EU
Contract no. EVG1-CT2002-00059), 2006.
[9] D.A. Noever, Himalayan sandpiles, Physical Review E, 47(1993): 724.
232 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-232

Hazardous Chemicals Accident Prediction


Based on Accident State Vector Using
Multimodal Data
Kang-Wei LIU 1, Jian-Hua WAN a and Zhong-Zhi HAN a
a School of Geosciences, China University of Petroleum, Qingdao 266580, China
b Sinopec Safety Engineering Institute, Qingdao, Shandong, 266071, China

Abstract. The hazardous chemicals industry is a high-risk industry in which all kinds of
explosions, fires, leaks and poisoning incidents occur occasionally, so it is particularly
important to forecast hazardous chemical accidents and develop appropriate safety measures.
In view of the analysis and summary of previous methods, an improved hazardous chemicals
accident prediction method based on the accident state vector is proposed in this paper. It
defines the accident state vector using multi-modal data such as authoritative data, accident
reports, webpages, images, video, speech, etc. The multi-modal data are collected by a web
crawler built with open-source tools; the web crawler is an Internet bot that systematically
browses known hazardous chemical accident websites for the purpose of collecting
multi-modal accident data. As mentioned, the multi-modal data come in multiple formats. In
order to define the accident state vector easily, we divide the multi-modal data into three
dimensions based on the principle of accident causes, namely human factors, physical state
factors and environmental factors. According to the geometrical distribution characteristics
of support vectors, the samples most likely to become support vectors can be selected from
the incremental samples to form a boundary vector set by adopting a vector-distance
pre-extraction method, on which the support vector training and the accident prediction
model are built. The validity of the predictive model is ensured because the various factors
causing an accident are fully considered by the accident state vector, and the advantages of
support vector machines in machine learning on high-dimensional, multi-factor, large-sample
datasets are exhibited. Experimental verification on the collected hazardous chemical
accident samples has shown that the hazardous chemical accident prediction method proposed
in this paper can effectively accumulate accident history information, achieves a higher
learning speed, and is of positive significance for the safe development of the hazardous
chemicals industry.

Keywords. hazardous chemical accidents, Support Vector Machine, accident pre-


diction, accident state vector

Introduction

The hazardous chemicals industry, represented by the petrochemical industry, is a high-risk
industry with perilous characteristics such as high temperature and high pressure, inflammable
and explosive materials, poisonous and harmful substances, continuous operation, and long
and wide industrial chains. At present, the safety production situation of hazardous

1
Corresponding Author: Kang-Wei LIU, Engineer of Sinopec Safety Engineering Institute, No339,
Songling Road, Qingdao, Shandong, China ; E-mail: liukw.qday@sinopec.com.

chemicals is very grim, with all kinds of explosion, fire, leakage and poisoning accidents
occurring from time to time. According to statistics, there are more than 96,000 chemical
enterprises in China, of which more than 24,000 are dangerous chemicals production
enterprises, and there are more than 100,000 species of chemicals, yet more than 4,600
chemical accidents have occurred in nearly a decade. As installations become large-scale and
intensive, material and economic losses occur whenever an accident happens, and death and
disability in particular lead to the loss of healthy life. Therefore, it is particularly important to
forecast hazardous chemical accidents and develop appropriate safety measures on this basis.
Accident prediction uses known information and data to forecast the safety of the object of
interest, as shown in Figure 1. The accident prediction method has gradually become a hot
topic among scholars in recent years, as the change trend and hidden security dangers of
accidents can be analyzed through such methods. According to incomplete statistics, there
are now more than 300 kinds of forecasting methods, and the development of modern
forecasting methods is often accompanied by cross-analysis and mutual penetration of
different methods, so it is difficult to classify them absolutely. The current common accident
prediction methods can be summarized into six types: situational analysis, regression
prediction, time prediction, Markov chain prediction, gray prediction and nonlinear
prediction. The establishment and algorithmic improvement of the model is often the
emphasis of traditional accident prediction methods, while the collection and organization of
prior accident data is frequently overlooked. Limited by the difficulty of prior data collection
and the complexity of the model, accident prediction models are usually based on a small
number of factors with an artificially assumed strong causal relationship to hazardous
chemical accidents, such as the number of accidents, the death toll and the amount and type
of hazardous chemicals, which ultimately leads to incomplete and inaccurate forecasting
results.
Figure 1. Establishment of the accident prediction model: prior data of accidents → accident prediction model.
The Support Vector Machine (SVM) was developed by Vapnik and co-workers [1] and is
an excellent machine learning method. SVMs have empirically been shown to give good
generalization performance on a wide variety of problems. SVM is an implementation of
statistical learning theory: it pursues not only accuracy on the training sample but also
considers the complexity of the learning space on the basis of the training accuracy; namely,
it adopts a compromise between spatial complexity and sample learning precision so that the
resulting models possess good generalization ability for unknown samples.
In view of the analysis and summary of previous methods, an improved hazardous chemicals
accident prediction method based on the Support Vector Machine is proposed in this paper.
It defines the accident state vector from the three dimensions of human factors, physical
state factors and environmental factors, based on the principle of accident causes. According
to the geometrical distribution characteristics of support vectors, the samples most likely to
become support vectors can be selected from the incremental samples to form a boundary
vector set by adopting a vector-distance pre-extraction method, on which the support vector
training and the accident prediction model are built. The validity of the predictive model is
ensured because the various factors of the cause of the accident are fully considered by the
accident state vector, and the advantages of support vector machines in machine learning on
high-dimensional, multi-factor, large-sample datasets are exhibited.

1. Overview of SVM

The core of SVM is finding a hyperplane that separates a set of positive examples from
a set of negative examples with maximum margin [1,2,3]. The training of a Support Vector
Machine can be reduced to maximizing a convex quadratic program subject to linear
constraints. Given a training sample
{(xi, yi) | i = 1, …, l; xi ∈ Rⁿ, yi ∈ {+1, -1}},
for the linearly separable case the goal of SVM is to find a hyperplane
<w, x> + b = 0
which divides the sample set exactly. However, there is always more than one such
hyperplane. The hyperplane with the largest margin between the two kinds of samples - the
optimal classification hyperplane - attains the best generalization capacity. The optimal
hyperplane is determined only by the samples closest to it and does not depend on the other
samples. These samples are the so-called support vectors, which is also the origin of the term
"support vector" [4,5,6,7].
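As a minimal illustration of these notions (an assumption on our part: scikit-learn is used here for convenience, whereas the paper's experiments use the LibSVM toolbox), a linear SVM on toy data exposes the hyperplane parameters and its support vectors:

import numpy as np
from sklearn.svm import SVC

# Two linearly separable point clouds
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+2.0, 1.0, (50, 2)), rng.normal(-2.0, 1.0, (50, 2))])
y = np.array([+1] * 50 + [-1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("w =", clf.coef_[0], " b =", clf.intercept_[0])   # hyperplane <w, x> + b = 0
print("support vectors per class:", clf.n_support_)     # only these determine the hyperplane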

2. The Definition of Accident State Vector

Accident causation theory is used to explain the causes of accidents, their development
process and their consequences, so that the occurrence and development of accidents can be
analyzed definitively. It comprises accident mechanisms and models extracted from a large
number of typical accidents; it reflects the regularity of accidents, provides a scientific and
complete theoretical basis for accident prediction and prevention, and improves safety
management work owing to its capacity for quantitative and qualitative analysis of accident
causes.
In accordance with accident causation theory, the insecure behaviour of human beings,
the insecure state of objects and the insecure impact of the environment can all lead to the
occurrence of accidents, so an accident can be described by three categories of indicators:
subjective evaluation indicators (human factors), objective inherent indicators (physical
factors) and environmental indicators (environment factors), as shown in Figure 2.

Figure 2. Accident causation theory (P: human factors, D: physical state factors, E: environmental factors).



(1) Subjective evaluation indicator (human factors)


Subjective evaluation indicators are judged and scored regularly by the enterprise. Assuming
that the number of subjective evaluation indicators is m, they can be represented as an
m-dimensional vector P (People):
P = {P1, P2, P3, …, Pm}
Subjective evaluation indicators mainly involve the acquisition of safety indicators that are
impossible to quantify or cannot be extracted automatically, such as "Is the training for special
equipment operation and maintenance in place?" or "Is the security regulatory behaviour in
place?"; these indicators need to be evaluated and scored regularly by enterprise personnel.
(2) Objective inherent indicator (physical factors)
The objective inherent indicators refer to the enterprise's inherent risk level. Assuming that
the number of objective inherent indicators is n, they can be expressed as an n-dimensional
vector D (Device):
D = {D1, D2, D3, …, Dn}
Objective inherent indicators can be obtained automatically, for instance "chemicals
production", "number of major hazard installations", "fire and explosion indicator of
hazardous substances", "chemical material toxicity indicator", etc.
(3) Environmental indicator (environment factors)
Local climate and weather, the geographic and geological environment, the frequency of
natural disasters, the regulation level of government and social events are usually included
in the environmental indicators. In short, everything not classified into the former two
categories pertains to the environmental indicators, in order to meet the fault-tolerance
requirement of big data. These indicators correspond to a t-dimensional vector E:
E = {E1, E2, E3, …, Et}
In conclusion, the accident state vector can be defined as follows:
accident state vector A = { P, D, E }
where the human vector P = {P1, P2, P3, …, Pm}, the physical state vector D = {D1,
D2, D3, …, Dn} and the environmental vector E = {E1, E2, E3, …, Et}.
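As a toy illustration (the values and dimensions below are made up, not from the paper), the accident state vector is simply the concatenation of the three indicator groups:

import numpy as np

P = np.array([0.8, 0.6, 0.9])          # m human-factor indicators, scored by the enterprise
D = np.array([120.0, 3.0, 0.7, 0.4])   # n objective inherent (physical) indicators
E = np.array([0.2, 0.5])               # t environmental indicators

A = np.concatenate([P, D, E])          # accident state vector of dimension m + n + t
print(A.shape)                         # (9,)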

3. Training Algorithm Based on Accident State Vector

A hyperplane can be established by the Support Vector Machine (SVM) through learning
from accident state vectors, and an unknown accident state vector can then be forecast via
this hyperplane, thus forming the accident prediction model [9,10]. As mentioned above, not
all vectors contribute to the establishment of the prediction hyperplane; only a small number
of training samples, called support vectors, function during SVM training and learning, and
these are distributed in the neighbourhood of the hyperplane in the geometric space [11,12].
So we should, as far as possible, choose the samples that may become support vectors for
learning. Therefore, this paper presents a Support Vector Machine (SVM) training algorithm
based on the accident state vector (the ASV-SVM algorithm).
We can describe the incremental learning algorithm with support vector machines as
follows:
Let M denote the historical sample set and N the incremental sample set, and suppose that
M ∩ N = ∅. Let Φ₀ be the initial SVM classifier and SV₀ the corresponding support vector
set of M. Obviously SV₀ ⊆ M; the learning target is to find the classifier Φ and the
corresponding support vector set SV of M ∪ N.
Based on the geometric character of support vectors, determining whether a sample can
become a support vector should consider two quantities: one is the distance between the
sample and the hyperplane; the other is the distance between the sample and the center of
its own class [13,14,17]. So we do our best to select the samples likely to become support
vectors as the new training set. There may be samples that will become support vectors in
both M and N. Samples that are close to the separating hyperplane and lie between the class
center and the hyperplane are selected as newly-increased samples; samples whose distance
to the hyperplane is less than the distance from their class center to the hyperplane form the
edge sample set T. The union of the previous support vector set and T is taken as the final
training set of incremental learning.
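A sketch of this vector-distance pre-extraction rule, under our reading of the criterion above (illustrative NumPy code, not the authors' implementation), could look as follows:

import numpy as np

def select_edge_samples(X, y, w, b):
    """Keep samples whose distance to the hyperplane <w, x> + b = 0 is smaller than
    the distance from their class center to that hyperplane (edge sample set T)."""
    w = np.asarray(w, dtype=float)
    norm_w = np.linalg.norm(w)
    dist = np.abs(X @ w + b) / norm_w                 # sample-to-hyperplane distances
    keep = np.zeros(len(X), dtype=bool)
    for label in (+1, -1):
        mask = (y == label)
        center = X[mask].mean(axis=0)                 # class center
        center_dist = abs(center @ w + b) / norm_w    # center-to-hyperplane distance
        keep[mask] = dist[mask] < center_dist
    return X[keep], y[keep]                           # edge sample set T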

4. Experimental Results

We apply this algorithm to the establishment of the model for predicting hazardous
chemicals accidents. We compare the ASV-SVM algorithm with the traditional SVM
learning algorithm and the KNN (k-Nearest Neighbor) algorithm. A simple description of
the three algorithms is as follows:
Classical SVM algorithm: the traditional SVM algorithm, which combines the original
samples and the newly-increased samples and re-learns on all of the training samples.
Classical KNN algorithm: KNN is a memory-based method; prediction on a new
instance is performed using the labels of similar instances in the training set.
ASV-SVM algorithm: the proposed algorithm, which selects support vector candidates
based on vector distance for incremental learning.
In this experiment, the accident state vector is defined by Multimodal data. The
method is as follows:
(1) Collect and maintain the data of 619 typical hazardous chemicals accidents occurring
within the last ten years, including the accident report, accident cause analysis, accident
consequence and influence.
(2) Crawl data related to these accidents, such as the weather conditions, geographical
situation and population density when the accident happened, using a web crawler built with
open-source tools; the web crawler is an Internet bot that systematically browses known
hazardous chemical accident websites for the purpose of collecting multi-modal accident data.
(3) In order to enable a comparative test, we collected two to three sets of non-accident
status data at other times for each place where an accident occurred; 1288 non-accident
state records were formed in this way.
(4) The data collected above are in multiple formats, such as authoritative data, accident
reports, webpages, images, video and speech. In order to define the accident state vector
easily, we divide the multi-modal data into three dimensions based on the principle of
accident causes. We use open-source big data tools, combined with manual screening, to
structure the data, adding as many attribute labels as possible to each record so that the
non-structured data become structured data. Frankly, for unstructured data such as video and
images, most of this work is done by manual recognition, since the accuracy and availability
of automatic machine recognition are not yet satisfactory.
(5) After the structuring process, all the multi-modal data have many attribute labels.
These attribute labels are categorized into the three dimensions based on the principle of
accident causes, namely human factors, physical state factors and environmental factors.
(6) The dimension of the accident state vector is determined to be 265, with each attribute
label representing one dimension: the human vector P (185 dimensions), covering leadership
and safety culture, safety, process safety information for process control, inspection and
human performance; the physical state vector D (49 dimensions), covering the fire index of
hazardous substances, the explosive index, the toxicity index, the process index, the
equipment index, the safety facility index, etc.; and the environment vector E (31 dimensions),
composed of the meteorological index (We) and the geographic information index (Gi).
(7) Transfer the accident states and the non-accident states into vector form as follows [15]:
<label> <index1>:<value1> <index2>:<value2> … <indexn>:<valuen>
The label is the result of the accident state: 1 denotes an accident state and -1 a non-accident
state. Each index is an attribute label and each value is the weight or description of that
attribute label, and n is equal to 265.
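For concreteness, two made-up records in this sparse <index>:<value> notation might look as follows (the first token is the accident/non-accident label; the indices and values are purely illustrative, and zero-valued attributes may be omitted):

1 1:0.8 2:0.6 186:120.0 187:3 235:0.7 265:0.2
-1 1:0.3 2:0.9 186:45.0 187:1 235:0.1 265:0.6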
(8) From these, 1000 vectors are selected as the test set, 1000 vectors are used as the initial
training set, and the remaining 907 vectors are randomly divided into 3 sets used as
incremental sets.
After this pretreatment, the accident information is transferred into vector form. We then use
the three algorithms to perform the learning. All of the algorithms are carried out with the
LibSvm-mat-5.20 package [16]. The experimental platform is an E7-4830 V2 machine running
Windows Server 2012. In the experiment, the kernel is the RBF function with C=1. The results
of the experiment are shown in Tables 1-3.
Table 1. Classical SVM algorithm experiment results

             Incremental samples   Training samples   Time/s   Precision
Initial set  1000                  1000               185.1    -----
set1         247                   1247               243.6    93.6%
set2         438                   1685               528.9    93.1%
set3         222                   1907               612.8    92.7%

Table 2. Classical KNN algorithm experiment results

             Incremental samples   Training samples   Time/s   Precision
Initial set  1000                  1000               412.7    -----
set1         247                   1247               514.6    78.9%
set2         438                   1685               702.6    79.4%
set3         222                   1907               858.1    83.2%



Table 3. ASV-SVM algorithm experiment results

             Incremental samples   Training samples   Time/s   Precision
Initial set  1000                  1000               185.1    ---
set1         247                   453                163.2    92.1%
set2         438                   737                348.1    94.2%
set3         222                   805                393.3    94.1%


We can see that only in the initial training is the ASV-SVM algorithm similar to the
traditional SVM algorithm and the KNN algorithm; in the subsequent incremental learning
its performance is better than the other classical algorithms. The number of training samples
is reduced and the training time is shortened while the accuracy is not lost, as shown in
Figures 3 and 4 below.
The ASV-SVM algorithm is superior to the traditional SVM learning algorithm and the
KNN algorithm in the number of training samples. In the process of incremental learning,
the ASV-SVM algorithm screens the newly-increased and original samples effectively, thus
reducing the number of training samples while preserving the effective information in the
samples. The training time of the ASV-SVM algorithm decreases greatly compared with the
classical SVM and KNN learning algorithms. The decrease in the number of training samples
controls the scale of the incremental learning well, thus shortening the training time without
losing useful information. The precision of the ASV-SVM algorithm is slightly better than
that of the SVM learning algorithm, and much better than that of the KNN algorithm, as
shown in Figure 5.

Figure 3. Contrast of training samples

Figure 4. Contrast of training time



Figure 5. Contrast of classification precision

5. Conclusion

The hazardous chemicals industry is a high-risk industry in which explosion, fire, leakage
and poisoning accidents occur frequently. This paper analyzes the influence of human
factors, physical factors and environmental factors on the occurrence of hazardous chemicals
accidents, and defines the accident state vector from these three dimensions. In view of the
analysis and summary of previous methods, an improved hazardous chemicals accident
prediction method based on the accident state vector (ASV-SVM) is proposed. The
high-dimensional vector is used to define the accident state, so that as many relevant factors
as possible are considered. Using the improved support vector machine learning algorithm
(the ASV-SVM algorithm), an accident prediction model is established from the accident
state vectors. A sample test on hazardous chemical accidents shows that the method proposed
in this paper can differentiate accident states accurately and efficiently, and is of positive
significance for the accident prediction of hazardous chemicals.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant
No. 31201133).

References

[1] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-verlag, New York, (2000),332-350..
[2] N. Cristianini, J. Shawe-Talor. An Introduction to Support Vector Machines and Other Kernel-based
Learning Methods. Cambridge University Press, (2004), 543-566
[3] R. Xiao, J.C. Wang, Z.X. Sun, An Incremental SVM Learning Algorithm. Journal of Nanjing Univer-
sity (Natural Sciences) 38(2002), 152-157.
[4] N Ahmed, S Huan, K Liu, K Sung, Incremental learning with support vector machines. The International
Joint Conference on Artificial Intelligence, Morgan Kaufmann publishers, 10 1999), 352-356.
[5] P. Mitra, C. A. Murthy, S. K. Pal, Data Condensation in Large Databases by Incremental Learning with
Support Vector Machines. Proceeding of International Conference on Pattern Recognition, (2000), 2708-
2711.

[6] C. Domeniconi and D. Gunopulos Incremental Support Vector Machine Construction. Proceeding of
IEEE International Conference on Data Mining series (ICDM ), (2001),589-592.
[7] G. Cauwenberghs , T. Poggio, Incremental and Decremental Support Vector Machine Learning. Ad-
vances in Neural Information Processing Systems,(2000),122-127.
[8] S. Katagiri , S. Abe, Selecting Support Vector Candidates for Incremental Training. Proceeding of IEEE
International Conference on Systems, Man, and Cybernetics (SMC), (2005),1258-1263,.
[9] D. M. J. Tax, R. P. W. Duin, Outliers and Data Descriptions. Proceeding of Seventh Annual Conference
of the Advanced School for Computing and Imaging, (2001),234-241.
[10] L.M. Manevitz and M. Yousef, One-class SVMs for document classification. Journal of Machine Learn-
ing Research, 2 (2001), 139-154.
[11] R. Debnath, H. Takahashi, An improved working set selection method for SVM decomposition method.
Proceeding of IEEE International Conference Intelligence Systems, Varna, Bulgaria, 21-24(2004), 520-
523.
[12] R. Debnath, M. Muramatsu, H.Takahashi, An Efficient Support Vector Machine Learning Method with
Second-Order Cone Programming for Large-Scale Problems. Applied Intelligence, 23(2005), 219-239.
[13] W D.Zhou, L.Zhang, L.C.Jiao, An Analysis of SVMs Generalization Performance. Acta Electronica
Sinica. 29(2001),590-594
[14] J. Heaton, Net-Robot Java programme guide. Publishing House of Electronics Industry. 22(2002) 1-
141.
[15] C.W. Hsu C.J. Lin A simple decomposition method for support vector machines. Machine Learning,
46(2002) 291–314.
[16] C.C. Chang , C. Lin, LIBSVM : a library for support vector machines, 2001. Software available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
[17] C.H. Li, K.W. Liu, H.X. Wang. The incremental learning algorithm with support vector machine based
on hyperplane distance, Applied Intelligence, 46(2009):145-152
Fuzzy Systems and Data Mining II 241
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-241

Regularized Level Set for Inhomogeneity


Segmentation
Guo-Qi Liu a,1, Hai-Feng Li b
a School of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China
b College of Mathematics and Information Science, Henan Normal University, Xinxiang 453007, China

Abstract. The active contour model based on the level set method is a popular
method for image segmentation. However, intensity inhomogeneity universally exists
in images and greatly influences segmentation. The local binary fitting (LBF) model
is an effective method to cope with inhomogeneous intensity, but its energy functional
is non-convex and its computational cost is high; moreover, LBF cannot preserve weak
edges. The non-convexity often causes the contour to get trapped in local minima. In
order to cope with these shortcomings, we introduce a regularized minimization for an
improved LBF model in which the edge information is integrated into the energy
functional. The energy functional of the improved LBF model is convex, so local
minima are avoided and fast optimization methods can be utilized. In this paper, a
regularized method is utilized to make the contour converge to the minimizer.
Experimental results confirm that the proposed method attains a segmentation effect
similar to that of LBF but requires less computation time.
Keywords. intensity inhomogeneity, level set, global minimization,
computation times

Introduction

Image segmentation plays an important role in image processing and computer


vision. Level set method [1-4] is a popular algorithm with the competitive advan-
tages in computational robustness and flexibility of topology changes. In general,
there are two types of level set models. One is the level set method based on
global information and the other is based on the local information. In the models
based on global information, Chan and Vese (C-V) [5] is one of the most pop-
ular methods, whose foreground and background usually have obvious different
intensity means.

1 Corresponding Author: Guoqi Liu; School of Computer and Information Engineering, Henan

Normal University; XinXiang, 453007; E-mail: liuguoqi080408@163.com.



However, the C-V model has difficulty in dealing with intensity inhomogeneity
or intensity non-uniformity. Intensity inhomogeneity [6-11], which often appears in
medical images, degrades partition effectiveness and leads to inaccurate target
location. Therefore, intensity inhomogeneity segmentation methods based on level
sets or active contours have sprung up in the past years. Among the models based
on local information, Li has derived the local binary fitting (LBF) model [6]. By
incorporating local image information into the model, images with intensity
inhomogeneity can be effectively segmented. However, LBF cannot keep weak edges
and its computational cost is relatively large. Some researchers have also proposed
similar methods to improve performance in dealing with inhomogeneous intensity,
such as Zhang [10]. Generally, the models based on local information have better
performance in dealing with inhomogeneous intensity because local intensity
inhomogeneity can be decreased by local filtering. In order to improve segmentation
efficiency and keep the true target contour, we extend
the LBF model. Our paper is organized as follows. In Section 1, we review
the background. In Section 2, a method is proposed to enhance the former LBF
version. Section 3 shows the experimental results and makes comparisons with
LBF. Section 4 makes a summary of this paper.

1. Background

1.1. The C-V model

The energy functional of C-V model is defined as follows:


  
E(C, c1, c2) = μ ∫_C ds + ∫_Ω1 |I − c1|² dx + ∫_Ω2 |I − c2|² dx    (1)

where Ω1 and Ω2 are the regions of foreground and background respectively, whose
intensity means are c1 and c2. C represents the zero level curve and I is the image
intensity. The first term is a curve-length regularization term with weight μ, and the
last two terms are data fitting terms. Eq. (1) depends on the curve C and the intensity
means c1 and c2, which can be solved by the variational method and a gradient descent
equation. Therefore, by representing the contour with a level set function φ, the
evolution equation is computed as follows:

∂φ/∂t = −((I − c1)² − (I − c2)² + μK) δ_ε    (2)

where K is the curvature of the curve C and δ_ε represents the regularized Dirac
function with parameter ε.

1.2. The local binary fitting model (LBF)

A data fitting energy is defined in LBF [6], which locally approximates the
image intensities on the two sides of the contour. This energy is then incorporated
into a variational level set formulation, and a curve evolution equation is derived

for energy minimization. Intensity information in local regions is extracted to


guide the motion of the contour, which thereby enables LBF model to cope with
intensity inhomogeneity. The local binary fitting energy is defined as follows:


e = ∫ ∑_{i=1}^{2} λi ei(x) dx    (3)

 
e1(x) = ∫_Ω Kσ(y − x)|I(y) − f1(x)|² H dy,   e2(x) = ∫_Ω Kσ(y − x)|I(y) − f2(x)|² (1 − H) dy    (4)

Kσ serves as a kernel function; f1(x) and f2(x) are computed as follows:

f1(x) = [Kσ(x) ∗ (I(x)H)] / [Kσ(x) ∗ H],   f2(x) = [Kσ(x) ∗ (I(x)(1 − H))] / [Kσ(x) ∗ (1 − H)]    (5)
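As an aside, a minimal NumPy/SciPy sketch of Eq. (5) (our own illustration, assuming H is the smoothed Heaviside/membership image and sigma is the kernel width) computes f1 and f2 with Gaussian filtering:

import numpy as np
from scipy.ndimage import gaussian_filter

def local_means(I, H, sigma, eps=1e-8):
    """Local fitting means of Eq. (5): ratios of Gaussian-smoothed quantities."""
    f1 = gaussian_filter(I * H, sigma) / (gaussian_filter(H, sigma) + eps)          # inside
    f2 = gaussian_filter(I * (1.0 - H), sigma) / (gaussian_filter(1.0 - H, sigma) + eps)  # outside
    return f1, f2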

The total energy functional E of LBF is obtained by integrating the length


regularization term in the above energy. Length regularization term is defined as
follows:
 
L(φ) = ∫_C ds = ∫ |∇φ| dx    (6)

Then the evolution equation of level set function φ is computed as follows:

∂φ/∂t = −∂E/∂φ = −δ_ε(φ)(e1 − e2) + λ δ_ε(φ) div(∇φ/|∇φ|)    (7)

where ∇ is the gradient operator, div(.) is divergence operator.

2. Regularized method for improved LBF

2.1. Improved LBF

The energy functional of LBF is defined as follows:

E LBF (φ) = e(φ) + λL(φ) (8)

The first term is a data fitting term, which is written as


 
e(φ) = ∫ e1 H(φ) dx + ∫ e2 (1 − H(φ)) dx    (9)

Similar to [12], the evolution equation of LBF is also computed by mini-
mizing the following energy functional:
 
E = ∫ λ|∇φ| dx + ∫ (e2 − e1)φ dx    (10)

Because of the non-convexity of the above energy, we propose to minimize


the following improved energy functional:
 
E = ∫ λ g|∇u| dx + ∫ (e2 − e1)u dx    (11)

where g = 1/(1 + |∇I|²) is the edge stopping function, u is the characteristic function
and 0 ≤ u ≤ 1. Since the above energy is convex but constitutes a constrained
minimization problem, an unconstrained convex energy is obtained by introducing an
exact penalty function:

E(u, f1, f2, λ, α) = λ TVg(u) + ∫_Ω [(e2 − e1)u + α pf(u)] dx    (12)

where the parameter α is a constant, and pf(ξ) := max{0, 2|ξ − 1/2| − 1} is a


penalty function.

2.2. Regularized minimization algorithm for improved LBF

In order to obtain the solution of the energy functional (12), the regularized
method is utilized in this letter. By introducing a variable v, the regularized
energy functional is computed as follows:

E(u, v, f1, f2, λ, α) = λ TVg(u) + (μ/2)‖u − v‖²_F + ∫_Ω [(e2 − e1)v + α pf(v)] dx    (13)

where ‖·‖_F denotes the Frobenius norm and μ is a constant. The following steps
are used to solve Eq. (13). First, the iterative solution of u is obtained by fixing
v, f1 and f2 . A fast numerical minimization based on the dual formulation of the
TV energy is presented in [12-15]. According to [13], the solution of u is given by

u = v − (1/μ) div p    (14)

where p = (p1, p2) is a dual variable; it is computed by

g(x) ∇((1/μ) div p − v) − |∇((1/μ) div p − v)| p = 0.    (15)

The above equation can be solved by a fixed point method, which is given in [13].
Similarly, fixing u, v is obtained by minimizing the following energy:

v = argmin_v (μ/2)‖u − v‖²_F + ∫_Ω [(e2 − e1)v + α pf(v)] dx    (16)

Finally, v is iteratively computed as follows:

v = min(max(u − (λ/μ)(e2 − e1), 0), 1)    (17)

u and v are computed iteratively. The algorithm is as follows:



Algorithm 1 Regularized algorithm for improved LBF.


Input: initial value u0 , v0 , B, and p;
for k = 0 to maximum number of iterations do
for i = 1 to B do
u_i^k = v^k − (1/μ) div p
obtain p based on g(x)∇((1/μ) div p − v) − |∇((1/μ) div p − v)| p = 0.
end for
u^k = conv(Gaussian, u_B^k)
f1(x) = [Kσ(x) ∗ (I(x)u)] / [Kσ(x) ∗ u] and f2(x) = [Kσ(x) ∗ (I(x)(1 − u))] / [Kσ(x) ∗ (1 − u)]
v^k = min(max(u^k − (λ/μ)(e2 − e1), 0), 1)
if ‖u^(k+1) − u^k‖_F < δ then
return u^k;
end if
end for
Output: the segmentation contour u = 0.5.
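A condensed NumPy sketch of the two alternating updates, under our reading of Eqs. (14)-(17), is given below; the finite-difference operators, the step size tau and the iteration counts are illustrative choices, not the authors' implementation.

import numpy as np

def grad(u):
    """Forward differences with Neumann boundary conditions."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Discrete divergence, the (negative) adjoint of grad."""
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]; dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]; dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]; dy[1:-1, :] = py[1:-1, :] - py[:-2, :]; dy[-1, :] = -py[-2, :]
    return dx + dy

def u_update(v, g, mu, n_iter=20, tau=0.25):
    """Weighted-TV step (Eqs. 14-15): fixed-point iteration on the dual variable p."""
    px = np.zeros_like(v); py = np.zeros_like(v)
    for _ in range(n_iter):
        gx, gy = grad(div(px, py) / mu - v)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        px = (px + tau * gx) / (1.0 + tau * norm / g)
        py = (py + tau * gy) / (1.0 + tau * norm / g)
    return v - div(px, py) / mu

def v_update(u, e1, e2, lam, mu):
    """Closed-form clipped update of Eq. (17)."""
    return np.clip(u - (lam / mu) * (e2 - e1), 0.0, 1.0)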

Figure 1. The tested original images

Figure 2. Segmentation results of inhomogeneous intensity in medical images.

3. Experimental results and analysis

We have tested our algorithm on images with inhomogeneous intensity. Figure 1


demonstrates three images and the initialization curves. The left image is 131 × 103, the
middle image is 110 × 111 and the right image is 96 × 127.

Figure 3. Segmentation results of object with weak edges.

Table 1. Quantitative evaluation of the computation times for the typical images

Image             1          2          3
LBF               1.928899   2.276515   4.688177
LIF               1.256878   1.753792   3.798538
Proposed method   0.898745   1.356784   2.897653

The intensity of the foreground and background in these three images is inhomogeneous.
Segmentation results are provided in Figures 2 and 3, in which the first column is computed
by Li's algorithm and the second column by our method. In Figure 2, the medical images
are tested; the results of LBF and of the proposed method are similar.
Furthermore, the proposed method utilizes the gradient information of the image edges.
Thus, it has better performance in keeping weak edges compared with the LBF model. As
shown in Figure 3, a gray image is tested in which some of the edges of the object are weak.
LBF suffers from leakage at the weak edges and only the strong edges are extracted, whereas
the proposed method converges to the weak edges, since the edge information g is integrated
into the proposed energy functional and can preserve weak edges.
On the other hand, the proposed method is more efficient than LBF and LIF. All the
experiments are conducted using MATLAB R2010a on a PC with an Intel Core CPU (3.3 GHz,
4 cores) and 8 GB memory under Windows 7 Professional, without any particular code
optimisation. In Algorithm 1, the proposed method, based on image decomposition, iterates
to obtain u, which can decrease the number of evolution iterations. The computation times
are shown in Table 1. From the table, the proposed method takes less time to converge to
the objects than LBF, because the proposed algorithm iterates several times to obtain u
before iterating v, and this process enhances the non-smooth component of the image, which
decreases the total number of iterations.

4. Conclusions

In this paper, we first introduce the C-V model and the LBF model. Then, we propose
our model to improve the efficiency of contour evolution. There are two contributions in
our work: one is that an energy functional with edge information is added to LBF; the
other is the introduction of a fast algorithm to obtain the solution. Experimental results
confirm that the proposed method obtains similar segmentation results and keeps weak
edges, while achieving faster evolution of the contour.

Acknowledgements

This work is jointly supported by National Natural Science Foundation of China


(No. U1404603).

References

[1] V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours, International journal
of computer vision, 22(1997), 61-79.
[2] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, and A. Yezzi: Gradient flows and
geometric active contour models, Proc. 5th Int. Conf. Comput. Vis., 1995, 810-815.
[3] R. Kimmel, A. Amir, and A. Bruckstein. Finding shortest paths on surfaces using level set
propagation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(1995),
635-640.
[4] R. Malladi, J. A. Sethian, and B. C.Vemuri. Shape modeling with front propagation: A
level set approach, IEEE Transactions on Pattern Analysis and Machine Intelligence,
17(1995), 158-175.
[5] T. Chan and L. Vese. Active contours without edges, IEEE Transactions on Image Pro-
cessing, 10(2) (2001), 266-277
[6] C. Li, C. Kao, J. C. Gore, and Z. Ding. Minimization of region-scalable fitting energy for
image segmentation, IEEE Transactions on Image Processing, 17(2008), 1940-1949.
[7] C. Li, Huang R., Ding Z., Gatenby C., Metaxas DN., Gore JC. A level set method for
image segmentation in the presence of intensity inhomogeneities with application to MRI,
IEEE Transactions on Image Processing, 20(7) (2011), 2007-2016.
[8] X.F. Wang, H. Min. A level set based segmentation method for images with intensity in-
homogeneity, Emerging Intelligent Computing Technology and Applications, with Aspects
of Artificial Intelligence, 2009, 670-679.
[9] F.F. Dong, Z.S. Chen and J.W. Wang, A new level set method for inhomogeneous image
segmentation, Image and Vision Computing, 31(2013), 809-822.
[10] K.H. Zhang, H.H. Song and L. Zhang, Active contours driven by local image fitting energy,
Pattern recognition, 43(2010), 1199-1206.
[11] C. Li, C. Xu, C. Gui, MD. Fox, Level set evolution without re-initialization: A new vari-
ational formulation, Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005,
430-436.
[12] A. Chambolle, An algorithm for total variation minimization and applications, Journal of
Mathematical Imaging and Vision, 20(2004), 89-97.
[13] X. Bresson, S. Esedoglu, P. Vandergheynst, et al. Fast global minimization of the active
contour /snake model, Journal of Mathematical Imaging and Vision, 28(2007), 151-167.
[14] E.S. Brown, T.F. Chan, X. Bresson. Completely convex formulation of the Chan-Vese
image segmentation model, International journal of computer vision, 98(2012), 103-121.
[15] C. Li, R. Huang, Z. Ding, C. Gatenby, D. Metaxas, J. Gore, A variational level set approach
to segmentation and bias correction of images with intensity inhomogeneity, Proceedings
of medical image computing and computer aided intervention (MICCAI), 2008, Part II,
LNCS 5242, 1083-1091.
248 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-248

Exploring the Non-Trivial Knowledge


Implicit in Test Instance to Fully
Represent Unrestricted Bayesian
Classifier
Mei-Hui LI, Li-Min WANG 1
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of
Education, Jilin University, ChangChun City 130012, P. R. China

Abstract. Restricted Bayesian classifiers have demonstrated remarkable classifica-


tion performance for data mining. However, the restricted network structure makes
it impossible to represent the Markov blanket of the class variable, which corresponds
to the optimal classifier. Moreover, the test instances are not fully utilized, so the final
decision may be biased. In this paper, a novel unrestricted k-dependence classifier
is proposed based on identifying the Markov blanket of the class variable. Further-
more, the algorithm also adopts local learning to build local structure, which can
represent the evidence introduced by test instance. 15 datasets are selected from the
UCI machine learning repository for zero-one loss comparison. The experimental
results indicate that the unrestricted Bayesian classifier can achieve good trade-off
between structure complexity and prediction performance.
Keywords. Data mining, Unrestricted Bayesian classifier, Local learning, Markov
blanket

Introduction

In the 1980s, Judea Pearl first introduced the Bayesian network [1], which is a kind of infer-
ence network based on probabilistic uncertainty. A particularly restricted model, Naive
Bayes (NB), is a powerful classification technique. Many restricted Bayesian classifiers
[2] have been developed to extend the dependence structure of NB, such as Tree-augmented
Naive Bayes (TAN) [3] and the k-dependence Bayesian classifier (KDB) [4].
Madden [2] finds that unrestricted Bayesian classifiers [5] learned using likelihood-
based scores are comparable to TAN. In this paper, a novel unrestricted k-dependence
Bayesian classifier (UKDB) is proposed to build from the perspective of Markov blanket.
Local mutual information and conditional local mutual information are applied to build
the local graph structure UKDBL for each test instance. UKDBL can be considered a
complementary part of UKDBG , which is learned from training set.
1 CorrespondingAuthor: LiMin Wang, Key Laboratory of Symbolic Computation and Knowledge
Engineering of Ministry of Education, Jilin University, ChangChun City 130012, P. R. China; E-mail:
wanglim@jlu.edu.cn.

Figure 1. Three classical Bayesian classifiers.

The rest of the paper is organized as follows. Section 1 briefly introduces information
theory and Markov blanket. Section 2 introduces related Bayesian classifiers. Section
3 presents the learning procedure of UKDB and basic idea of local learning. Section 4
provides the experimental results and comparisons. Section 5 concludes the findings.

1. Related Theory Knowledge

1.1. Information Theory

In the 1940s, Claude E. Shannon introduced information theory, the theoretical basis of
modern digital communication. Many commonly used measures are based on the infor-
mation theory and used in a variety of classification algorithms.
The mutual information (MI) [6] I(X; Y) measures the reduction of uncertainty
about variable X when the value of variable Y is known. Conditional mutual
information (CMI) [6] I(X; Y|Z) measures the mutual dependence between X and Y
given Z. Local mutual information (LMI) I(X; y) measures the reduction of uncertainty
about variable X after observing that Y = y. Conditional local mutual information
(CLMI) I(x; y|Z) measures the mutual dependence between two attribute values x and y
given Z.
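As an illustration (not from the paper), MI and its per-value local contribution can be estimated from a contingency table of joint counts; the decomposition used for the local term below is one common convention, and the toy counts are made up.

import numpy as np

def mutual_information(counts):
    """I(X; C) from an |X| x |C| table of joint counts, in bits."""
    P = counts / counts.sum()
    Px = P.sum(axis=1, keepdims=True)
    Pc = P.sum(axis=0, keepdims=True)
    nz = P > 0
    return float((P[nz] * np.log2(P[nz] / (Px @ Pc)[nz])).sum())

def local_mutual_information(counts, x):
    """Contribution of the single observed value X = x to I(X; C)."""
    P = counts / counts.sum()
    row, Px, Pc = P[x], P.sum(axis=1)[x], P.sum(axis=0)
    nz = row > 0
    return float((row[nz] * np.log2(row[nz] / (Px * Pc[nz]))).sum())

counts = np.array([[30, 10], [5, 55]])   # toy joint counts of (X, C)
print(mutual_information(counts), local_mutual_information(counts, 0))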

1.2. Markov Blanket

Definition 1. [1] The Markov blanket (MB) for variable C is the set of nodes composed
of C’s parents Xpa , its children Xch , and its children’s parents Xcp . Suppose that X =
{Xpa , Xch , Xcp }, Markov blanket Bayesian classifiers approximate P (x, c) as follows,
P (c, x) = P (xpa )P (c|xpa )P (xcp |xpa , c)P (xch |xcp , xpa , c) (1)

Eq.(1) presents a more general case.The Markov blanket of C shields C from effects
of those attributes outside it and is the only knowledge needed to predict its behavior.

2. Bayesian Classifiers: from 0-dependence to k-dependence classifier

NB is the most restrictive probabilistic classification algorithm. The predictive attributes


are assumed to be conditionally independent, then

P(x, c) ∝ P(c) ∏_{i=1}^{n} P(xi | c).    (2)

Table 1. DataSets for Experimental Study


No. Dataset # Instance Attribute Class
1 Mushrooms 8124 22 2
2 Thyroid 9169 29 20
3 Pendigits 10992 16 10
4 Sign 12546 8 3
5 Nursery 12960 8 5
6 Seer 18962 13 2
7 Magic 19020 10 2
8 Letter-recog 20000 16 26
9 Adult 48842 14 2
10 Shuttle 58000 9 7
11 Connect-4 67557 42 3
12 Waveform 100000 21 3
13 Localization 164860 5 11
14 Census-income 299285 41 2
15 Covtype 581012 54 7

Figure 1(a) graphically shows the structure of NB. NB is 0-dependence classifier.


The basic structure of TAN allows each attribute to have at most one parent attribute
apart from the class, then

P(x, c) ∝ P(c) P(xr | c) ∏_{i=1, i≠r}^{n} P(xi | c, xj(i)),    (3)

where Xr denotes the root node and {Xj(i)} = Pa(Xi)\C for any i ≠ r. An
example of TAN is shown in Figure 1(b).
KDB further relaxes NB’s independence assumption by allowing every attribute to
be conditioned on the class and, at most, k other attributes [4]. Then

P(c|x) ∝ P(c) P(x1 | c) ∏_{i=2}^{n} P(xi | c, xi1, · · · , xip)    (4)

where {Xi1 , · · · , Xip } are the parent attributes of Xi and p = min(i − 1, k). Figure
1(c) shows an example of KDB when k=2.

3. The UKDB Algorithm

UKDB can output two kinds of sub-classifiers, i.e., UKDBG and UKDBL , which de-
scribe the causal relationships implicated in training set and test instance, respectively.
UKDB uses I(Xi ; C) and I(Xi ; Xj |C) simultaneously to measure the comprehensive
effect of class C and other attributes (e.g., Xj ) on Xi .
The learning procedures of UKDBG are described as follows:
———————————————————————————————————
Algorithm 1 UKDBG
———————————————————————————————————

Input: Pre-classified training set, DB, and the k value for the maximum allowable
degree of attribute dependence.
1. Let the global Bayesian classifier being constructed, UKDBG , begin with a single
class node C. Let the used attribute list S be empty.
2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of
I(X1 , · · · , Xk ; C).
3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBG representing {X1 , · · · , Xk }
as the parents of C. Add k arcs from {X1 , · · · , Xk } to C in UKDBG .
4. Repeat until S includes all domain attributes
• Select attribute Xi that corresponds to the maximum value of I(Xi; C) +
∑_{j=1}^{q} I(Xi; Xj | C), where Xi ∉ S, Xj ∈ S and q = min(|S|, k).
• Add Xi to S. Add a node that represents Xi to UKDBG . Add an arc from C
to Xi . Add q arcs from q distinct attributes Xj in S to Xi .
5. Compute the conditional probability tables inferred by the structure of UKDBG
by using counts from DB, and output UKDBG .
———————————————————————————————————
The learning procedures of UKDBL are described as follows:
———————————————————————————————————
Algorithm 2 UKDBL
Input: Test instance (x1 , · · · , xn ), estimates of probability distributions on training
set and the k value for the maximum allowable degree of attribute dependence.
1. Let the local Bayesian classifier being constructed, UKDBL , begin with a single
class node C. Let the used attribute list S be empty.
2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of
I(x1 , · · · , xk ; C).
3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBL representing {X1 , · · · , Xk }
as the parents of C. Add k arcs from {X1 , · · · , Xk } to C.
4. Repeat until S includes all domain attributes
• Select attribute Xi that corresponds to the maximum value of I(xi; C) +
∑_{j=1}^{q} I(xi; xj | C), where Xi ∉ S, Xj ∈ S and q = min(|S|, k).
• Add Xi to S. Add a node that represents Xi to UKDBL . Add an arc from C
to Xi . Add q arcs from q distinct attributes Xj in S to Xi .
5. Compute the conditional probability tables inferred by the structure of UKDBL
by using counts from DB, and output UKDBL .
———————————————————————————————————
For UKDBG and UKDBL , estimate the conditional probabilities P̂G (cp |x) and
P̂L (cp |x) that instance x belongs to class cp (p = 1, 2, · · · , t), respectively. The class
label of x is determined by the average of both of the conditional probabilities.

c* = arg max_{cp ∈ C} [P̂G(cp | x) + P̂L(cp | x)] / 2.    (5)
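A two-line sketch of this decision rule (illustrative only; the probability vectors and class names below are hypothetical):

import numpy as np

def combine_predict(p_global, p_local, classes):
    """Average the class posteriors of UKDB_G and UKDB_L and pick the best class."""
    avg = (np.asarray(p_global) + np.asarray(p_local)) / 2.0
    return classes[int(np.argmax(avg))]

print(combine_predict([0.2, 0.8], [0.4, 0.6], np.array(["c1", "c2"])))  # prints "c2"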

Table 2. Experimental Results of Average Zero-one Loss


Dataset NB TAN KDB UKDBG UKDBL UKDB
Mushrooms 0.020 0.000 0.000 0.000 0.001 0.000
Thyroid 0.111 0.072 0.071 0.075 0.093 0.074
Pendigits 0.118 0.032 0.029 0.028 0.028 0.019
Sign 0.359 0.276 0.254 0.243 0.302 0.247
Nursery 0.097 0.065 0.029 0.029 0.070 0.045
Seer 0.238 0.238 0.256 0.258 0.244 0.244
Magic 0.224 0.168 0.164 0.162 0.176 0.161
Letter-recog 0.253 0.130 0.099 0.088 0.130 0.080
Adult 0.158 0.138 0.138 0.135 0.132 0.130
Shuttle 0.004 0.002 0.001 0.001 0.001 0.001
Connect-4 0.278 0.235 0.228 0.219 0.248 0.228
Waveform 0.022 0.020 0.026 0.024 0.019 0.019
Localization 0.496 0.358 0.296 0.297 0.331 0.285
Census-income 0.237 0.064 0.051 0.050 0.061 0.050
Covtype 0.316 0.252 0.142 0.143 0.274 0.150

Table 3. W/D/L Comparison Results of Average Zero-one Loss on All DataSets


W/D/L NB TAN KDB UKDBG UKDBL
TAN 14/1/0
KDB 13/0/2 9/4/2
UKDBG 13/0/2 10/3/2 3/11/1
UKDBL 14/1/0 3/7/5 2/3/10 2/3/10
UKDB 14/1/0 11/4/0 5/8/2 5/9/1 12/3/0

4. Experiments and Results

In order to better verify the efficiency of the proposed UKDB, experiments have been
conducted on 15 datasets from the UCI machine learning repository [7]. Table 1 sum-
marizes the characteristics of each dataset. Table 2 presents for each dataset the average
zero-one loss. The following algorithms are compared:
• NB, standard Naive Bayes.
• TAN [3], Tree-augmented Naive Bayes applying incremental learning.
• KDB (k=2), standard k-dependence Bayesian classifier.
• UKDBG (Global UKDB, k=2), a variant UKDB describes global dependencies.
• UKDBL (Local UKDB, k=2), a variant UKDB describes local dependencies.
• UKDB (k=2), a combination of global UKDB and local UKDB.
Statistically a win/draw/loss record (W/D/L) is computed for each pair of competi-
tors A and B with regard to a performance measure M . The record represents the number
of datasets in which A respectively beats, loses to, or ties with B on M . Finally, related
algorithms are compared via one-tailed binomial sign test with a 95% confidence level.
Table 3 shows the W/D/L records respectively corresponding to average zero-one loss.
Demšar [8] recommends the Friedman test [9] for comparisons of multiple algo-
rithms. For any pre-determined level α, the null hypothesis will be rejected if F > χ2α ,
which is the upper-tail critical value having t − 1 degrees of freedom. The critical value

of χ²_α for α = 0.05 is 9.49. The Friedman statistic for zero-one loss in our experiments
is 16.64. By comparing these results, we can draw the following conclusions:
For the different classifiers, the average ranks of zero-one loss on all datasets are
{NB (4.66), TAN (3.74), KDB (3.56), UKDBG (3.45), UKDBL (3.58), UKDB (2.01)}. UKDB
and UKDBG perform the best among all classifiers in terms of zero-one loss. From
Table 3, UKDB has lower zero-one loss more often than the other classifiers and the
differences are significant. UKDBG also has relative advantages; however, the differences
are not significant. The performance of UKDBL is similar to that of TAN. UKDB can make
full use of the information supplied by the training sets and the test instances. Thus,
performance robustness can be achieved.

5. Conclusion

The working mechanisms of NB, TAN and KDB were analysed and summarised. The proposed algorithm, UKDB, applies local learning and the Markov blanket to improve classification accuracy: local learning makes the final model more flexible, and the Markov blanket relaxes the strict restriction on the parent variables.
Fifteen datasets were selected from the UCI machine learning repository and evaluated with 10-fold cross-validation for the zero-one loss comparison. Overall, the findings reveal that the UKDB model clearly outperformed NB, TAN and KDB. To clarify the working mechanism of UKDB further, global UKDB and local UKDB were also implemented and compared.

Acknowledgements

This work was supported by the National Science Foundation of China (Grant No.
61272209, 61300145) and the Postdoctoral Science Foundation of China (Grant No.
2013M530980), Agreement of Science & Technology Development Project, Jilin
Province (No. 20150101014JC).

References

[1] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kauf-
mann, Palo Alto, CA, 1988.
[2] M.G. Madden, On the classification performance of TAN and general Bayesian networks, Knowledge-
Based Systems, 22 (2009), 489–495.
[3] R.A. Josep, Incremental Learning of Tree Augmented Naive Bayes Classifiers, in Proceedings of the 8th
Ibero-American Conference on Artificial Intelligence, Seville, Spain, 2002, 32–41.
[4] M. Sahami, Learning limited dependence Bayesian classifiers, in Proceedings of the 2nd International
Conference on Knowledge Discovery and Data Mining, 1996, 335–338.
[5] F. Pernkopf, Bayesian network classifiers versus selective k-NN classifier, Pattern Recognition, 38 (2005), 1–10.
[6] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 1948, 379–
423.
[7] UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/datasets.html.
[8] J. Demšar, Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning
Research, 7 (2006), 1–30.
[9] M. Friedman, The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of
Variance, Journal of the American Statistical Association, 32 (1937), 675–701.
254 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-254

The Factor Analysis's Applicability on Social Indicator Research

Ying XIE a, Yao-Hua CHEN b and Ling-Xi PENG b,1
a Guangzhou Social Work Research Center, Guangzhou University, Guangzhou, P.R. China, 510006
b School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, P.R. China, 510006

Abstract. Factor analysis is a multivariate statistical method widely used in social indicator analysis. Most of the time, factor analysis results in textbooks only give mathematical expressions without a clear interpretation. Motivated by a case study from a popular textbook, this paper illustrates a potential pitfall of factor analysis in real applications. The study demonstrates that, without careful examination of the original dataset, factor analysis can lead to misleading conclusions. This issue has been largely ignored in the literature, including popular textbooks. Statistical analysis cannot rely completely on automated computer software, and the Kaiser-Meyer-Olkin (KMO) test results can only be used as a reference. We should carefully examine the applicability of the original data and give a cautious interpretation. Given that some popular textbooks ignore this point, we hope this article draws readers' attention to the raw data when conducting factor analysis.

Keywords. applicability, data analysis, factor analysis

Introduction

Factor analysis is a popular method for multivariate statistical analysis and is an essential part of a typical multivariate statistics course. Normally, before conducting factor analysis, it is suggested to use the Kaiser-Meyer-Olkin (KMO) test to judge whether the dataset is suitable for factor analysis. However, the KMO test is designed to assess sampling adequacy and does not fully account for the applicability of factor analysis to a specific dataset. Consequently, even with satisfactory KMO test results, factor analysis can still produce suspicious conclusions.
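For reference, the KMO measure compares the squared simple correlations with the squared partial correlations between pairs of variables. A minimal numpy sketch of that computation is given below; it is illustrative only and, as argued in this paper, is not a substitute for examining whether the variables themselves are measured on comparable scales.

```python
import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure for an (observations x variables) array."""
    corr = np.corrcoef(data, rowvar=False)
    precision = np.linalg.inv(corr)
    # Partial correlations obtained from the inverse (precision) matrix.
    d = np.sqrt(np.outer(np.diag(precision), np.diag(precision)))
    partial = -precision / d
    off = ~np.eye(corr.shape[0], dtype=bool)     # off-diagonal mask
    r2 = np.sum(corr[off] ** 2)                  # squared simple correlations
    p2 = np.sum(partial[off] ** 2)               # squared partial correlations
    return r2 / (r2 + p2)

# Example with random data; a real analysis would pass the indicator matrix.
rng = np.random.default_rng(0)
print(f"KMO = {kmo(rng.normal(size=(31, 6))):.3f}")
```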
The purpose of factor analysis is to reduce the dimensionality of the dataset and to examine the underlying relationships among the variables. In general, factor analysis attempts to find a few factors that capture most of the information in the original data, where the factors are combinations of related variables [1-8].
Clearly, factor analysis summarizes the information in the original variables by analyzing their characteristics [9-10]. Therefore, the choice of the original variables is very important. If there is no correlation among the original variables, the data are not suitable for factor analysis and the dimension-reduction effect will be limited.

1 Corresponding Author: Ling-xi PENG, School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, P.R. China. Email: xysoc@gzhu.edu.cn.

On the contrary, with stronger correlation, factor analysis can largely reduce the dimensionality, produce superior performance, and improve the interpretability [11].
Nowadays, factor analysis is implemented in most statistical software. But the software is unable to understand the underlying meaning of each variable; researchers therefore need to name the extracted factors, give them a practical interpretation, and check the applicability of factor analysis to the dataset. Quite often, the applicability is not tested at all, and researchers assume applicability by default. This is one of the key reasons why absurd factor analysis results are not uncommon in many statistical textbooks and articles: many authors do not examine the raw data before conducting the factor analysis.
Specifically, this article uses an example from Statistics (fourth edition, Renmin University of China Press) to illustrate the importance of checking the applicability of factor analysis. This textbook is widely used in China, is recommended by the National Statistics Committee and the Ministry of Education, and is accompanied by a comprehensive teaching database. In fact, similar misuses can be found in many other statistics textbooks, including another popular textbook, Multivariate Statistical Analysis [8].

1. A Case Study

The following example uses factor analysis to rank the economic development of Chinese provinces: "Based on the data of six major economic indicators for 31 provinces, municipalities and autonomous regions in 2006, conduct factor analysis, explain the factors, and calculate the factor scores [9]." (Quoted and translated from pages 256-269 of the original book, Chapter 12, Principal Component Analysis and Factor Analysis):

Table 1. The Raw Data (first five regions shown). Columns, left to right: Region; Gross Regional Product Per Capita (yuan); Government Revenue (10,000 yuan); Total Investment in Fixed Assets (100 million yuan); Total Population (10,000 persons); Household Consumption Expenditure (yuan per capita); Total Retail Sales of Consumer Goods (100 million yuan).

Region            GRP p.c.   Gov. revenue   Investment   Population   Consumption   Retail sales
Beijing           50467      11171514       3296.3757    1581         16770         3275.2169
Tianjin           41163      4170479        1820.5161    1075         10564         1356.78652
Hebei             16962      6205340        5470.2356    6898         4945          3397.42296
Shanxi            14123      5833752        2255.7351    3375         4843          1613.43996
Inner Mongolia    20053      3433774        3363.2077    2397         5800          1595.26514

The results are shown in Tables 2 and 3 below.



Table 2. Rotated Component Matrix in the Textbook

Component
1 2
Gross Regional Product Per Capita .112 .981
Government Revenue .755 .622
Total Investment in Fixed Assets .931 .247
Total Population .941 -.213
Household Consumption Expenditure .117 .980
Total Retail Sales of Consumer Goods .922 .349
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.

Table 3. Variance Explained Ratio in the Textbook

           Initial Eigenvalues             Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component  Total   % of Var.   Cum. %      Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
1          3.963   66.052      66.052      3.963   66.052      66.052            3.197   53.284      53.284
2          1.771   29.518      95.570      1.771   29.518      95.570            2.537   42.286      95.570
3          .128    2.128       97.698
4          .095    1.589       99.287
5          .026    .433        99.720
6          .017    .280        100.000

According to the textbook, the first component is most highly correlated with Total Investment in Fixed Assets, Government Revenue, and Total Retail Sales of Consumer Goods; the author defined it as the "economic level factor" and defined the second factor as the "consumption level factor".
Table 4 Region Rank in the Textbook

Rank Region Fac1 Fac2 Score


1 Guangdong 2.42045 .89371 3.31416
2 Shanghai -.54724 3.46909 2.92185
3 Jiangsu 1.96498 .57532 2.54030
4 Shandong 2.36315 .00275 2.36591
5 Zhejiang .94065 1.11499 2.05565
6 Beijing -.64278 2.63862 1.99584
7 Liaoning .41769 .20721 .62490
8 Henan 1.29494 -.83424 .46070

Then the author weighted each factor according to its variance contribution rate and summed the weighted factor scores. In this way, the textbook calculated a total score for each region and used it to reflect regional economic development. The resulting ranking from the textbook is shown in Table 4.
According to the textbook, the result of the KMO test, shown in Table 5, is statistically significant, which would suggest that the factor analysis result is meaningful. However, the result is highly questionable. For example, Beijing is significantly under-ranked and Henan is over-ranked, and Guangdong being ranked first is inconsistent with the actual level of economic development.
Table 5. KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy              .695
Bartlett's Test of Sphericity      Approx. Chi-Square     277.025
                                   df                          15
                                   Sig.                      .000
The problem with the above analysis lies in the raw data. The example selects a few variables to reflect economic development, but these variables are not on the same scale. The GDP per capita is on the "individual" scale, while the "total population at the end of the year", "investment in fixed assets", "total retail sales of social consumer goods" and "government revenue" are all on the "population" or "overall" scale. Because of the mismatched scales, it is inappropriate to combine these variables into meaningful factors. In fact, for comparing levels of economic development, "total population" is not even a proper indicator, as it favours regions with larger populations in the ranking. Obviously, a large population does not necessarily indicate a prosperous economy: Beijing, the capital of China, has a much smaller population than Henan Province, but Beijing's economy is much more developed than Henan's.
To overcome this problem, a more appropriate approach is to examine the raw data before factor analysis. To evaluate economic development, per-capita variables are more reasonable. Using the data from the textbook, we calculate per-capita values for each variable except "total population", and then repeat the factor analysis. The results (Table 6) show that the first extracted factor explains more than 80% of the variation, indicating a stronger relationship among the per-capita economic indicators. We use the component matrix (Table 7) to recalculate the scores.
Table 6. New Total Variance Explained

           Initial Eigenvalues                  Extraction Sums of Squared Loadings
Component  Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1          4.210   84.210         84.210        4.210   84.210         84.210
2          .592    11.833         96.042
3          .139    2.776          98.818
4          .039    .770           99.588
5          .021    .412           100.000
Extraction Method: Principal Component Analysis.

Table 7. New Component Matrix (a)

                                             Component 1
Government Revenue Per Capita                       .966
Investment in Fixed Assets Per Capita               .948
Household Consumption Expenditure                   .698
Retail Sales of Consumer Goods Per Capita           .968
Gross Regional Product Per Capita                   .978
Extraction Method: Principal Component Analysis.
a. 1 component extracted.

The final regional ranking by economic level (Fac1) is shown below.

Table 8 New Region Rank


Rank Region Fac1
1 Shanghai 2.78325
2 Beijing 2.6151
3 Zhejiang 1.30243
4 Tianjin 1.13873
5 Jiangsu 1.02431
6 Guangdong 0.82984
7 Liaoning 0.67854
8 Shandong 0.65539
18 Henan -0.31168

Clearly, the ranking in Table 8 agrees with the actual economic situation in China: the more developed regions are at the top.
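The sketch below illustrates the corrected procedure described above: the "overall"-scale indicators are first divided by population, the per-capita matrix is standardized, and a single component is extracted from its correlation matrix. Only the three regions listed in Table 1 are used here for brevity (the full analysis uses all 31 regions), and plain principal-component extraction is used as a stand-in for the textbook's factor analysis.

```python
import numpy as np

# Raw indicators from Table 1 for three regions:
# [GRP per capita (yuan), government revenue (10^4 yuan), fixed-asset investment (10^8 yuan),
#  population (10^4 persons), household consumption expenditure (yuan), retail sales (10^8 yuan)]
raw = np.array([
    [50467, 11171514, 3296.3757, 1581, 16770, 3275.2169],    # Beijing
    [41163,  4170479, 1820.5161, 1075, 10564, 1356.78652],   # Tianjin
    [16962,  6205340, 5470.2356, 6898,  4945, 3397.42296],   # Hebei
])

pop = raw[:, 3]
# Convert the "overall"-scale variables to per-capita terms and drop total population.
per_capita = np.column_stack([
    raw[:, 0],          # GRP per capita (already per capita)
    raw[:, 1] / pop,    # government revenue per capita
    raw[:, 2] / pop,    # fixed-asset investment per capita
    raw[:, 4],          # household consumption expenditure (already per capita)
    raw[:, 5] / pop,    # retail sales of consumer goods per capita
])

# Standardize, then extract principal components from the correlation matrix.
z = (per_capita - per_capita.mean(axis=0)) / per_capita.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print("variance explained:", eigvals / eigvals.sum())
print("first-component loadings:", eigvecs[:, 0] * np.sqrt(eigvals[0]))
print("factor scores (Fac1):", z @ eigvecs[:, 0])
```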

2. Conclusions

Factor analysis is a widely taught and widely used statistical method, especially in the field of social indicator research. Professional statistical software packages (such as SPSS and SAS) integrate factor analysis modules that automate the process. But without careful examination of the raw data, erroneous conclusions are unavoidable: the quality of a factor analysis result depends heavily on the original variables, the data sources, and the analysis method.
The KMO test is often employed to test whether the data are suitable for factor analysis, but this test cannot tell whether the data themselves are reasonable for the analysis.
Most of the time, factor analysis results in textbooks only give mathematical expressions without a clear interpretation. When researchers and teachers use factor analysis to show how to analyze practical problems, it is crucial to examine the applicability of factor analysis to the original data and to check whether the variables are on the same scale. Only if the original data meet these requirements can reliable conclusions be reached.
In short, statistical analysis cannot rely completely on automated computer software. The KMO test results can only be used as a reference. We should carefully examine the applicability of the original data and give a cautious interpretation. Given that some popular textbooks ignore this point, we hope this article draws readers' special attention to the raw data when conducting factor analysis.

Acknowledgements

This work was supported by the National Social Science Fund 15AZD077.

References

[1] H. H. Harman, Modern Factor Analysis, 3rd ed. Chicago: University of Chicago Press. 1976.
[2] N. Cressie, Statistics for spatial data. John Wiley & Sons, 2015.
[3] J. L. Devore, Probability and Statistics for Engineering and the Sciences. Cengage Learning, 2015.
[4] D. R. Anderson, D. J. Sweeney, T. A. Williams, et al. Statistics for business & economics. Nelson
Education, 2016.
[5] J. Pearl, M. Glymour, N. P. Jewell. Causal Inference in Statistics: A Primer. John Wiley & Sons, 2016.
[6] J. R. Schott, Matrix analysis for statistics. John Wiley & Sons, 2016.
[7] D. C. Howell, Fundamental statistics for the behavioral sciences. Nelson Education, 2016.
[8] X. Q. He, Multivariate Statistical Analysis, Renmin University of China Press, 2011, 143-173.
[9] J. P. Jia, Statistics, Renmin University of China Press, 2011, 254-270.
[10] J. Kim and C. W. Mueller. Factor Analysis: What it is and how to do it. Beverly Hills and London:
Sage Publications, 1978.
[11] P. Kline, An easy guide to factor analysis. London: Routledge, 1994.
260 Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-260

Research on Weapon-Target Allocation Based on Genetic Algorithm

Yan-Sheng ZHANG1, Zhong-Tao QIAO and Jian-Hui JING
The Fourth Department, Ordnance Engineering College, Shijiazhuang City, Hebei Province, China

Abstract. Weapon-target allocation (WTA) is a typical constrained combinatorial optimization problem and an important part of command and decision making in air defense operations. WTA is known to be NP-complete, and intelligent optimization methods are widely employed to solve it. A popular coding length is n×m, corresponding to assigning n weapons to m targets. However, this coding length grows greatly with the problem scale, and the computation becomes too heavy to meet real-time requirements. This paper focuses on designing a new gene coding to improve computational efficiency. In our study, a sequence of weapons serves as the gene coding, to which two other codes are attached: a target code and a capacity code. The coding length is n, and the coding adapts to the constraints of WTA effectively. The operators of gene selection, crossover and mutation are then designed. In addition, the maximum operational effectiveness, together with the minimum consumption of ammunition, is defined as the objective function. The resulting model is based on multi-objective optimization and is more realistic. An example shows that the method is feasible and can save computing time greatly.

Keywords. Weapon-target allocation, GA, Multi-objective optimization

Introduction

The weapon-target allocation (WTA) problem is to optimize the distribution of our forces and weapons according to the characteristics and quantity of incoming targets so as to achieve the best operational effectiveness. WTA is a typical constrained combinatorial optimization problem and a hard non-polynomial (NP) optimization problem. The WTA model based on multi-objective optimization is more realistic and is currently a hot research topic. At present, intelligent optimization methods [1-3], such as the genetic algorithm (GA), particle swarm algorithm (PSA), ant colony algorithm (CA), and simulated annealing (SA), are widely employed to solve WTA.
These intelligent algorithms have been shown to produce better solutions than the classic ones. However, their efficiency is still not enough to satisfy the real-time requirements of air defense. In this paper, we focus on designing a new gene coding to improve computational efficiency. A popular genetic coding length is n×m, corresponding to assigning n weapons to m targets. In our study, a sequence of weapons serves as the gene coding, to which two other codes are attached: a target code and a capacity code. This coding length is n and adapts to the constraints of WTA effectively. On the other hand, the maximum

1 Corresponding Author: Yan-Sheng ZHANG, Lecturer, Ordnance Engineering College, No.97 Heping West Road, Shijiazhuang City, Hebei Province, China; E-mail: zhang_sheng_74@163.com.

operational effectiveness, together with the minimum consumption of ammunition, is defined as the objective function.

1. Mathematical Model

Our anti-aircraft equipment is represented by A=[a1, a2,…, an], in which ai denotes the ith (1≤i≤n) weapon. R=[r1, r2,…, rn] represents the ammunition capacities corresponding to A=[a1, a2,…, an], where ri is the quantity of ammunition available to ai. The target set is T=[t1, t2,…, tm], where tj (1≤j≤m) is the jth incoming target. D=[d1, d2,…, dm] gives the threat levels corresponding to T=[t1, t2,…, tm], where dj represents the threat degree of tj. P=[pij]n×m is the matrix of intercept probabilities, where pij is the probability that ai intercepts tj. The decision matrix is X=[xij]n×m, where xij is the number of missiles that ai fires at tj.
The operational effectiveness f1(X) is expressed in Eq. (1) [4], and the total number of missiles consumed, f2(X), is given in Eq. (2). The optimization of WTA is to make f1(X) as large as possible and f2(X) as small as possible. The multiple objectives can be transformed into a single one, shown in Eq. (3), where f(X) is the objective function and L1 and L2 are weights. We want f(X) to be as large as possible, as expressed in Eq. (4).
$$f_1(X) = \sum_{j=1}^{m} d_j \Big(1 - \prod_{i=1}^{n} (1 - p_{ij})^{x_{ij}}\Big) \qquad (1)$$

$$f_2(X) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_{ij} \qquad (2)$$

$$f(X) = L_1 f_1(X) - L_2 f_2(X) \qquad (3)$$

$$\max f(X) \qquad (4)$$

subject to

$$\sum_{j=1,\, j \neq k}^{m} x_{ij} = 0 \qquad (5)$$

$$\sum_{i=1}^{n} x_{ij} \geq 1 \qquad (6)$$

$$1 \leq x_{ij} \leq r_i \qquad (7)$$

Usually, there are some constraints on f(X). The number of weapons is no smaller than the number of targets, namely n ≥ m. A weapon ai may be allocated to only one target tk, as indicated in Eq. (5); consequently, there is only one nonzero element in each row of X. Any target is assigned at least one weapon, so at least one element of each column of X is nonzero, as shown in Eq. (6). The number of missiles xij that ai allocates to tj must not exceed the ammunition capacity ri, as given in Eq. (7).
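To make the model concrete, the following is a minimal sketch of Eqs. (1)–(4) evaluated for a given decision matrix X; the threat degrees, intercept probabilities and the weights L1, L2 below are illustrative values, not data from the paper's example.

```python
import numpy as np

def objective(X, D, P, L1=1.0, L2=0.1):
    """Eqs. (1)-(4): weighted operational effectiveness minus ammunition consumption.
    X: n x m decision matrix (missiles fired from weapon i at target j),
    D: threat degrees of the m targets, P: n x m intercept probability matrix."""
    f1 = np.sum(D * (1.0 - np.prod((1.0 - P) ** X, axis=0)))   # Eq. (1)
    f2 = np.sum(X)                                             # Eq. (2)
    return L1 * f1 - L2 * f2                                   # Eq. (3), to be maximized per Eq. (4)

# Tiny example: n = 3 weapons, m = 2 targets.
D = np.array([0.7, 0.3])
P = np.array([[0.6, 0.4],
              [0.5, 0.7],
              [0.8, 0.3]])
X = np.array([[2, 0],
              [0, 1],
              [1, 0]])    # each weapon engages exactly one target, cf. Eq. (5)
print(objective(X, D, P))
```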

2. Design of Genetic Algorithm

2.1. Gene Particle Coding

The decision matrix X is the solution of the objective function. It is very complicated to perform gene crossover and mutation if X is encoded directly as the gene particle. Instead, a sequence of weapons, A=[a1, a2,…, an], serves as the gene coding, with 1, 2,…, n representing a1, a2,…, an respectively. Additionally, each gene particle carries two other codes. Corresponding to A=[a1, a2,…, an], one is the target code T=[t1, t2,…, tn] and the other is the ammunition-quantity code C=[c1, c2,…, cn]. The values t1, t2,…, tn are taken from 1, 2,…, m, and c1, c2,…, cn meet the conditions c1 < r1, c2 < r2,…, cn < rn. For example, the gene coding and its additional codes are shown in Figure 1.
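As a sketch of the encoding described above (the concrete numbers are illustrative and are not taken from Figure 1), a chromosome can be held as three parallel length-n arrays: the weapon sequence, the attached target code and the attached capacity code.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3                        # 5 weapons, 3 targets
r = np.array([4, 3, 5, 2, 6])      # ammunition capacity r_i of each weapon

# Gene coding: a permutation of the weapon indices 1..n ...
weapon_order = rng.permutation(np.arange(1, n + 1))
# ... with two attached codes of the same length:
target_code = rng.integers(1, m + 1, size=n)    # target (1..m) assigned to the weapon in each position
capacity_code = np.array([rng.integers(1, r[w - 1] + 1)   # missiles fired by that weapon, within capacity
                          for w in weapon_order])

print("weapon order :", weapon_order)
print("target code  :", target_code)
print("capacity code:", capacity_code)
```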