
Frontiers in Artificial Intelligence and Applications

The book series Frontiers in Artificial Intelligence and Applications (FAIA) covers all aspects of theoretical and applied Artificial Intelligence research in the form of monographs, doctoral dissertations, textbooks, handbooks and proceedings volumes.

The FAIA series contains several sub-series, including ‘Information Modelling and Knowledge Bases’ and ‘Knowledge-Based Intelligent Engineering Systems’. It also includes the biennial European Conference on Artificial Intelligence (ECAI) proceedings volumes, and other EurAI (European Association for Artificial Intelligence, formerly ECCAI) sponsored publications. An editorial panel of internationally well-known scholars is appointed to provide a high quality selection.

Series Editors:
J. Breuker, N. Guarino, J.N. Kok, J. Liu, R. López de Mántaras, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong

Volume 293

Recently published in this series

Vol. 292. H. Jaakkola, B. Thalheim, Y. Kiyoki and N. Yoshida (Eds.), Information Modelling and Knowledge Bases XXVIII

Vol. 291. G. Arnicans, V. Arnicane, J. Borzovs and L. Niedrite (Eds.), Databases and Information Systems IX – Selected Papers from the Twelfth International Baltic Conference, DB&IS 2016

Vol. 290. J. Seibt, M. Nørskov and S. Schack Andersen (Eds.), What Social Robots Can and Should Do – Proceedings of Robophilosophy 2016 / TRANSOR 2016

Vol. 289. I. Skadiņa and R. Rozis (Eds.), Human Language Technologies – The Baltic Perspective – Proceedings of the Seventh International Conference Baltic HLT 2016

Vol. 288. À. Nebot, X. Binefa and R. López de Mántaras (Eds.), Artificial Intelligence Research and Development – Proceedings of the 19th International Conference of the Catalan Association for Artificial Intelligence, Barcelona, Catalonia, Spain, October 19–21, 2016

Vol. 287. P. Baroni, T.F. Gordon, T. Scheffler and M. Stede (Eds.), Computational Models of Argument – Proceedings of COMMA 2016

Vol. 286. H. Fujita and G.A. Papadopoulos (Eds.), New Trends in Software Methodologies, Tools and Techniques – Proceedings of the Fifteenth SoMeT_16

Vol. 285. G.A. Kaminka, M. Fox, P. Bouquet, E. Hüllermeier, V. Dignum, F. Dignum and F. van Harmelen (Eds.), ECAI 2016 – 22nd European Conference on Artificial Intelligence, 29 August–2 September 2016, The Hague, The Netherlands – Including Prestigious Applications of Artificial Intelligence (PAIS 2016)

ISSN 1879-8314 (online)

Fuzzy Systems and Data Mining II

Proceedings of FSDM 2016

Edited by

Shilei Sun

International School of Software, Wuhan University, China

Antonio J. Tallón-Ballesteros

Department of Languages and Computer Systems, University of Seville, Spain

Dragan S. Pamučar

Department of Logistics, University of Defence in Belgrade, Serbia

and

Feng Liu

International School of Software, Wuhan University, China

© 2016 The authors and IOS Press.

All rights reserved. No part of this book may be reproduced, stored in a retrieval system,

or transmitted, in any form or by any means, without prior written permission from the publisher.

ISBN 978-1-61499-722-1 (online)

Library of Congress Control Number: 2016958585

Publisher

IOS Press BV

Nieuwe Hemweg 6B

1013 BG Amsterdam

Netherlands

fax: +31 20 687 0019

e-mail: order@iospress.nl

IOS Press, Inc.

6751 Tepper Drive

Clifton, VA 20124

USA

Tel.: +1 703 830 6300

Fax: +1 703 830 2300

sales@iospress.com

LEGAL NOTICE

The publisher is not responsible for the use which might be made of the following information.


Preface

Fuzzy Systems and Data Mining (FSDM) is an annual international conference devoted to four main groups of topics: a) fuzzy theory, algorithm and system; b) fuzzy application; c) the interdisciplinary field of fuzzy logic and data mining; and d) data mining. Following the great success of FSDM 2015, held in Shanghai, the second edition in the FSDM series was held in Macau, China, where experts, researchers, academics and participants from the industry were introduced to the latest advances in the field of Fuzzy Sets and Data Mining. Macau was declared a UNESCO World Heritage Site in 2005 by virtue of its cultural importance. The historic centre of Macau is of particular interest because of its mixture of traditional Chinese and Portuguese cultures. Macau has both Cantonese (a variant of Chinese) and Portuguese as official languages.

This volume contains the papers accepted and presented at the 2nd International Conference on Fuzzy Systems and Data Mining (FSDM 2016), held on 11–14 December 2016 in Macau, China. All papers have been carefully reviewed by programme committee members and reflect the breadth and depth of the research topics which fall within the scope of FSDM. From several hundred submissions, 81 of the most promising and FAIA mainstream-relevant contributions have been selected for inclusion in this volume; they present original ideas, methods or results of general significance supported by clear reasoning and compelling evidence.

FSDM 2016 was also a reference conference, and the conference programme included keynote and invited presentations, oral and poster contributions. The event provided a forum where more than 100 qualified and high-level researchers and experts from over 20 countries, including 4 keynote speakers, gathered to create an important platform for researchers and engineers worldwide to engage in academic communication.

I would like to thank all the keynote and invited speakers and authors for the effort they have put into preparing their contributions to the conference. We would also like to take this opportunity to express our gratitude to those people, especially the program committee members and reviewers, who devoted their time to assessing the papers. It is an honour to continue with the publication of these proceedings in the prestigious series Frontiers in Artificial Intelligence and Applications (FAIA) from IOS Press. Our particular thanks also go to J. Breuker, N. Guarino, J.N. Kok, R. López de Mántaras, J. Liu, R. Mizoguchi, M. Musen, S.K. Pal and N. Zhong, the FAIA series editors, for supporting this conference.

Last but not least, I hope that all our participants have enjoyed their stay in Macau and their time at the Macau University of Science and Technology (M.U.S.T.). We hope you had a magnificent experience in both places.

Antonio J. Tallón-Ballesteros

University of Seville, Spain


Contents

Preface v

Antonio J. Tallón-Ballesteros

Order Fuzzy Time Series Forecasting 3

Sukhdev S. Gangwar and Sanjay Kumar

Introduction to Fuzzy Dual Mathematical Programming 11

Carlos A.N. Cosenza, Fabio Krykhtine, Walid El Moudani

and Felix A.C. Mora-Camino

Forecasting National Football League Game Outcomes Based on Fuzzy

Candlestick Patterns 22

Yu-Chia Hsu

A Fuzzy Control Based Parallel Filling Valley Equalization Circuit 28

Feng Ran, Ke-Wei Hu, Jing-Wei Zhao and Yuan Ji

Interval-Valued Hesitant Fuzzy Geometric Bonferroni Mean Aggregation

Operator 37

Xiao-Rong He, Ying-Yu Wu, De-Jian Yu, Wei Zhou and Sun Meng

A New Integrating SAW-TOPSIS Based on Interval Type-2 Fuzzy Sets

for Decision Making 45

Lazim Abdullah and C.W. Rabiatul Adawiyah C.W. Kamal

Algorithms for Finding Oscillation Period of Fuzzy Tensors 51

Ling Chen and Lin-Zhang Lu

Toward a Fuzzy Minimum Cost Flow Problem for Damageable Items

Transportation 58

Si-Chao Lu and Xi-Fu Wang

Research on the Application of Data Mining in the Field of Electronic

Commerce 65

Xia Song and Fang Huang

A Fuzzy MEBN Ontology Language Based on OWL2 71

Zhi-Yun Zheng, Zhuo-Yun Liu, Lun Li, Dun Li and Zhen-Fei Wang

State Assessment of Oil-Paper Insulation Based on Fuzzy Rough Sets 81

De-Hua He, Jin-Ding Cai, Song Xie and Qing-Mei Zeng

Finite-Time Stabilization for T-S Fuzzy Networked Systems with State

and Communication Delay 87

He-Jun Yao, Fu-Shun Yuan and Yue Qiao


Sets 94

Zhi-Ying Lv, Ping Huang, Xian-Yong Zhang and Li-Wei Zheng

Fuzzy Rule-Based Stock Ranking Using Price Momentum and Market

Capitalization 102

Ratchata Peachavanish

Adaptive Fuzzy Sliding-Mode Control of Robot and Simulation 108

Huan Niu, Jie Yang and Jie-Ru Chi

Hesitant Bipolar Fuzzy Set and Its Application in Decision Making 115

Ying Han, Qi Luo and Sheng Chen

Chance Constrained Twin Support Vector Machine for Uncertain Pattern

Classification 121

Ben-Zhang Yang, Yi-Bin Xiao, Nan-Jing Huang and Qi-Lin Cao

Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics 131

Eunsuk Yang

Data Mining

Nourhan Abuzayed and Belgin Ergenç

Deep Learning with Large Scale Dataset for Credit Card Data Analysis 149

Ayahiko Niimi

Probabilistic Frequent Itemset Mining Algorithm over Uncertain Databases

with Sampling 159

Hai-Feng Li, Ning Zhang, Yue-Jin Zhang and Yue Wang

Priority Guaranteed and Energy Efficient Routing in Data Center Networks 167

Hu-Yin Zhang, Jing Wang, Long Qian and Jin-Cai Zhou

Yield Rate Prediction of a Dynamic Random Access Memory Manufacturing

Process Using Artificial Neural Network 173

Chun-Wei Chang and Shin-Yeu Lin

Mining Probabilistic Frequent Itemsets with Exact Methods 179

Hai-Feng Li and Yue Wang

Performance Degradation Analysis Method Using Satellite Telemetry Big Data 186

Feng Zhou, De-Chang Pi, Xu Kang and Hua-Dong Tian

A Decision Tree Model for Meta-Investment Strategy of Stock Based on Sector

Rotating 194

Li-Min He, Shao-Dong Chen, Zhen-Hua Zhang, Yong Hu

and Hong-Yi Jiang

Virtualized Security Defense System for Blurred Boundaries of Next

Generation Computing Era 208

Hyun-A. Park


Yong Wang, Ya-Zhi Tao, Xiao-Yi Wan and Hui-Ying Cao

Characteristics Analysis and Data Mining of Uncertain Influence Based

on Power Law 226

Ke-Ming Tang, Hao Yang, Qin Liu, Chang-Ke Wang and Xin Qiu

Hazardous Chemicals Accident Prediction Based on Accident State Vector

Using Multimodal Data 232

Kang-Wei Liu, Jian-Hua Wan and Zhong-Zhi Han

Regularized Level Set for Inhomogeneity Segmentation 241

Guo-Qi Liu and Hai-Feng Li

Exploring the Non-Trivial Knowledge Implicit in Test Instance to Fully

Represent Unrestricted Bayesian Classifier 248

Mei-Hui Li and Li-Min Wang

The Factor Analysis’s Applicability on Social Indicator Research 254

Ying Xie, Yao-Hua Chen and Ling-Xi Peng

Research on Weapon-Target Allocation Based on Genetic Algorithm 260

Yan-Sheng Zhang, Zhong-Tao Qiao and Jian-Hui Jing

PMDA-Schemed EM Channel Estimator for OFDM Systems 267

Xiao-Fei Li, Di He and Xiao-Hua Chen

Soil Heavy Metal Pollution Research Based on Statistical Analysis and BP

Network 274

Wei-Wei Sun and Xing-Ping Sheng

An Improved Kernel Extreme Learning Machine for Bankruptcy Prediction 282

Ming-Jing Wang, Hui-Ling Chen, Bin-Lei Zhu, Qiang Li, Ke-Jie Wang

and Li-Ming Shen

Novel DBN Structure Learning Method Based on Maximal Information

Coefficient 290

Guo-Liang Li, Li-Ning Xing and Ying-Wu Chen

Improvement of the Histogram for Infrequent Color-Based Illustration Image

Classification 299

Akira Fujisawa, Kazuyuki Matsumoto, Minoru Yoshida and Kenji Kita

Design and Implementation of a Universal QC-LDPC Encoder 306

Qian Yi and Han Jing

Quantum Inspired Bee Colony Optimization Based Multiple Relay Selection

Scheme 312

Feng-Gang Lai, Yu-Tai Li and Zhi-Jie Shang

A Speed up Method for Collaborative Filtering with Autoencoders 321

Wen-Zhe Tang, Yi-Lei Wang, Ying-Jie Wu and Xiao-Dong Wang

Analysis of NGN-Oriented Architecture for Internet of Things 327

Wei-Dong Fang, Wei He, Zhi-Wei Gao, Lian-Hai Shan and Lu-Yang Zhao


Shi-Chao Zhang, Yong-Gang Li, De-Bo Cheng and Zhen-Yun Deng

Safety Risk Early-Warning for Metro Construction Based on Factor Analysis

and BP_Adaboost Network 341

Hong-De Wang, Bai-Le Ma and Yan-Chao Zhang

The Method Study on Tax Inspection Cases-Choice: Improved Support Vector

Machine 347

Jing-Huai She and Jing Zhuo

Development of the System with Component for the Numerical Calculation

and Visualization of Non-Stationary Waves Propagation in Solids 353

Zhanar Akhmetova, Serik Zhuzbayev, Seilkhan Boranbayev

and Bakytbek Sarsenov

Infrared Image Recognition of Bushing Type Cable Terminal Based on Radon

and Fourier-Mellin Transform and BP Neural Network 360

Hai-Qing Niu, Wen-Jian Zheng, Huang Zhang, Jia Xu and Ju-Zhuo Wu

Face Recognition with Single Sample Image per Person Based on Residual

Space 367

Zhi-Bo Guo, Yun-Yang Yan, Yang Wang and Han-Yu Yuan

Cloud Adaptive Parallel Simulated Annealing Genetic Algorithm

in the Application of Personnel Scheduling in National Geographic

Conditions Monitoring 377

Juan Du, Xu Zhou, Shu Tao and Qian Liu

Quality Prediction in Manufacturing Process Using a PCA-BPNN Model 390

Hong Zhou and Kun-Ming Yu

The Study of an Improved Intelligent Student Advising System 397

Xiaosong Li

An Enhanced Identity Authentication Security Access Control Model Based

on 802.1x Protocol 407

Han-Ying Chen and Xiao-Li Liu

Recommending Entities for E-R Model by Ontology Reasoning Techniques 414

Xiao-Xing Xu, Dan-Tong Ouyang, Jie Liu and Yu-Xin Ye

V-Sync: A Velocity-Based Time Synchronization for Multi-Hop Underwater

Mobile Sensor Networks 420

Meng-Na Zhang, Hai-Yan Wang, Jing-Jie Gao and Xiao-Hong Shen

An Electricity Load Forecasting Method Based on Association Rule Analysis

Attribute Reduction in Smart Grid 429

Huan Liu and Ying-Hua Han

The Improved Projection Pursuit Evaluation Model Based on Depso Algorithm 438

Bin Zhu and Wei-Dong Jin

HRVBased Stress Recognizing by Random Forest 444

Gang Zheng, Yan-Hui Chen and Min Dai


Ke-Ming Tang, Hao Yang, Xin Qiu and Lv-Qing Wu

Research on the Application-Driven Architecture in Internet of Things 458

Wei-Dong Fang, Wei He, Wei Chen, Lian-Hai Shan and Feng-Ying Ma

A GOP-Level Bitrate Clustering Recognition Algorithm for Wireless Video

Transmission 466

Wen-Juan Shi, Song Li, Yan-Jing Sun, Qi Cao and Hai-Wei Zuo

The Analysis of Cognitive Image and Tourism Experience in Taiwan’s Old

Streets Based on a Hybrid MCDM Approach 476

Chung-Ling Kuo and Chia-Li Lin

A Collaborative Filtering Recommendation Model Based on Fusion

of Correlation-Weighted and Item Optimal-Weighted 487

Shi-Qi Wen, Cheng Wang, Jian-Ying Wang, Guo-Qi Zheng,

Hai-Xiao Chi and Ji-Feng Liu

A Cayley Theorem for Regular Double Stone Algebras 501

Cong-Wen Luo

ARII-eL: An Adaptive, Informal and Interactive eLearning Ontology Network 507

Daniel Burgos

Early Prediction of System Faults 519

You Li and Yu-Ming Lin

QoS Aware Hierarchical Routing Protocol Based on Signal to Interference

plus Noise Ratio and Link Duration for Mobile Ad Hoc Networks 525

Yan-Ling Wu, Ming Li and Guo-Bin Zhang

The Design and Implementation of Meteorological Microblog Public Opinion

Hot Topic Extraction System 535

Fang Ren, Lin Chen and Cheng-Rui Yang

Modeling and Evaluating Intelligent Real-Time Route Planning and Carpooling

System with Performance Evaluation Process Algebra 542

Jie Ding, Rui Wang and Xiao Chen

Multimode Theory Analysis of the Coupled Microstrip Resonator Structure 549

Ying Zhao, Ai-Hua Zhang and Ming-Xiao Wang

A Method for Woodcut Rendering from Images 555

Hong-Qiang Zhang, Shu-Wen Wang, Cong Ma and Bing-Kun Pi

Research on a Non-Rigid 3D Shape Retrieval Method Based on Global

and Partial Description 562

Tian-Wen Yuan, Yi-Nan Lu, Zhen-Kun Shi and Zhe Zhang

Virtual Machine Relocating with Combination of Energy and Performance

Awareness 570

Xiang Li, Ning-Jiang Chen, You-Chang Xu and Rangsarit Pesayanavin

Network Evolution via Preference and Coordination Game 579

En-Ming Dong, Jian-Ping Li and Zheng Xie


for Collaborative Target Tracking in Wireless Sensor Network 585

Yong-Jian Yang, Xiao-Guang Fan, Sheng-Da Wang, Zhen-Fu Zhuo,

Jian Ma and Biao Wang

Generalized Hybrid Carrier Modulation System Based M-WFRFT with Partial

FFT Demodulation over Doubly Selective Channels 592

Yong Li, Zhi-Qun Song and Xue-Jun Sha

On the Benefits of Network Coding for Unicast Application in Opportunistic

Traffic Offloading 598

Jia-Ke Jiao, Da-Ru Pan, Ke Lv and Li-Fen Sun

A Geometric Graph Model of Citation Networks with Linearly Growing

Node-Increment 605

Qi Liu, Zheng Xie, En-Ming Dong and Jian-Ping Li

Complex System in Scientific Knowledge 612

Zong-Lin Xie, Zheng Xie, Jian-Ping Li and Xiao-Jun Duan

Two-Wavelength Transport of Intensity Equation for Phase Unwrapping 618

Cheng Zhang, Hong Cheng, Chuan Shen, Fen Zhang, Wen-Xia Bao,

Sui Wei, Chao Han, Jie Fang and Yun Xia

A Study of Filtering Method for Accurate Indoor Positioning System Using

Bluetooth Low Energy Beacons 624

Young Hyun Jin, Wonseob Jang, Bin Li, Soo Jeong Kwon,

Sung Hoon Lim and Andy Kyung-yong Yoon

Author Index 637

Fuzzy Control, Theory and System


Fuzzy Systems and Data Mining II 3

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-3

Computational Method for High Order

Fuzzy Time Series Forecasting

Sukhdev S. GANGWAR and Sanjay KUMAR1

Department of Mathematics, Statistics & Computer Science, G. B. Pant University of

Agriculture & Technology, Pantnagar-263145, Uttarakhand, India

Abstract. The computation of complex fuzzy logical relations and the search for an appropriate defuzzification process have been important areas of research in fuzzy time series forecasting since its inception. In the present study, a cumulative probability distribution based computational scheme with a discretized universe of discourse is proposed for fuzzy time series forecasting. In this study, the cumulative probability distribution decides the lengths of the intervals using the characteristics of the data distribution, and the proposed computational algorithm minimizes the calculation of complex fuzzy logical relations and the search for a suitable defuzzification method. To verify the enhancement in forecasting accuracy of the developed model, it is applied to the benchmark problem of forecasting historical student enrollments at the University of Alabama. The accuracy of the enrollments forecast by the developed model is also compared with that of various other methods using different error measures. Coefficients of correlation and determination are used to determine the strength of association between forecasted and actual enrollments.

forecasting

1. Introduction

Parametric models (AR, MA, ARMA, ARIMA, etc.) are comprehensive statistical techniques used for forecasting. An important limitation of these parametric forecasting models is that they cannot tackle the uncertainty in time series data that arises from imprecision and vagueness. Song and Chissom [1, 2, 3] integrated the fuzzy set theory of Zadeh [4] with time series forecasting and developed several models of fuzzy time series (FTS) forecasting to handle the uncertainty in historical time series data and forecast enrollments of the University of Alabama. Chen [5] and Hwang et al. [6] used simple arithmetic operations and variations of the enrollments between the current and previous year to develop more efficient FTS forecasting methods than those presented by Song and Chissom [1, 2, 3]. Own and Yu [7] proposed a high order forecasting model to address the limitations of the model developed by Chen [5].

1 Corresponding Author: Sanjay KUMAR, Department of Mathematics, Statistics & Computer Science, G. B. Pant University of Agriculture & Technology, Pantnagar-263145, Uttarakhand, India; E-mail: skruhela@hotmail.com.

4 S.S. Gangwar and S. Kumar / Cumulative Probability Distribution

Wong et al. [8] utilized the window size of FTS to propose a time variant forecasting model. The performance of this model was tested using time series data of enrollments of the University of Alabama and the TAIEX. Kai et al. [9] used the K-means clustering technique to discretize the universe of discourse and proposed an enhanced fuzzy time series forecasting model. Chen and Tanuwijaya [10] presented new methods to handle forecasting problems using high-order fuzzy logical relationships and automatic clustering techniques.

Cheng et al. [11] discretized the universe of discourse (UD) using the minimum entropy principle and used trapezoidal membership functions to enhance accuracy in FTS forecasting. Huarng and Yu [12] used a ratio-based method to identify the lengths of intervals in fuzzy time series forecasting, which was further enhanced by Yolcu et al. [13] using a single-variable constrained optimization technique. Teoh et al. [14] used the cumulative probability distribution approach (CPDA) with rough set rule induction and proposed a hybrid FTS model. Su et al. [15] used MEPA, CPDA and a rough set algorithm to develop a new model for FTS forecasting.

Fuzzy relational equations and a suitable defuzzification process are the pivotal components of any fuzzy time series forecasting method. To minimize the time spent generating fuzzy relational equations through the complex min-max composition operation, and to eliminate the search for a suitable defuzzification process, Singh [16, 17, 18] proposed various computational methods using a difference parameter as the fuzzy relation for FTS forecasting. Joshi and Kumar [19] also presented a computational method using the third order difference as the fuzzy relation. To enhance the performance of the computational FTS forecasting method, Gangwar and Kumar [20] developed a computational algorithm using high order difference parameters and implemented it in a discretized universe of discourse. Intuitionistic fuzzy sets (IFS) were used with CPDA by Gangwar and Kumar [21] to introduce hesitation in FTS forecasting with unequal intervals.

The UD in all computational methods was partitioned into intervals of equal length. In some cases, the discretization of the universe of discourse into equal length intervals may not give a correct classification of the time series data. The motivation and intention of this study is to present a computational method using high order difference parameters as the fuzzy relation with a discretized UD in which the lengths of the intervals are optimized using CPDA. The proposed algorithm eliminates the time spent building relational equations through tedious min-max composition operations and the defuzzification process. The developed FTS forecasting method has been applied to the benchmark problem of forecasting student enrollment data of the University of Alabama and compared with other recent methods proposed by various researchers.

Let U = {u_1, u_2, u_3, ..., u_n} be a UD. A fuzzy set Ã_i of U is defined as follows:

Ã_i = μ_Ãi(u_1)/u_1 + μ_Ãi(u_2)/u_2 + μ_Ãi(u_3)/u_3 + ... + μ_Ãi(u_n)/u_n

Here μ_Ãi is the membership function of the fuzzy set Ã_i and assigns a value in [0, 1] to each element of U. μ_Ãi(u_k) (1 ≤ k ≤ n) is the grade of membership of u_k in Ã_i. Suppose fuzzy sets f_i(t), (i = 1, 2, ...) are defined on the universe of discourse Y(t). If F(t) is the collection of f_i(t), then F(t) is known as a fuzzy time series on Y(t) [1]. F(t) and Y(t) depend upon t and hence both are functions of time. If only F(t−1) causes F(t), i.e. F(t−1) → F(t), then the relationship is denoted by the fuzzy relational equation F(t) = F(t−1) ∘ R(t, t−1) and is called the first-order model of F(t), where ‘∘’ is the max–min composition operator. If more than one fuzzy set F(t−n), F(t−n+1), ..., F(t−1) causes F(t), then the relationship is called an nth order fuzzy time series model [1, 2].

The proposed FTS method uses CPDA to discretize the UD. It uses the ratio formula [20] for determining the number of partitions. The order of the difference parameter used in a forecast is computed as follows:

- For the year 1973 enrollment forecast, the proposed computational method uses the second order difference parameter D_2 = |E_2 − E_1|.
- For the year 1974 enrollment forecast, the proposed computational method uses the third order difference parameter D_3 = |E_3 − E_2| − |E_2 − E_1|.
- For the year 1975 enrollment forecast, the proposed computational method uses the fourth order difference parameter D_4 = |E_4 − E_3| − |E_3 − E_2| − |E_2 − E_1|.

The ith order difference parameter is defined as follows:

D_i = |E_i − E_{i−1}| − [ Σ_{c=1}^{i−2} |E_{i−c} − E_{i−(c+1)}| ],  2 ≤ i ≤ N      (1)
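Read this way, Eq. (1) can be sketched in Python. This is an illustrative reading that matches the worked examples for D_2, D_3 and D_4 above, not code from the paper:

```python
def diff_param(E, i):
    """i-th order difference parameter D_i (Eq. 1), for 2 <= i <= len(E).

    E holds the enrollments E_1, ..., E_N (0-indexed in Python).
    D_i = |E_i - E_{i-1}| - sum_{c=1}^{i-2} |E_{i-c} - E_{i-(c+1)}|
    """
    if not 2 <= i <= len(E):
        raise ValueError("order i must satisfy 2 <= i <= N")
    head = abs(E[i - 1] - E[i - 2])  # |E_i - E_{i-1}|
    tail = sum(abs(E[i - c - 1] - E[i - c - 2]) for c in range(1, i - 1))
    return head - tail

# Enrollments E_1..E_4 from Table 1 (years 1971-1974):
E = [13055, 13563, 13867, 14696]
print(diff_param(E, 2))  # D_2 = |E_2 - E_1| = 508
print(diff_param(E, 3))  # D_3 = |E_3 - E_2| - |E_2 - E_1| = -204
print(diff_param(E, 4))  # D_4 = |E_4 - E_3| - |E_3 - E_2| - |E_2 - E_1| = 17
```

Note that D_i may be negative, which the algorithm below tolerates since it only shifts candidates around the interval mid point.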

¬c 1 ¼ (1)

The methodology of proposed computational algorithm based FTS forecasting

method is explained in following steps:

Step 1: Since a normal distribution is an essential constraint for CPDA, we use the Lilliefors test of Dallal and Wilkinson [22] to verify whether the time series data follow a normal distribution or not. If the time series data follow a normal distribution, go to Step 2.
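Step 1 can be approximated in plain Python: the Lilliefors statistic is the Kolmogorov–Smirnov distance between the empirical CDF and a normal CDF whose mean and standard deviation are estimated from the sample. The 5% critical value below uses the common large-sample approximation 0.886/√n rather than the exact Dallal–Wilkinson formula, so this is a rough pre-check, not the paper's exact test:

```python
from statistics import NormalDist, fmean, stdev

def lilliefors_statistic(data):
    """KS distance against a normal CDF with mean and standard
    deviation estimated from the sample itself."""
    n = len(data)
    dist = NormalDist(fmean(data), stdev(data))
    d = 0.0
    for k, x in enumerate(sorted(data), start=1):
        cdf = dist.cdf(x)
        # Compare the fitted CDF against both steps of the empirical CDF.
        d = max(d, abs(cdf - k / n), abs(cdf - (k - 1) / n))
    return d

def looks_normal(data):
    """Rough check at the 5% level using the 0.886/sqrt(n) approximation."""
    crit = 0.886 / len(data) ** 0.5
    return lilliefors_statistic(data) < crit
```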

Step 2: The standard deviation (σ) is the main characteristic of a normal distribution and is used to define the universe of discourse, U = [E_min − σ, E_max + σ].

Step 3: U is discretized into n intervals. The lengths of these intervals are determined using CPDA in the following sub-steps:

1. Calculate both the lower bound (P_LB) and the upper bound (P_UB) of the cumulative probabilities using the following equations:

   P_LB^1 = 0;  P_LB^i = (2i − 3)/(2n),  2 ≤ i ≤ n      (2)

   P_UB^i = 1,  i = n      (3)

2. Compute the inverse of the cumulative distribution function (CDF) with parameters mean (c) and standard deviation (σ) at the corresponding probabilities in P, where

   P = F(x | c, σ) = (1/(σ√(2π))) ∫_{−∞}^{x} exp{−(t − c)²/(2σ²)} dt      (5)
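The sub-steps above can be sketched as follows. Since the equation defining the upper-bound probabilities for i < n did not survive extraction, the sketch assumes the symmetric form P_UB^i = (2i − 1)/(2n), which makes adjacent intervals contiguous; interval end points are inverse-normal-CDF values clipped to U:

```python
from statistics import NormalDist, fmean, pstdev

def cpda_intervals(data, n=7):
    """Unequal interval bounds for the universe of discourse via CPDA.

    Assumption: P_UB^i = (2i - 1)/(2n) for i < n (the source equation
    for these upper bounds was lost); P_LB and the end cases follow
    Eqs. (2)-(3).
    """
    mu, sigma = fmean(data), pstdev(data)
    lo, hi = min(data) - sigma, max(data) + sigma  # U = [Emin - s, Emax + s]
    dist = NormalDist(mu, sigma)

    def bound(p):
        # Map a cumulative probability to a point of U via the inverse CDF.
        if p <= 0.0:
            return lo
        if p >= 1.0:
            return hi
        return min(max(dist.inv_cdf(p), lo), hi)

    p_lb = [0.0] + [(2 * i - 3) / (2 * n) for i in range(2, n + 1)]  # Eq. (2)
    p_ub = [(2 * i - 1) / (2 * n) for i in range(1, n)] + [1.0]      # assumed + Eq. (3)
    return [(bound(a), bound(b)) for a, b in zip(p_lb, p_ub)]
```

With this convention each interval's upper bound coincides with the next interval's lower bound, so the n intervals tile U without gaps.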

Step 4: Construct triangular fuzzy sets Ã_i in accordance with the intervals constructed in Step 3.

Step 5: Fuzzify the observations of the time series by choosing the maximum membership grade, and set up the fuzzy logical relationships.
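Steps 4 and 5 amount to assigning each observation the fuzzy set in which it has the highest membership. A minimal sketch, with illustrative triangle parameters rather than the paper's intervals:

```python
def tri_membership(x, a, b, c):
    """Membership of x in a triangular fuzzy set with support (a, c) and peak b."""
    if a < x <= b:
        return (x - a) / (b - a)  # rising edge
    if b < x < c:
        return (c - x) / (c - b)  # falling edge
    return 0.0

def fuzzify(x, sets):
    """Step 5: index of the fuzzy set in which x has the maximum membership grade."""
    return max(range(len(sets)), key=lambda k: tri_membership(x, *sets[k]))

# Illustrative fuzzy sets A_1..A_3 over a toy universe:
sets = [(0, 5, 10), (5, 10, 15), (10, 15, 20)]
print(fuzzify(6, sets))   # 0: membership 0.8 in the first set beats 0.2 in the second
print(fuzzify(12, sets))  # 1: membership 0.6 in the second set beats 0.4 in the third
```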

Step 6: Use the ratio formula [20] to repartition the time series into different partitions.

Step 7 Apply the following computational algorithm.

For a fuzzy logical relation Ã_i → Ã_j, Ã_i and Ã_j are the fuzzified enrollments of the current year and the next year, and E_i and F_j are the actual enrollment of the current year and the crisp forecasted enrollment of the next year.

Computational algorithm: The forecasted enrollments of the University of Alabama are computed using the following computational algorithm, whose complexity is of linear order. The algorithm uses the difference parameters (D_i) of various orders and the lower and upper bounds of the intervals. For a fuzzy logical relation Ã_i → Ã_j, it uses the mid points of the intervals u_i and u_j having supremum membership value in Ã_i and Ã_j. The algorithm starts to forecast the enrollment for year 1973 in partition 1, 1981 in partition 2 and 1988 in partition 3 using the second order difference parameter. In the following algorithm, [*Ã_j] is the interval u_j for which the membership in Ã_j is supremum (i.e. 1), L[*Ã_j] and U[*Ã_j] are the lower and upper bounds of the interval u_j, and l[*Ã_j] and M[*Ã_j] are the length and mid point of the interval u_j whose membership in Ã_j is supremum (i.e. 1).

For the obtained fuzzy logical relation from year i to year i+1, Ã_i → Ã_j:

    P = 0 and Q = 0
    Compute D_i = |E_i − E_{i−1}| − [ Σ_{c=1}^{i−2} |E_{i−c} − E_{i−(c+1)}| ]
    For a = 2, 3, ..., i
        F_ia = M[*Ã_i] + D_i/(2(a−1))
        FF_ia = M[*Ã_i] − D_i/(2(a−1))
        If F_ia ≥ L[*Ã_j] and F_ia ≤ U[*Ã_j]
            Then P = P + F_ia and Q = Q + 1
            If F_ia ≥ M[*Ã_j]
                Then P = P + l[*Ã_j]/(2(i−1)(2(a−1))²)
                Else P = P − l[*Ã_j]/(2(i−1)(2(a−1))²)
        If FF_ia ≥ L[*Ã_j] and FF_ia ≤ U[*Ã_j]
            Then P = P + FF_ia and Q = Q + 1
            If FF_ia ≥ M[*Ã_j]
                Then P = P + l[*Ã_j]/(2(i−1)(2(a−1))²)
                Else P = P − l[*Ã_j]/(2(i−1)(2(a−1))²)
    Next a
    F_j = (P + M[*Ã_j])/(Q + 1)
    Next i
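A direct Python transcription may make the control flow easier to follow. Since the pseudocode's indentation was lost in extraction, the nesting of the mid-point adjustment inside the interval test is an assumption; the interval parameters are passed in explicitly:

```python
def forecast_next(mid_i, low_j, up_j, mid_j, len_j, D, i):
    """Crisp forecast F_j for a relation A_i -> A_j.

    mid_i: mid point of the interval with supremum membership in A_i;
    low_j, up_j, mid_j, len_j: bounds, mid point and length of the
    corresponding interval for A_j; D: i-th order difference parameter.
    """
    P, Q = 0.0, 0
    for a in range(2, i + 1):
        step = D / (2 * (a - 1))
        for cand in (mid_i + step, mid_i - step):  # F_ia and FF_ia
            if low_j <= cand <= up_j:              # candidate falls in u_j
                P += cand
                Q += 1
                adj = len_j / (2 * (i - 1) * (2 * (a - 1)) ** 2)
                P += adj if cand >= mid_j else -adj
    # Degenerate case Q = 0 falls back to the interval mid point.
    return (P + mid_j) / (Q + 1)
```

For example, with i = 2, D = 100 and both mid points at 14750 on the interval [14500, 15000], both candidates 14800 and 14700 fall in the interval, the two adjustments cancel, and the forecast is exactly the mid point 14750.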

We use the root mean square error (RMSE) and the average forecasting error rate (AFER) to compare the forecasting results of the different forecasting methods. Coefficients of correlation and determination are used to determine the strength of association between the actual and forecasted enrollments of the University of Alabama.
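These error measures can be computed as below; AFER is taken here as the mean absolute percentage error of the forecasts, which matches the magnitudes reported in Table 2, and R is the Pearson coefficient (R² gives the coefficient of determination):

```python
from math import sqrt

def rmse(actual, forecast):
    """Root mean square error."""
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def afer(actual, forecast):
    """Average forecasting error rate, in percent."""
    return 100 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

def correlation(actual, forecast):
    """Pearson coefficient of correlation R."""
    n = len(actual)
    ma, mf = sum(actual) / n, sum(forecast) / n
    cov = sum((a - ma) * (f - mf) for a, f in zip(actual, forecast))
    va = sum((a - ma) ** 2 for a in actual)
    vf = sum((f - mf) ** 2 for f in forecast)
    return cov / sqrt(va * vf)
```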

3. Experimental Study

The proposed method is applied to the benchmark time series of student enrollments at the University of Alabama. An online Lilliefors calculator confirms that the time series data obey a normal distribution. E_min and E_max are observed from the actual enrollments at the University of Alabama (Table 1). The UD is defined as U = [E_min − σ, E_max + σ], which is approximately equal to [11280, 21112]. The UD is further discretized into seven unequal intervals; both P_LB and P_UB for each interval are computed using Eqs. (2)–(5). Seven fuzzy sets Ã_1, Ã_2, Ã_3, ..., Ã_7 are defined on the UD. The time series data is partitioned into three parts using the ratio expression [20]. Finally, the computational algorithm described above is applied to each partition to compute the forecasted enrollments of the University of Alabama. The forecasted enrollments are presented in Table 1. Table 2 (a & b) shows the RMSE and AFER of the forecasted enrollments.

Table 1. Actual and forecasted enrollments of the University of Alabama from year 1971 to year 1992.

Year   Actual   Forecasted     Year   Actual   Forecasted
1971   13055    -              1982   15433    15502
1972   13563    -              1983   15497    15332
1973   13867    13993          1984   15145    15332
1974   14696    14392          1985   15163    15332
1976   15311    15332          1987   16859    -
1977   15603    15332          1988   18150    18478
1978   15861    15875          1989   18970    19356
1979   16807    -              1990   19328    19356
1980   16919    -              1991   19337    19356
1981   16388    16696          1992   18876    19356

To verify the efficiency of the proposed FTS forecasting method, it has been implemented to forecast enrollments of the University of Alabama. The RMSE and AFER of the enrollments forecast by the proposed method are observed to be 240.20 and 1.183 respectively (Table 2a), which are less than those of the methods proposed by Liu [23], Cheng et al. [24], Wong et al. [8], Egrioglu [25], Singh [18], Joshi and Kumar [19], Gangwar and Kumar [21], Gang and Hong-wei [26], and Gangwar and Kumar [20]. The diminished RMSE and AFER confirm that the proposed CPDA and computational algorithm based FTS forecasting method outperforms the methods given in [23, 24, 8, 25, 18, 19, 21, 26, 20]. The coefficient of correlation (R) and coefficient of determination (R²) between actual and forecasted enrollments were observed to be 0.994294 and 0.988622, which confirms a strong association between actual and forecasted enrollments.

Table 2a. Comparison of the proposed method in terms of error measures.

        Proposed   [23]     [8]     [18]    [20]
RMSE    240.2      328.78   297.2   308.7   642.6
AFER    1.183      1.32     1.52    1.53    2.97

Table 2b. Comparison of the proposed method with further methods.

        Proposed
RMSE    240.2      478.4    484.6   440.6   251    419
AFER    1.183      2.40     2.21    2.06    1.27   2.07

5. Conclusions

This study presents a cumulative probability distribution based method for high order FTS forecasting to enhance the accuracy of forecasts. The fusion of the cumulative probability distribution with a computational method is proposed to give a hybrid fuzzy time series model. The computational algorithm based FTS forecasting methods reviewed in the literature use intervals of equal length and keep the order of the difference parameters fixed. The major advantages of this FTS forecasting method are: (i) it uses a computational algorithm whose complexity is of linear order with a partition mechanism of the UD, so forecasting time series data with a large number of observations is not a matter of concern; (ii) it uses CPDA to determine the lengths of the intervals used in forecasting; (iii) it reduces the intricate computation of fuzzy relational matrices and eliminates the need for a defuzzification method.

Even though the fusion of CPDA with computational approach in partitioned

environment enhances the accuracy in forecasted output, following are few limitations

with the proposed method.

1. It cannot be applied to time series data that do not follow a normal distribution.

2. Time series data are partitioned using the ratio ρ = (Emax − Emin) / 2(Emax + Emin). If 0 < ρ ≤ 1 then there will be no partition of the time series data. In this case, the difference parameters increase heavily and make the computation very complex.

3. If ρ ≥ N/2 (N = number of observations in the time series data), there will not be enough observations in the partitions for subsequent forecasting.

However, some preprocessing techniques can be explored to make time series data approximately normally distributed, to address the limitation of non-normally distributed time series data. There is also scope to explore the proposed method with the well-known k-means or other exclusive clustering techniques for partitioning the time series data rather than using the ratio formula.

References

[1] Q. Song, B. S. Chissom, Fuzzy time series and its models, Fuzzy Sets and Systems, 54(1993), 269-277.

[2] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part I, Fuzzy Sets and Systems,

54(1993), 1-9.

[3] Q. Song, B. S. Chissom, Forecasting enrollments with fuzzy time series - Part II., Fuzzy Sets and Systems,

64(1994), 1-8.

[4] L. A. Zadeh, Fuzzy sets, Information and Control, 8(1965), 338-353.

[5] S. M. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets and Systems, 81(1996), 311-

319.

[6] J. R. Hwang, S. M. Chen, C. H. Lee, Handling forecasting problem using fuzzy time series, Fuzzy Set

and System, 100(1998), 217-228.

[7] C. M. Own, P. T. Yu, Forecasting fuzzy time series on a heuristic high-order model, Cybernetics and

Systems: An International Journal, 36(2005), 705-717.

[8] W. K. Wong, E. Bai, A. W. C. Chu, Adaptive time variant models for fuzzy time series forecasting. IEEE

Transaction on Systems, Man and Cybernetics-Part B: Cybernetics, 40(2010), 1531-1542.

[9] K. Chi, F. P. Fu and W. G. Chen, A novel forecasting model of fuzzy time series based on K-means

clustering, IWETCS, IEEE, 2010, 223–225.

[10] S. M. Chen, K. Tanuwijaya, Fuzzy forecasting based on high-order fuzzy logical relationships and

automatic clustering techniques, Expert Systems with Applications, 38(2011), 15425-15437.

[11] C. H. Cheng, R. J. Chang, C. A. Yeh, Entropy-based and trapezoid fuzzification based fuzzy time series

approach for forecasting IT project cost, Technological Forecasting and Social Change, 73(2006), 524-

542.

[12] K. Huarng, T. H. K. Yu, Ratio-based lengths of intervals to improve fuzzy time series forecasting,

IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics, 36(2006), 328-340.

[13] U. Yolcu, E. Egrioglu, V. R. Uslu, M. A. Basaran, C. H. Aladag, A new approach for determining the

length of intervals for fuzzy time series, Applied Soft Computing, 9(2009), 647-651.

[14] H. J. Teoh, C. H. Cheng, H. H. Chu, J. S. Chen, Fuzzy Time Series Model Based on Probabilistic

Approach and Rough Set Rule Induction for Empirical Research in Stock Markets, Data & Knowledge

Engineering, 67(2008), 103–17.


[15] C. H. Su, T. L. Chen, C. H. Cheng, Y. C. Chen, Forecasting the Stock Market with Linguistic Rules

Generated from the Minimize Entropy Principle and the Cumulative Probability Distribution

Approaches, Entropy, 12(2010), 2397-417.

[16] S. R. Singh, A robust method of forecasting based on fuzzy time series, Applied Mathematics and

Computation, 188(2007), 472-484.

[17] S. R. Singh, A simple time variant method for fuzzy time series forecasting, Cybernetics and Systems:

An International Journal, 38(2007), 305-321.

[18] S. R. Singh, A computational method of forecasting based on fuzzy time series, Mathematics and

Computers in Simulation, 79(2008), 539-554.

[19] B. P. Joshi, S. Kumar, A Computational method for fuzzy time series forecasting based on difference

parameters, International Journal of Modeling, Simulation and Scientific Computing, 4(2013),

1250023-1250035.

[20] S. S. Gangwar, S. Kumar, Partitions based computational method for high-order fuzzy time series

forecasting, Expert Systems with Applications, 39(2012), 12158-12164.

[21] S. S. Gangwar, S. Kumar, Probabilistic and intuitionistic fuzzy sets based method for fuzzy time series

forecasting, Cybernetics and Systems, 45(2014), 349-361.

[22] G. E. Dallal, L. Wilkinson, An Analytic Approximation to the Distribution of Lilliefors’s Test for

Normality, The American Statistician, 40(1986), 294-296.

[23] H. T. Liu, An improved fuzzy time series forecasting method using trapezoidal fuzzy numbers, Fuzzy

Optimization and Decision Making, 6(2007), 63-80.

[24] C. H. Cheng, J. W. Wang, G. W. Cheng, Multi-attribute fuzzy time series method based on fuzzy

clustering, Expert Systems with Applications, 34(2008), 1235-1242.

[25] E. Egrioglu, A new time-invariant fuzzy time series forecasting method based on genetic algorithm,

Advances in Fuzzy Systems, 2012, 2.

[26] G. Chen, H. W. Qu, A new forecasting method of fuzzy time series model, Control and Decision,

28(2013), 105-109.

Fuzzy Systems and Data Mining II 11

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-11

Introduction to Fuzzy Dual Mathematical Programming

Carlos A. N. COSENZA a, Fabio KRYKHTINE a, Walid El MOUDANI b

and Felix A. C. MORA-CAMINO c,1

a Lab Fuzzy, COPPE, Universidade Federal do Rio de Janeiro, Centro de Tecnologia, Ilha do Fundão, CEP 21941-594 Rio de Janeiro, RJ, Brazil

b Doctorate School of Sciences and Technologies, Lebanese University, Tripoli-Al Koubba, Lebanon

c ENAC, Toulouse University, 7 avenue Edouard Belin, 31055 Toulouse, France

Abstract. A new mathematical programming formalism based on fuzzy dual parameters and variables is introduced to cope with parametric or implementation uncertainties. It is shown that fuzzy dual programming problems generate finite sets of deterministic optimization problems, allowing the assessment of the range of the solutions and of the resulting performance at an acceptable computational effort.

Introduction

Mathematical programming approaches generally assume that the parameters of the problem (cost coefficients, limit values for decision variables, boundary levels for constraints) are perfectly known, while very often for real problems this is not exactly the case [1]. Different approaches have been proposed in the literature to cope with this difficulty. A first approach has been to perform a numerical post-optimization sensitivity analysis around the nominal optimal solution [2]. When some probabilistic information about the values of the uncertain parameters is available, stochastic optimization techniques [3] may provide the most expected optimal solution. When these parameters are only known to remain within some intervals, robust optimization techniques [4] have been developed to provide robust solutions. The fuzzy formalism has also been considered in this case as an intermediate approach to represent the parameter uncertainties and provide fuzzy solutions [5]. These different approaches generally result in a very large amount of computation, which makes them practically infeasible.

In this communication, a new formalism based on fuzzy dual numbers is proposed

to diminish the computational burden when dealing with uncertainty in mathematical

programming problems.

The adopted formalism considers fuzzy dual numbers which have been introduced

recently by two of the authors [6] and which can be seen as a simplified version of

1 Corresponding Author: Felix A. C. MORA-CAMINO; ENAC, Toulouse University, 7 avenue Edouard Belin, 31055 Toulouse, France; E-mail: felix.mora@enac.fr

12 C.A.N. Cosenza et al. / Introduction to Fuzzy Dual Mathematical Programming

fuzzy numbers adopting some elements of classical dual number calculus [7, 8]. Indeed, the proposed special class of numbers, fuzzy dual numbers, integrates the nilpotent operator ε of dual number theory while considering symmetrical fuzzy numbers. Uncertain values are then characterized by only three parameters: a mean value, an uncertainty interval and a shape parameter.

In this communication, the elements of fuzzy dual calculus useful to tackle the proposed issue are first introduced: basic operations as well as strong and weak fuzzy dual partial orders and fuzzy dual equality. Then two classes of fuzzy dual mathematical programming problems are considered: those where uncertainty lies only in the parameters of the problem and those for which the implementation of the solution is subject to uncertainty. In both situations, the proposed formalism is developed and used to identify the expected performance of the solutions.

The set of fuzzy dual numbers is the set Δ̃ of numbers of the form u = a + εb with a ∈ R, b ∈ R+, where r(u) = a is the primal part and d(u) = b is the dual part of the fuzzy dual number.

A crisp fuzzy dual number is one for which b is equal to zero, losing its fuzzy dual attribute. To each fuzzy dual number a + εb is attached a symmetrical fuzzy number whose membership function μ is such that:

  μ(x) = 0 if x ≤ a − b or x ≥ a + b,
  μ(x) = μ(2a − x) for x ∈ [a − b, a + b]   (1)

Different basic operations can be defined on Δ̃ [9]. First, the fuzzy dual addition ⊕ is given by:

  (x1 + εy1) ⊕ (x2 + εy2) = (x1 + x2) + ε(y1 + y2)   (2)

The fuzzy dual multiplication ⊗ is given by:

  (x1 + εy1) ⊗ (x2 + εy2) = x1·x2 + ε(|x1|·y2 + |x2|·y1)   (3)

The fuzzy dual product has been chosen here in a way that preserves the fuzzy interpretation of the dual part of the fuzzy dual numbers, so it is different from the product of dual calculus. The neutral element of the fuzzy dual multiplication is (1 + ε0), written 1̃.


It is easy to check that internal operations such as the fuzzy dual addition and the fuzzy dual multiplication are commutative and associative. The fuzzy dual multiplication is distributive with respect to the fuzzy dual addition since, according to Eq. (3), the operator ε is such that:

  ε ⊗ ε = 0̃   (4)

Compared with common fuzzy calculus, fuzzy dual calculus appears to be much less demanding in computer resources [10, 11].
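The basic operations of Eqs. (2) and (3) can be sketched as follows (a minimal Python illustration; the class name is ours, and the absolute values in the dual part of the product are taken so that the dual part stays non-negative, as the definition requires):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FuzzyDual:
    a: float  # primal part r(u)
    b: float  # dual part d(u), a non-negative uncertainty half-width

    def __add__(self, other):
        # Eq. (2): primal parts add, uncertainty half-widths add
        return FuzzyDual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # Eq. (3): absolute values keep the dual part non-negative
        return FuzzyDual(self.a * other.a,
                         abs(self.a) * other.b + abs(other.a) * self.b)
```

For example, (2 + ε1) ⊗ (−3 + ε0.5) has primal part −6 and dual part |2|·0.5 + |−3|·1 = 4, so the uncertainty grows with the magnitudes of the operands.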

Let E be a Euclidean space of dimension p over R; then we define the set of fuzzy dual vectors Ẽ as the pairs of vectors taken from the Cartesian product E × E+, where E+ is the positive half-space of E. Basic operations can be defined over Ẽ:

Addition:

  (a + εb) ⊕ (c + εd) = (a + c) + ε(b + d)   ∀ a + εb, c + εd ∈ Ẽ   (5)

Multiplication by a fuzzy dual scalar:

  (λ + εμ)(a + εb) = λa + ε(|λ|b + μ|a|)   ∀ λ + εμ ∈ Δ̃, a + εb ∈ Ẽ   (6)

Fuzzy dual inner product:

  u ∗ v = r(u)·r(v) + ε(|r(u)|·d(v) + d(u)·|r(v)|)   ∀ u, v ∈ Ẽ   (7)

where "∗" represents the inner product in Ẽ and "·" represents the inner product in E.

In order to make possible the comparison of fuzzy dual numbers, as well as the identification of extremum values between fuzzy dual numbers, a new operator from Δ̃ to R+, called the fuzzy dual pseudo norm, is introduced:

  ∀ a + εb ∈ Δ̃ :  ‖a + εb‖_D = |a| + ρb ∈ R+   (8)

where ρ is a shape parameter associated with the considered fuzzy dual number, which is given by:


  ρ = (1 / 2b) ∫_{x∈R} μ(x) dx ∈ [0, 1]   (9)

For triangular membership functions, ρ = 1/2, while for crisp fuzzy dual numbers, ρ = 0. In this paper it is supposed that the considered fuzzy dual numbers have the same shape, i.e. a common ρ value.

It is straightforward to establish that the operator defined in Eq. (8), whatever the value of the shape parameter, satisfies the characteristic properties of a norm:

  ∀ a + εb ∈ Δ̃ :  ‖a + εb‖_D ≥ 0   (10)

  ∀ a ∈ R, b ∈ R+ :  ‖a + εb‖_D = 0 ⟺ a = b = 0   (11)

  ∀ u, v ∈ Δ̃ :  ‖u ⊕ v‖_D ≤ ‖u‖_D + ‖v‖_D   (12)

  ‖λ(a + εb)‖_D = |λ|·‖a + εb‖_D   ∀ a, λ ∈ R, b ∈ R+   (13)

However, since the set of fuzzy dual numbers Δ̃ is not a vector space, the proposed operator can only be regarded as a pseudo norm.

The fuzzy dual pseudo norm of a fuzzy dual vector u can be introduced as (here ‖·‖ is the Euclidean norm associated to E):

  ‖u‖_D = ‖r(u)‖ + ρ‖d(u)‖   (14)

Partial orders between fuzzy dual numbers can be introduced using this pseudo norm. Depending on whether the fuzzy dual numbers overlap or not, strong and weak partial orders can be introduced.

A strong fuzzy dual partial order, written ≥, is defined over Δ̃ by:

  ∀ a1 + εb1, a2 + εb2 ∈ Δ̃ :  a1 + εb1 ≥ a2 + εb2 ⟺ a1 − ρb1 ≥ a2 + ρb2   (15)

In that case there is no overlap between the membership functions associated with the two fuzzy dual numbers and the first one is definitely larger than the second one.

A weak fuzzy dual partial order, written ≳, is defined over Δ̃ by:

  ∀ a1 + εb1, a2 + εb2 ∈ Δ̃ :  a1 + εb1 ≳ a2 + εb2 ⟺
  a1 + ρb1 ≥ a2 + ρb2 ≥ a1 − ρb1 ≥ a2 − ρb2   (16)

In that case there is an overlap between the membership functions associated with the two fuzzy dual numbers and the first one appears to be partially larger than the second one.


A fuzzy dual equality, written ≈, can be defined between two fuzzy dual numbers by:

  ∀ a1 + εb1, a2 + εb2 ∈ Δ̃ :  a1 + εb1 ≈ a2 + εb2 ⟺
  a2 ∈ [a1 − ρb1, a1 + ρb1] and a1 ∈ [a2 − ρb2, a2 + ρb2]   (17-a)

or, equivalently:

  a1 + ρb1 ≥ a2 + ρb2 ≥ a2 − ρb2 ≥ a1 − ρb1
  or a2 + ρb2 ≥ a1 + ρb1 ≥ a1 − ρb1 ≥ a2 − ρb2   (17-b)

In this last case there is a complete overlap of the membership functions associated with the two fuzzy dual numbers.

Then, when considering two fuzzy dual numbers, they will be in one of the above situations (no overlap, partial overlap or full overlap): strong fuzzy dual inequality, weak fuzzy dual inequality or fuzzy dual equality.
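The three situations can be sketched as follows (a minimal Python illustration; the function names and the fixed triangular shape parameter ρ = 1/2 are our assumptions, and the classification simply compares the supports [a − ρb, a + ρb] of the two numbers):

```python
RHO = 0.5  # common shape parameter, e.g. triangular membership per Eq. (9)

def pseudo_norm(a, b, rho=RHO):
    # Eq. (8): fuzzy dual pseudo norm |a| + rho*b
    return abs(a) + rho * b

def compare(n1, n2, rho=RHO):
    """Classify the relation between two fuzzy dual numbers (a, b):
    'strong' (no overlap of supports), 'equal' (full overlap, one support
    contained in the other), 'weak' (partial overlap)."""
    (a1, b1), (a2, b2) = n1, n2
    lo1, hi1 = a1 - rho * b1, a1 + rho * b1
    lo2, hi2 = a2 - rho * b2, a2 + rho * b2
    if lo1 >= hi2 or lo2 >= hi1:
        return "strong"
    if (lo2 >= lo1 and hi2 <= hi1) or (lo1 >= lo2 and hi1 <= hi2):
        return "equal"
    return "weak"
```

For instance, 5 + ε1 and 1 + ε1 have disjoint supports [4.5, 5.5] and [0.5, 1.5], so they are in a strong fuzzy dual inequality.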

The max and min operators over two or more fuzzy dual numbers can now be defined. Let c + εγ be the fuzzy dual maximum of the fuzzy dual numbers a + εα and b + εβ:

  c + εγ = max {a + εα, b + εβ}   (18)

and let d + εδ be their fuzzy dual minimum:

  d + εδ = min {a + εα, b + εβ}   (20)

Observe that here the max and min operators produce new fuzzy dual numbers. These elements are now used to formulate fuzzy dual mathematical programming problems.


3.1. Discussion

To illustrate the proposed approach, the case of a linear programming problem with real variables where all parameters are uncertain and described by fuzzy dual numbers is considered. The proposed approach can be easily extended to integer or nonlinear mathematical programming problems, or to problems with different types of level constraints.

Let us then define formally problem L̃, given by:

  min_{x∈R^n} Σ_{i=1}^n c̃_i x_i   (22)

subject to

  Σ_{i=1}^n ã_ki x_i ≳ b̃_k,  k ∈ {1, …, m}   (23)

and

  x_i ∈ R,  i ∈ {1, …, n}   (24)

When the problem is a constrained cost minimization problem, the cost parameters c̃_i, although uncertain, remain positive, and the absolute value operator can be removed from the expression of Eq. (22). Here the fuzzy dual hypothesis is adopted for the cost coefficients c_i, the technical parameters a_ki and the constraint levels b_k. This opens different perspectives to be considered when dealing with the parameter uncertainty. Here three different cases are considered:

- the nominal case (a standard deterministic linear programming problem), in which the dual parts of the parameters are zero;
- the pessimistic case, where uncertainty adds to the cost and where the constraints are strong ones;
- the optimistic case, where uncertainty subtracts from the cost and the constraints are weak ones.

The nominal case corresponds to a standard mathematical programming problem. The analysis of the pessimistic case is developed here in more detail and can be transposed easily to the study of the optimistic case.

The pessimistic case leads to problem L+, a linear programming problem with fuzzy dual constraints and real decision variables, which is written as:

  min_{x∈R^n} Σ_{i=1}^n (c_i + εd_i) x_i   (25)

subject to

  Σ_{i=1}^n (a_ki + εα_ki) x_i ≥ b_k + εβ_k,  k ∈ {1, …, m}   (26)

and

  x_i ≥ 0,  i ∈ {1, …, n}   (27)


This problem corresponds to the minimization of the worst estimate of the total cost with satisfaction of strong level constraints. Here the variables x_i are supposed to take real positive values, but they could also take fully real or integer values. In the case in which the d_i are zero, the uncertainty is relative to the feasible set. Problem L+ is equivalent to the following problem in R^n:

  min_{x∈R^n} |Σ_{i=1}^n c_i x_i| + ρ Σ_{i=1}^n d_i x_i   (28)

subject to

  Σ_{i=1}^n (a_ki − ρα_ki) x_i ≥ b_k + ρβ_k,  k ∈ {1, …, m}   (29)

and

  x_i ≥ 0,  i ∈ {1, …, n}   (30)

The criterion of Eq. (28) combines the value of the nominal criterion and its degree of uncertainty. In the case in which the cost coefficients are positive, this problem reduces to a classical linear programming problem over R^n. In the general case, since the quantity Σ_{i=1}^n c_i x_i will have a particular sign at the solution, the solution x of problem L+ will be the one corresponding to the minimum of:

  min { Σ_{i=1}^n c_i x_i⁺ + ρ Σ_{i=1}^n d_i x_i⁺ ,  −Σ_{i=1}^n c_i x_i⁻ + ρ Σ_{i=1}^n d_i x_i⁻ }   (31)

where x⁺ is the solution of the problem:

  min_{x∈R^n} ( Σ_{i=1}^n c_i x_i + ρ Σ_{i=1}^n d_i x_i )   (32)

subject to the constraints of Eq. (29)   (33)

and

  Σ_{i=1}^n c_i x_i ≥ 0  and  x_i ≥ 0, i ∈ {1, …, n}   (34)

and where x⁻ is the solution of the problem:

  min_{x∈R^n} ( ρ Σ_{i=1}^n d_i x_i − Σ_{i=1}^n c_i x_i )   (35)


subject to the constraints of Eq. (29)   (36)

and

  Σ_{i=1}^n c_i x_i ≤ 0  and  x_i ≥ 0, i ∈ {1, …, n}   (37)

The fuzzy dual optimal performance of this program is then given by:

  Σ_{i=1}^n (c_i + εd_i) x_i* = Σ_{i=1}^n c_i x_i* + ε Σ_{i=1}^n d_i x_i*   (38)

Problems of Eqs. (32), (33) and (34) and of Eqs. (35), (36) and (37) are classical continuous linear programming problems which can be solved in acceptable time even for large-size problems.
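The crisp equivalent of the pessimistic problem can be sketched as follows (a minimal Python illustration; the function name is ours, and it assumes non-negative cost coefficients so that the absolute value in Eq. (28) can be dropped and a single classical LP results):

```python
RHO = 0.5  # shared shape parameter of the fuzzy dual coefficients

def pessimistic_lp(c, d, A, alpha, b, beta, rho=RHO):
    """Crisp data (Eqs. (28)-(30)) of the pessimistic problem L+ for
    non-negative costs: minimize c_p' x subject to A_p x >= b_p, x >= 0."""
    c_p = [ci + rho * di for ci, di in zip(c, d)]            # worst-case costs
    A_p = [[a - rho * al for a, al in zip(row, arow)]        # weakened left sides
           for row, arow in zip(A, alpha)]
    b_p = [bk + rho * bt for bk, bt in zip(b, beta)]         # strengthened levels
    return c_p, A_p, b_p
```

The returned crisp arrays can then be passed to any standard LP solver, so the fuzzy dual formulation adds only this inexpensive preprocessing step.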

The optimistic case leads to problem L-, written as:

  min_{x∈R^n} Σ_{i=1}^n c_i x_i − ρ Σ_{i=1}^n d_i x_i   (39)

subject to

  Σ_{i=1}^n (a_ki + ρα_ki) x_i ≥ b_k − ρβ_k,  k ∈ {1, …, m}   (40)

and

  x_i ≥ 0,  i ∈ {1, …, n}   (41)

The nominal case leads to problem L0, written as:

  min_{x∈R^n} Σ_{i=1}^n c_i x_i   (42)

subject to

  Σ_{i=1}^n a_ki x_i ≥ b_k,  k ∈ {1, …, m}   (43)

and

  x_i ≥ 0,  i ∈ {1, …, n}   (44)

Let x⁻ and x⁰ be the respective solutions of the problems of Eqs. (39), (40) and (41) and of Eqs. (42), (43) and (44). It is instructive to compare, in a first step, the performances of problems L+, L- and L0, where:

  Σ_{i=1}^n c_i x_i⁻ − ρ Σ_{i=1}^n d_i x_i⁻ ≤ Σ_{i=1}^n c_i x_i⁰ ≤ Σ_{i=1}^n c_i x_i⁺ + ρ Σ_{i=1}^n d_i x_i⁺   (45)


This allows the display of the dispersion of results between the pessimistic view of problem L+, the optimistic view of problem L- and the neutral view of problem L0.

Then, in a second step, since x⁺ is feasible for problems L- and L0, it is of interest to compare the different performances when adopting the x⁺ solution:

  Σ_{i=1}^n c_i x_i⁺ − ρ Σ_{i=1}^n d_i x_i⁺ ≤ Σ_{i=1}^n c_i x_i⁺ ≤ Σ_{i=1}^n c_i x_i⁺ + ρ Σ_{i=1}^n d_i x_i⁺   (46)

Now we consider fuzzy dual programming problems with fuzzy dual parameters and decision variables as well. In that case problem V is formulated as:

  min_{x_i∈R, y_i∈R+} Σ_{i=1}^n (c_i + εd_i)(x_i + εy_i)   (47)

subject to

  Σ_{i=1}^n (a_ki + εα_ki)(x_i + εy_i) ≥ b_k + εβ_k,  k ∈ {1, …, m}   (48)

and

  x_i ∈ R, y_i ≥ 0,  i ∈ {1, …, n}   (49)

The above problem corresponds to the minimization of the worst estimate of the total cost with satisfaction of strong level constraints when there is some uncertainty not only on the values of the parameters but also on the ability to implement exactly what should be the optimal solution. According to Eq. (3), problem V can be rewritten as:

  min_{x∈R^n, y∈R^n} Σ_{i=1}^n ( c_i x_i + ε(|x_i| d_i + |c_i| y_i) )   (50)

subject to

  Σ_{i=1}^n ( a_ki x_i + ε(α_ki |x_i| + |a_ki| y_i) ) ≥ b_k + εβ_k,  k ∈ {1, …, m}   (51)

Using the fuzzy dual pseudo norm, this leads to:

  min_{x∈R^n, y∈R^n+} C(x, y) = Σ_{i=1}^n c_i x_i + ρ Σ_{i=1}^n (d_i |x_i| + |c_i| y_i)   (52)

over the fuzzy dual feasible set:

  A(x, y) = { x ∈ R^n, y ∈ R^n+ :  Σ_{i=1}^n ( a_ki x_i − ρ(α_ki |x_i| + |a_ki| y_i) ) ≥ b_k + ρβ_k,  k ∈ {1, …, m} }   (54)

then


  ∀ x ∈ R^n, y ∈ R^n+ :  A(x, y) ⊆ A(x, 0) and C(x, y) ≥ C(x, 0)   (55)

It appears, as expected, that the case of no diversion from the nominal solution is always preferable. In the case in which the diversion from the nominal solution is fixed to ȳ_i, i ∈ {1, …, n}, problem V has the same solution as problem V', given by:

  min_{x∈R^n} Σ_{i=1}^n c_i x_i + ρ Σ_{i=1}^n d_i |x_i|   (56)

subject to

  Σ_{i=1}^n ( a_ki x_i − ρ α_ki |x_i| ) ≥ b_k + ρ( β_k + Σ_{i=1}^n |a_ki| ȳ_i ),  k ∈ {1, …, m}   (57)

The fuzzy dual optimal performance of problem V will then be given by:

  Σ_{i=1}^n c_i x_i* + ε Σ_{i=1}^n ( |x_i*| d_i + |c_i| ȳ_i )   (58)

Here also, other linear constraints involving the other partial order relations over Δ̃ (weak inequality and fuzzy equality) could be introduced in the formulation of problem V, while the consideration of the integer version of problem V would also lead to solving families of classical integer linear programming problems.

The performance of the solution of problem V will potentially be diminished by the reduction of the feasible set defined by Eq. (54).

5. Conclusion

In this communication, a new formalism has been proposed to handle mathematical programming problems presenting uncertainty in the values of their parameters or in the implementation of the values of the decision variables. A special class of fuzzy numbers, fuzzy dual numbers, has been defined in such a way that the interpretation of their dual part as an uncertainty level remains valid through the basic operations on these numbers. A pseudo norm has been introduced, allowing the comparison between fuzzy dual expressions and leading to the definition of hard and weak constraints to characterize fuzzy dual feasible sets. Mathematical programming problems with uncertain parameters and variables have been considered under this formalism. The proposed solution approach leads to solving a finite collection of classical mathematical programming problems corresponding to nominal and extreme cases, allowing the characterization of the expected optimal performance and solution. This results in a rather limited additional computational effort compared with classical approaches. The above approach could be easily extended to cope with fuzzy dual numbers of different shapes present in the same mathematical programming problem.

References

[1] M. Delgado, J. L. Verdegay and M. A. Vila, Imprecise costs in mathematical programming problems,

Control and Cybernetics, 16(1987), 114-121.

[2] T. Gal, H. J. Greenberg (Eds.), Advances in Sensitivity Analysis and Parametric Programming, Series:

International Series in Operations Research & Management Science, Vol. 6, Springer, 1997.


[3] A. Ruszczynski and A. Shapiro. Stochastic Programming. Handbooks in Operations Research and

Management Science, Vol. 10, Elsevier, 2003.

[4] A. Ben-Tal, L. El Ghaoui and A. Nemirovski, Robust Optimization. Princeton Series in Applied

Mathematics, Princeton University Press, 2009.

[5] H. J. Zimmermann, Fuzzy Sets Theory and Mathematical Programming, in A. Jones et al. (eds.), Fuzzy

Sets Theory and Applications, D. Reidel Publishing Company, 99-114, 1986.

[6] C. A. N. Cosenza and F. Mora-Camino, Nombres et ensembles duaux flous et applications, in French,

Technical report, LabFuzzy laboratory, COPPE/UFRJ, Rio de Janeiro, August 2011.

[7] W. Kosiński, On fuzzy number calculus, International Journal of Applied Mathematics and Computer

Science, 16(2006), 51-57.

[8] H. H. Cheng , Programming with Dual Numbers and its Application in Mechanism Design, Journal of

Engineering with Computers, 10(1994), 212-229.

[9] F. Mora-Camino, O. Lengerke and C. A. N. Cosenza, Fuzzy sets and dual numbers, an integrated

approach, Fuzzy sets and Knowledge Discovery Conference, Chongqing, China, 28-31 May 2012.

[10] H. Nasseri, Fuzzy Numbers: Positive and Nonnegative, International Mathematical Forum, 3(2006),

1777-1780.

[11] E. Pennestrì and R. Stefanelli, Linear algebra and numerical algorithms using dual numbers,

Multibody System Dynamics, 18(2007), 323-344.

22 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-22

Forecasting National Football League Game Outcomes Based on Fuzzy Candlestick Patterns

Yu-Chia HSU1

Department of Sports Information and Communication, National Taiwan University of

Sport, Taichung City, Taiwan.

Abstract. An approach for forecasting game outcomes based on the sports metric candlestick and fuzzy pattern recognition is proposed. The sports gambling market data are gathered and processed to form the candlestick chart, which has been widely used in financial time series analysis. Unlike the traditional candlestick, which is composed of prices for financial market analysis, the candlestick for sports metrics is determined by the point spread, the total points scored, and the gambling shock, which measures the bias between the gambling line and the real total points scored. The fluctuation behavior of sports outcomes is represented by the fuzzification of candlesticks for pattern recognition. A decision tree algorithm is applied to the fuzzified candlesticks to find implicit knowledge rules, and these rules are used to forecast the game outcomes. National Football League data are used in our empirical study to verify the effectiveness of the forecasting.

Introduction

Sports outcome prediction is an important area of betting on sports events, which has gained a lot of popularity recently. American Football, such as National Football League (NFL) games, uses a complex scoring system whose resulting scores are hard to model using standard modeling approaches. There are five ways to score in American Football, giving 2 points, 3 points, 6 points, or 7 points under different scoring situations. Other sports, such as soccer, baseball and basketball, award points under far fewer situations and are relatively much simpler. Consequently, standard modeling approaches, such as Poisson-type regression models, can provide impressive performance when modeling scores in soccer, but may perform worse when applied to American Football scores due to their peculiar distribution [1].

Much research on sport forecasting has demonstrated that the win/lose result of a game may be affected by past scores, offense/defense statistics, player absence [2], etc. Even the temperature, wind speed and moisture in the competition venue may potentially influence player performance. Most research adopts these influencing factors for quantitative analysis to estimate the points scored

1

Corresponding author: Yu-Chia Hsu, Dep. of Sports Information and Communication, National Taiwan

University of Sport, No. 16, Sec. 1, Shuang-Shih Rd., Taichung, Taiwan; E-mail: ychsu@ntupes.edu.tw.

Y.-C. Hsu / Forecasting National Football League Game Outcomes 23

The amount of sports data analyzed by academics and professionals has been growing exponentially since the appearance of the "Moneyball" phenomenon [3]. Although considering these explosive data as variables that influence the performance of players is important for coaches and managers of sports teams, there have been very few studies on modeling betting market data to forecast the winner of a game using non-parametric models based on computational intelligence techniques.

The market data in sport betting, such as odds, point spread and over/under, offer a type of predictor and a source of expert advice and expected probability regarding sports outcomes. Adopting the betting market data published by bookmakers in the prediction model can provide rather high forecasting accuracy [4]. This is reasonable because betting companies would not survive with inefficient odds and spreads.

The betting market has many characteristics similar to those of financial markets [5]. The three variants of the efficient market hypothesis (EMH), the "weak", "semi-strong" and "strong" forms, which reflect the relationship between current prices and the rationality and instantaneousness of information, have also been extended to the betting market, reflecting whether the line incorporates all relevant information contained in past game outcomes, all public information, and inside information [6].

Moreover, price fluctuations follow the mechanism known as the random walk model under some restrictions and conditions, so profitable forecasting models do not persist for a long time. However, both in financial and betting markets, profitable forecasting models have existed during periods of market inefficiency, but they require extensive modeling innovations [7].

Candlestick charting originated in the Japanese rice futures market in the 18th century. It provides a visual aid for looking at data differently and forecasting near-term equity price movements, and then developing insight into market psychology. Nowadays, Japanese candlestick theory is one of the most widely used technical analysis techniques based on empirical models for investment decisions. The trend of a financial time series is assumed to be predictable by recognizing specific candlestick patterns.

A candlestick is produced from the opening, highest, closing and lowest prices over a given time interval. Each candlestick includes both a body and a wick that extends either above or below the body. Figure 1 illustrates the candlestick line. The body is shown as a box to represent the difference between the opening and closing prices, and the wick is shown as a line to represent the highest and lowest price range between the opening and closing. The body is filled with either black or white color, according to whether the opening price is above or below the closing price, respectively. In some particular time intervals, the highest/lowest price is marked by the top/bottom of the body, so a candlestick may or may not have a wick.


Candlestick charts provide a visualized interface that makes it easy for experienced chartists to identify patterns. In the past decade, this analysis technique has been extended to other fields, such as predicting teens' stress level changes on a micro-blog platform [8], and sports metrics [9] for forecasting game outcomes. However, graphic patterns, such as the size of the body and the relative position of two successive candlesticks, are hard to represent. Some researchers have proposed utilizing fuzzy logic to solve this problem [10-12].

The sports metric candlestick charts provide simple graphics of game outcomes relative to the gambling line, as proposed by Mallios [9]. As with the candlestick chart used in financial equity price analysis, each sports metric candlestick includes both a body and a wick that extends either above or below the body. But the open, high, close and low prices, which constitute the body and wick of the candlestick chart in finance, are not appropriate for sports. For sports metrics, the candlestick charts are composed of the winning/losing margin, the total points scored, and their corresponding gambling lines. Figure 2 illustrates the sports metric candlestick. The body of the candlestick is determined by the winning/losing margin, denoted D, and the gambling line on the winning/losing margin, denoted LD, for a certain team. If D > LD, the body's color is white, and the body's maximum and minimum values are defined by D and LD. If LD > D, the body's color is black, and the body's maximum and minimum values are defined by LD and D. The length of the candlestick wick is determined by the gambling shock of the line on total points scored, denoted GST. GST is calculated as the difference between the total points scored in the game and the corresponding line on total points scored. If GST > 0, the wick extends above the body, and below the body when GST < 0. There is no wick when GST = 0.

The lengths of the wicks and the body reflect the price fluctuation during a time interval, which is considered a critical characteristic for candlestick pattern recognition. In traditional technical analysis, the size of a candlestick line is classified as short, medium, or long.

Y.-C. Hsu / Forecasting National Football League Game Outcomes 25

To describe the lengths of the wicks and the body more appropriately, four fuzzy linguistic variables are adopted: Very Short, Short, Long, and Very Long. Figure 3 illustrates the membership functions of these linguistic variables.

Two types of membership functions are adopted to define the linguistic variables: a linear function for Very Long and Very Short, and a triangular function for Short and Long. In Figure 3, the x-axis denotes the real length of the body or wick, normalized to the scale [0, 1]. In this study, the input values are evaluated through the membership functions after the lengths of the bodies or wicks are rescaled to [0, 1] by min-max normalization.
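A minimal sketch of this fuzzification step is given below, assuming evenly spaced breakpoints at 1/3 and 2/3 on the normalized scale; the paper does not list the exact parameters of its membership functions, so those breakpoints are an assumption.

```python
def minmax_normalize(x, x_min, x_max):
    # Rescale a raw body/wick length into [0, 1].
    return (x - x_min) / (x_max - x_min) if x_max > x_min else 0.0

def tri(x, a, b, c):
    # Triangular membership with feet a, c and peak b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def length_memberships(x):
    """Membership degrees of a normalized length x in the four labels:
    linear functions for VeryShort/VeryLong, triangles for Short/Long."""
    return {
        "VeryShort": max(0.0, 1.0 - 3.0 * x),        # linear, 1 at x = 0
        "Short": tri(x, 0.0, 1.0 / 3.0, 2.0 / 3.0),  # triangular
        "Long": tri(x, 1.0 / 3.0, 2.0 / 3.0, 1.0),   # triangular
        "VeryLong": max(0.0, 3.0 * x - 2.0),         # linear, 1 at x = 1
    }
```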

The size of a candlestick line only reflects the characteristics of the price fluctuation during a single time interval, which is not enough to model valuable candlestick patterns. In order to capture the characteristics of the trend that follows a candlestick, the relationship between two adjacent candlestick lines should also be considered. Relative to the previous candlestick line, the positions of the opening and closing prices are used to model the open style and the close style. Five linguistic variables, Low, Equal Low, Equal, Equal High, and High, are defined to represent the open and close styles. Figure 4 shows the membership functions of these linguistic variables. The x-axis denotes prices in the previous time interval, and the y-axis the membership value. The parameters of the membership functions depend on the previous candlestick line, which is illustrated at the bottom of Figure 4.

The candlestick charts are characterized with fuzzy linguistic variables by applying the maximum-membership method: when more than one fuzzy set matches a single crisp value, the fuzzy set with the maximum membership value is selected. Table 1 shows an example of a fuzzy candlestick pattern from time t-i to t for forecasting the next game outcome.
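The maximum-membership selection can be sketched as a generic helper; the helper name and the two-label demonstration universe below are illustrative, not from the paper.

```python
def fuzzify_max(x, membership_funcs):
    """Return the linguistic label whose membership degree is largest for
    the crisp value x; ties are broken by insertion order of the dict."""
    degrees = {label: f(x) for label, f in membership_funcs.items()}
    return max(degrees, key=degrees.get)

# Illustrative two-label universe on [0, 1].
demo_funcs = {
    "Low": lambda x: max(0.0, 1.0 - x),   # decreasing membership
    "High": lambda x: min(1.0, x),        # increasing membership
}
```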

To mine the candlestick pattern rules for forecasting the next game outcome, we extract the historical data, consisting of the point spread line, the total points line, the actual box score, and the outcome, at times t, t-1, ..., t-i. We then translate these data into candlestick chart entities and symbolize the time series by fuzzification. The fuzzy candlestick patterns are then recognized with the random forests algorithm to obtain the optimal decision tree. Finally, the next game outcomes are predicted using the optimal decision tree.
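Before the random forests step, each symbolized pattern must become a numeric feature vector. A sketch of that encoding is below; the ordinal integer codes are an illustrative choice (the paper does not specify one), and the 7 features per time frame match the "7 input variables" of Table 2. The resulting vectors can then be fed to any random forest implementation, e.g. scikit-learn's RandomForestClassifier.

```python
# Ordinal codes for the linguistic labels (illustrative, not from the paper).
LENGTH = {"VeryShort": 0, "Short": 1, "Long": 2, "VeryLong": 3}
COLOR = {"Black": 0, "White": 1}
STYLE = {"Low": 0, "EqualLow": 1, "Equal": 2, "EqualHigh": 3, "High": 4}
OUTCOME = {"Lose": 0, "Win": 1}

def encode_pattern(rows):
    """Flatten fuzzy candlestick rows (times t-i .. t) into one feature
    vector with 7 features per time frame."""
    vec = []
    for r in rows:
        vec += [LENGTH[r["body"]], LENGTH[r["upper"]], LENGTH[r["lower"]],
                COLOR[r["color"]], STYLE[r["open"]], STYLE[r["close"]],
                OUTCOME[r["outcome"]]]
    return vec
```

With one time frame this gives 7 input variables, as in the first row of Table 2; with two frames (t-1, t) it gives 14, as in the second.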

Figure 4. The membership function of the linguistic variables of the open style and close style

Table 1. Example of fuzzy candlestick pattern

Time frame | Body length | Upper wick length | Lower wick length | Body color | Open style | Close style | Outcome
t-i        | Short       | VeryShort         | VeryShort         | Black      | EqualHigh  | Low         | Win
…
t          | VeryLong    | Long              | Short             | White      | EqualLow   | Equal       | Lose

To demonstrate the effectiveness of forecasting game outcomes, we use NFL data for the 2011-2012 season gathered from covers.com. We choose that season's Super Bowl champion, the New York Giants, as the team for the empirical study. The data cover the regular season and the post-season games of the year. In total, the New York Giants played 20 games that year, including 17 regular-season games from week 1 to week 17 and 4 post-season games: the Wildcard, Divisional, Conference, and Super Bowl games. The data are divided into two sets according to the NFL season: the regular-season data form the training set, and the play-off and Super Bowl data form the testing set. The candlestick pattern rules are mined from the regular-season data and used to forecast the post-season outcomes.

The effect of the prediction is evaluated based on three performance measurements, precision, recall, and F-measure, which are widely used in data mining. The formulas are shown in Eqs. (1)-(3).

\text{precision} = \frac{TP}{TP + FP} \times 100\%   (1)

\text{recall} = \frac{TP}{TP + FN} \times 100\%   (2)

F\text{-measure} = 2 \times \frac{\text{recall} \times \text{precision}}{\text{recall} + \text{precision}}   (3)

where TP, FP, and FN denote true positive, false positive, and false negative.
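Eqs. (1)-(3) amount to the following small helper (precision and recall are returned as fractions rather than percentages):

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from true positive, false positive
    and false negative counts, per Eqs. (1)-(3)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * recall * precision / (recall + precision)
                 if recall + precision else 0.0)
    return precision, recall, f_measure
```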


The empirical results of the forecasting are presented in Table 2. The results reveal that the precision, recall, and F-measure of the outcome prediction for wins are extremely high. This may be due to the small sample size, which is an innate limitation of sports outcome forecasting: most NFL teams play only about 20 games in one season, so it is reasonable that only 17 samples are used for training and the remaining 4 for testing. In fact, the New York Giants won all 4 post-season games, including the Super Bowl.

Table 2. The results of prediction

Time frame of the input data | Number of input variables | Outcome prediction | Precision | Recall | F-measure
t                            | 7                         | Win                | 100%      | 100%   | 1
                             |                           | Lose               | 0%        | 0%     | 0
t-1, t                       | 14                        | Win                | 100%      | 75%    | 0.857
                             |                           | Lose               | 0%        | 0%     | 0

5. Conclusion

This paper proposed a forecasting model to predict the champion of the NFL Super Bowl. The model combines the advantages of candlestick chart analysis for financial time series with pattern recognition techniques, applying fuzzy sets and the random forests algorithm. Unlike most sports forecasting models, which focus on the athletes' performance, we adopt betting market data in order to account for the psychology and behavior of the market makers and sports fans. The original betting market data are transformed into candlestick charts, characterized by fuzzification, and then classified to find the implicit patterns for forecasting. Empirical results show that this idea is feasible and obtains acceptable prediction accuracy.

References

[1] R. D. Baker, I. G. McHale, Forecasting exact scores in National Football League games, International

Journal of Forecasting, 29 (2013), 122-130.

[2] W. H. Dare, S. A. Dennis, R. J. Paul, Player absence and betting lines in the NBA, Finance Research

Letters, 13 (2015), 130-136.

[3] M. Lewis, Moneyball: The Art of Winning an Unfair Game, W. W. Norton & Company, New York, 2003.

[4] D. Paton, L. V. Williams, Forecasting outcomes in spread betting markets: can bettors use ‘quarbs’ to

beat the book, Journal of Forecasting, 24 (2005), 139-154.

[5] S. D. Levitt, Why are gambling markets organized so differently from financial markets?, The Economic Journal, 114 (2004), 223-246.

[6] L. V. Williams, Information efficiency in betting markets: A survey, Bulletin of Economic Research, 51

(1999), 1-39.

[7] W. S. Mallios, Forecasting in Financial and Sports Gambling Markets. Wiley, New York, 2011.

[8] Y. Li, Z. Feng, L. Feng, Using candlestick charts to predict adolescent stress trend on micro-blog,

Procedia Computer Science, 63 (2015), 221-228.

[9] W. Mallios, Sports Metric Forecasting, Xlibris Corporation, 2014.

[10] C.-H. L. Lee, A. Liu, W.-S. Chen, Pattern discovery of fuzzy time series for financial prediction, IEEE

Transactions on Knowledge and data Engineering, 18 (2006), 613-625.

[11] Q. Lan, D. Zhang, L. Xiong, Reversal pattern discovery in financial time series based on fuzzy

candlestick lines, Systems Engineering Procedia, 2 (2011), 182-190.

[12] P. Roy, S. Sharma, M. K. Kowar, Fuzzy candlestick approach to trade S&P CNX NIFTY 50 index

using engulfing patterns, International Journal of Hybrid Information Technology, 5 (2012), 57-66.

28 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-28

A Fuzzy Control Based Parallel Filling Valley Equalization Circuit

Feng RAN a, Ke-Wei HU b, Jing-Wei ZHAO b and Yuan JI a,1

a Department of Microelectronics Center, Shanghai University, Shanghai, China
b School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, China

Abstract. Aiming at the problems of high cost and slow equalization speed in traditional circuits, a parallel filling valley equalization circuit based on fuzzy control is proposed in this paper. A fuzzy controller suitable for the circuit is designed. The average voltage, the voltage range, and the balancing electric quantity of the battery pack are modeled by a fuzzy model. Fuzzy reasoning and defuzzification are performed to optimize the circuit control logic, which can adapt to the nonlinearity of the battery pack and the uncertainty of the battery parameters. The simulation and experimental results show that, during charging and discharging, the fuzzy control based parallel filling valley equalization circuit equalizes quickly and efficiently, which can improve the use efficiency of the battery pack.

utilization, lithium battery

Introduction

With continuing environmental pollution and the depletion of oil, the structure of vehicle energy systems has become a hot issue of global concern and research [1].

In recent years, people are committed to the development of safe, efficient and clean

transport. The electric vehicle represents the development direction of the new

generation of environmentally friendly vehicles. As the power source of electric

vehicles, the power battery directly affects the use of electric vehicles [2]. The lithium

battery is one of the best choices for the power source of electric vehicle because of its

advantages, such as the high voltage, low self-discharge rate, high efficiency and

environmental protection [3]. However, during production, long-term storage, and repeated charge and discharge cycles, the differences in charge among lithium battery cells grow, the dispersion within the battery pack increases, and the performance of individual cells degrades, eventually leading to the failure of the whole battery pack [4]. Therefore, battery equalization is an

indispensable technology to ensure the safety of the battery and extend the service life

of the battery pack [5]. Battery equalization can be roughly divided into active and passive equalization [6-7]. Active balancing does not consume the battery energy in the process and has become a hot research topic today [8]. In the active balance scheme, the

1

Corresponding author: Yuan JI, Department of Microelectronics Center, Shanghai University,

Shanghai, China; E-mail: jiyuan@shu.edu.cn.

F. Ran et al. / A Fuzzy Control Based Parallel Filling Valley Equalization Circuit 29

highest-energy cell of the battery pack transfers energy to the lowest-energy cell through a converter. Super capacitor equalization, inductance equalization, and converter equalization are the most common ways to achieve parallel filling valley equalization [9]. However, problems of active equalization still need to be solved urgently, such as the high cost, the complex control circuit, and the slow equalization speed.

At this stage, research on battery equalization technology mainly includes two aspects. One is the equalization strategy [10]: how to build a common evaluation system for the battery pack and obtain the basis of the equalization control strategy. The other is the design of the equalization circuit topology [11], which mainly concerns the hardware implementation. For these two aspects, researchers have put forward many different equalization solutions. Tian et al. [12] proposed an energy-tap charging and discharging equalization control strategy but did not give a specific implementation of the scheme. Wu et al. [13] proposed that SOC-based equalization can effectively eliminate the inconsistency of the battery; but because the accuracy of SOC estimation is not guaranteed, it is only suitable for offline equalization. Fu et al. [14] proposed a control strategy that takes the battery voltage as the equalization criterion, with the goal of achieving relative consistency of the SOC of single cells. It is widely used because of its clear goal and simple control, but its ability to deal with nonlinear problems needs to be improved.

Generally, the lithium battery shows nonlinear characteristics. In order to maintain good system stability and fast balancing speed in different environments with uncertain parameters, this paper proposes a parallel filling valley equalization scheme based on fuzzy control, in which a fuzzy controller is used to optimize the balancing strategy. Simulation results show that the balancing speed and efficiency of the proposed parallel filling valley equalization scheme are improved compared with the traditional fly-back filling valley control strategy. Thirteen general e-bike lithium batteries (rated 48 V) connected in series were used for charging and discharging experiments. The experimental results show that, with the fuzzy control based parallel filling valley equalization strategy, the voltage difference between the lithium batteries converges to less than 10 mV even when large voltage differences exist initially within the battery pack.

For the battery pack it is difficult to determine all related parameters with a precise mathematical model. By using the fuzzy control method, the system can make reasonable decisions under uncertain or imprecise conditions. In this paper, the fuzzy control technique is used to adjust the equalization current and time. The fuzzy logic system is of the Sugeno type. The Sugeno method is computationally efficient and works well with optimization and adaptive techniques, which makes it very attractive in control problems, particularly for dynamic nonlinear systems. The inference from the inputs to the output is realized by a set of inference rules mastered in advance. A typical fuzzy control system is composed of a rule base, a database, an inference engine, a fuzzification unit, and a defuzzification unit. Figure 1 shows the structure of the filling valley equalization fuzzy controller, a typical two-input, one-output fuzzy control system.


The controller adjusts the equalization electricity quantity of the battery cell by controlling the equalization current and time. The rule base collects the control rules that describe the battery equalization control algorithm. The database stores the data that have been mastered.

The fuzzy controller has two inputs, the average voltage (AV) and the voltage difference (VD) of the battery pack. The output is the equalization electricity quantity (QBAL). As the inputs of the fuzzy controller, the values of AV and VD are transformed into the fuzzy languages μ1(x) and μ2(x) by the fuzzification process. The inference engine then generates the language control logic μ0(z) according to the pre-established rule base and the input fuzzy languages. Finally, the language control logic is transformed into the control output signal QBAL by the defuzzification process. The relationship among the equalization electricity quantity QBAL, the equalization current IBAL, and the equalization time TBAL can be expressed as:

Q_{BAL} = I_{BAL} \times T_{BAL}   (1)

Figure 2. Membership functions for the average voltage


Figure 2 and Figure 3 show the input/output membership functions of the filling valley equalization fuzzy controller. The equalization current and the equalization time are determined from the measured average voltage AV and voltage range VD. The triangular function is chosen as the membership function of the average voltage (AV) and the voltage difference (VD) because it is easier to calculate than other membership functions. The average voltage (AV) is divided into 5 fuzzy subsets covering the domain [2.7 V, 4.2 V]: average very large (AVL), average large (AL), average medium (AM), average small (AS), and average very small (AVS). The input variable voltage difference VD is also divided into 5 fuzzy subsets covering the domain [0 mV, 20 mV]: difference very large (DVL), difference large (DL), difference medium (DM), difference small (DS), and difference very small (DVS). The system treats the input as 20 mV when the voltage range is greater than 20 mV. The output variable equalization electricity QBAL is divided into the subsets VL (very large), L (large), M (medium), S (small), and VS (very small). Figures 2 and 3 show the membership functions on the horizontal coordinates AV, VD, and QBAL. For example, when AV is 3.4 V, it is one hundred percent in the S membership and zero percent in the M and VS memberships. Figure 4 shows the control surface of the fuzzy controller, from which the relationship among AV, VD, and the balance capability can be seen. The rule base is described in Table 1.

Table 1. The fuzzy rule base

Output | DVS | DS  | DM  | DL  | DVL
AVS    | OVS | OVS | OS  | OM  | OL
AS     | OVS | OS  | OM  | OL  | OVL
AM     | OVS | OM  | OVL | OVL | OVL
AL     | OVS | OS  | OM  | OL  | OVL
AVL    | OVS | OVS | OS  | OM  | OL
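The two input partitions can be sketched as evenly spaced triangles. The even spacing is an assumption (the paper does not give exact breakpoints), but the domains [2.7 V, 4.2 V] and [0 mV, 20 mV] and the clamping of VD at 20 mV follow the text.

```python
def partition(x, lo, hi, n=5):
    """Membership degrees of x in n evenly spaced triangular fuzzy subsets
    covering [lo, hi]; inputs outside the domain are clamped (the paper
    treats any VD above 20 mV as 20 mV)."""
    x = min(max(x, lo), hi)
    step = (hi - lo) / (n - 1)
    return [max(0.0, 1.0 - abs(x - (lo + k * step)) / step)
            for k in range(n)]

AV_LABELS = ["AVS", "AS", "AM", "AL", "AVL"]   # average voltage, [2.7 V, 4.2 V]
VD_LABELS = ["DVS", "DS", "DM", "DL", "DVL"]   # voltage difference, [0, 20] mV
```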


Figure 3. Membership functions for voltage difference and balancing electric quantity


Table 1 shows that the fuzzy control system has a total of 25 rules, written R1, R2, ..., R25. The fuzzy rules can be expressed as

R1: (AVS and DVS) → OVS
R2: (AVS and DS) → OVS
R3: (AVS and DM) → OS
...
R25: (AVL and DVL) → OL   (2)

In the theory of fuzzy control there are many kinds of operations, so many choices exist in practical applications. According to the demands of the design, the operation rules of the filling valley equalization fuzzy controller are as follows: the fuzzy connective "and" takes the minimum value ("min"), and the connective "or" takes the maximum value ("max"). The implication relation uses "min", the output synthesis uses "max", and the centroid method is used for output defuzzification. All of the above rules can be combined as:

R = R_1 \cup R_2 \cup \cdots \cup R_{25} = \bigcup_{i=1}^{25} R_i   (3)

When the exact values of AV and VD are known, the fuzzy quantity of the output QBAL is given by

\mu_o(Q_{BAL}) = (AV \times VD) \circ R   (4)

\mu_o(Q_{BAL}) = (AV \times VD) \circ \bigcup_{i=1}^{25} R_i = (AV \times VD) \circ \bigcup_{i=1}^{25} (A_i \text{ and } B_i \to C_i)   (5)

Because the "min" method is used in the calculation of the implication relation,

\mu_o(Q_{BAL}) = \bigvee_{i=1}^{25} \left\{ \left[ AV \circ (A_i \to C_i) \right] \wedge \left[ VD \circ (B_i \to C_i) \right] \right\}   (6)

Finally, the output fuzzy variable is converted to a crisp value by the defuzzification module:

Q_{BAL} = \frac{\int Q_{BAL}\,\mu_o(Q_{BAL})\,dQ_{BAL}}{\int \mu_o(Q_{BAL})\,dQ_{BAL}}   (7)
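Putting Eqs. (2)-(7) together, a discretized sketch of the min-max inference and centroid defuzzification is shown below. The 25 rules follow Table 1; the triangular output subsets on a normalized [0, 1] universe and the discretization grid are our assumptions, not the paper's.

```python
# (AV subset index, VD subset index, output subset index), row by row
# from Table 1: AVS, AS, AM, AL, AVL over DVS..DVL.
RULES = [
    (0, 0, 0), (0, 1, 0), (0, 2, 1), (0, 3, 2), (0, 4, 3),
    (1, 0, 0), (1, 1, 1), (1, 2, 2), (1, 3, 3), (1, 4, 4),
    (2, 0, 0), (2, 1, 2), (2, 2, 4), (2, 3, 4), (2, 4, 4),
    (3, 0, 0), (3, 1, 1), (3, 2, 2), (3, 3, 3), (3, 4, 4),
    (4, 0, 0), (4, 1, 0), (4, 2, 1), (4, 3, 2), (4, 4, 3),
]

def centroid_inference(av_deg, vd_deg, n_out=5, grid=101):
    """av_deg, vd_deg: membership degrees of AV and VD in their 5 subsets.
    Each rule fires with strength min(av, vd); the "min" implication clips
    the fired output subset, the fired outputs are combined with "max",
    and the centroid of the result is returned as a crisp Q_BAL in [0, 1]."""
    step = 1.0 / (n_out - 1)
    num = den = 0.0
    for k in range(grid):
        q = k / (grid - 1)
        mu = 0.0
        for i, j, o in RULES:
            strength = min(av_deg[i], vd_deg[j])            # "and" -> min
            out = max(0.0, 1.0 - abs(q - o * step) / step)  # output triangle
            mu = max(mu, min(strength, out))                # min implication, max synthesis
        num += q * mu
        den += mu
    return num / den if den else 0.0
```

For example, a small voltage difference (DVS) should yield a small equalization quantity and a very large difference (DVL) a large one, which the sketch reproduces.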

The maximum equalization current Ieq_max and the equalization current IBAL are given by

I_{eq\_max} = \frac{(U_{dc} - U_M)(U_0 + U_{dio}) L_P T K}{2\left[(U_{dc} - U_M) L_P L_{SK} + (U_0 + U_{dio}) L_P L_x\right]}   (8)

I_{BAL} = \min\left( \frac{Q_{BAL}}{T_{BAL\_MIN}},\; I_{eq\_max} \right)   (9)

Then the PWM duty cycle V and the equalization time TBAL are calculated:

V = \frac{2 I_{BAL} (U_{min} + U_{dio}) (L_p + L_x)}{(U_{dc} - U_M)^2 L_p K T}   (10)

T_{BAL} = \min\left( \frac{Q_{BEC}}{I_{BEC}},\; T_{BAL\_MAX} \right)   (11)

where TBAL_MAX denotes the maximum equalization time, here TBAL_MAX = 10 s. TBAL_MAX limits the length of the equalization time, in case the charging and discharging voltage of the battery pack changes too much within the period.
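Eqs. (9) and (11) reduce to two min operations. The sketch below uses Q_BAL and I_BAL where Eq. (11) writes Q_BEC and I_BEC, which we read as the remaining balance quantity and current; that reading is an assumption.

```python
def balance_current_and_time(q_bal, i_eq_max, t_bal_min, t_bal_max=10.0):
    """Cap the equalization current at the circuit limit I_eq_max (Eq. (9))
    and the equalization time at T_BAL_MAX, 10 s in the paper (Eq. (11))."""
    i_bal = min(q_bal / t_bal_min, i_eq_max)
    t_bal = min(q_bal / i_bal, t_bal_max) if i_bal > 0 else 0.0
    return i_bal, t_bal
```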


The simulation of battery charging and discharging is carried out in MATLAB, and the ordinary fly-back control is compared with the fuzzy control on the basis of the simulation results.

Figure 5 shows the lithium battery charging and discharging process in the MATLAB simulation. The battery pack has 13 cells in series with different initial voltages, to which the 13 curves of different colors in the figure correspond. According to the differences between the control algorithms and the charge/discharge processes, the figure is divided into Fig. 5A, Fig. 5B, Fig. 5C, and Fig. 5D. The horizontal axis shows the simulation time, and the vertical axis shows the cell voltages of the battery pack. Fuzzy control is used in Fig. 5B and Fig. 5D, and ordinary fly-back control in Fig. 5A and Fig. 5C. In the charging process, the battery pack needs 190 min to reach the energy balance with ordinary control (Figure 5A), while the fuzzy controller achieves the energy balance in only 120 min (Figure 5B). In the static discharge process, the battery pack needs 232 min to reach the energy balance (Figure 5C), while in Figure 5D the equilibrium state is reached in only 150 min. The simulation results show that the fuzzy control based parallel filling valley equalization strategy has a faster equalization speed than the normal fly-back control.

The battery charge and discharge experiments are carried out against the background of filling valley equalization fuzzy control. The initial voltages of the cells vary from 2.9 V to 3.4 V.


The charging experiment uses constant-current charging first and then constant-voltage charging. The full charging process is shown in Figure 6. In order to see the equalization time clearly, Figure 7 shows the first 50 min of the equalizing charge. It can be seen from the figure that the battery pack moves from the initial state of disequilibrium into equilibrium in only 30 min, a remarkable improvement over general equalization technology. As shown in Figure 8, the equalizing discharge process can balance the power of batteries with different initial voltages and bring the batteries into equalization.

Table 2. Experimental results of balanced charging and discharging

Battery pack parameters        | Balanced discharge | Balanced charge
Voltage range before balancing | 535.2070 mV        | 117.0849 mV
Voltage range after balancing  | 8.9998 mV          | 9.7893 mV
Time to reach balance          | about 30 min       | about 98 min
Charge and discharge time      | 237 min            | 160 min

From Table 2, the experimental results show that the fuzzy control based parallel filling valley equalization circuit clearly reduces the voltage difference and performs well both in handling the nonlinear problem and in equalization speed. However, the fuzzy rules and database used in a realistic fuzzy control process need adequate accuracy to be reliable; better rules and inference processes will be sought in later research.

3. Conclusion

The proposed fuzzy control based parallel filling valley equalization circuit reaches equalization quickly and handles the nonlinear problem well compared with traditional circuits. With the development of electric vehicles, high-quality cell equalization is required. Lossless equalization can transfer energy between different batteries without wasting it, and filling valley equalization is one of the schemes of lossless equalization. However, how to improve the energy flow efficiency and correct the imbalance in multi-string-parallel battery packs should be considered in future research on lossless equalization. In addition, the equalization circuit should be as succinct as possible; how to reduce the size of the chip and enhance the applicability deserves attention in further study.

References

[1] E. Kim, K. G. Shin, J. Lee. Real-time battery thermal management for electric vehicles. Cyber-Physical

Systems (ICCPS). Berlin: IEEE, (2014):72-83.

[2] C. L. Wey, P. C. Jui. A unitized charging and discharging smart battery management system. Connected

Vehicles and Expo (ICCVE). Las Vegas: IEEE, (2012):903-909.

[3] B. B. Qiu, H. P. Liu, J. L. Yang, et al. An active balance charging system of lithium iron phosphate

power battery group, Advanced Technology of Electrical Engineering and Energy, 2014.

[4] J. Cao, N. Schofield, A. Emadi. Battery Balancing Methods: A Comprehensive Review. Vehicle Power

and Propulsion Conference (VPPC). Harbin: IEEE, (2008):1-6

[5] B. T. Kuhn, G. E. Pitel, P. T. Krein, et al. Electrical properties and equalization of lithium-ion cells in

automotive applications. Vehicle Power and Propulsion Conference (VPPC): IEEE, 2005

[6] B. Lindemark. Individual cell voltage equalizers (ICE) for reliable battery performance. Telecommunications Energy Conference: INTELEC, (1991):196-201

[7] A. Baughman, M. Ferdowsi. Analysis of the Double-Tiered Three-Battery Switched Capacitor Battery

Balancing System. Vehicle Power and Propulsion Conference (VPPC). Harbin: IEEE, (2006):1-6

[8] W. G. Ji, X. Lu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with

synchronous phase-shift control. Annual IEEE Applied Power Electronics Conference and Exposition

(APEC). Long Beach: IEEE, 331(2013):1152-1157

[9] M. Daowd, N. Omar, DBP Van, et al. Passive and Active Battery Balancing comparison based on

MATLAB Simulation. IEEE Vehicle Power and Propulsion Conference (VPPC). Chicago, IL: IEEE,

(2011):1-7

[10] H. R. Liu, S. H. Zhang, et al. Lithium-ion battery charge and discharge equalizer and balancing strategy. Transactions of China Electrotechnical Society, 16(2015):186-192.

[11] W. G. Ji, X. Liu, Y. Ji, et al. Low cost battery equalizer using buck-boost and series LC converter with

synchronous phase-shift control. In 2013 28th Annual IEEE applied Power Electronics Conference and

Exposition (APEC). Long Beach. CA, USA, (2013): 1152-1157

[12] R. Tian, D. T. Qin, M. H. Hu, et al. Research on battery equalization balance strategy. Journal of

Chongqing University (Nature Science Edition), (2005):1-4

[13] Y. Y. Wu, H. Liang. Research on electric vehicle battery equalization method. Automotive Engineering,

(2004): 384-385.

[14] J. J. Fu, B. J. Wu, H. J. Wu, et al. Dynamic bidirectional equalization system to a vehicle hang-ion

battery weave. China Measurement Technology, (2005): 10-11.

Fuzzy Systems and Data Mining II 37

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-37

Interval-Valued Hesitant Fuzzy Geometric Bonferroni Mean Aggregation Operator

Xiao-Rong HE a,1, Ying-Yu WU a, De-Jian YU b, Wei ZHOU c and Sun MENG c

a School of Economics and Management, Southeast University, Nanjing, China
b School of Information, Zhejiang University of Finance and Economics, Hangzhou, China
c Yunnan University of Finance and Economics (YNFE), Kunming, China

Abstract. The hesitant fuzzy set (HFS) is one of the most commonly used techniques for expressing a decision maker's subjective evaluation information. The interval-valued hesitant fuzzy set (IVHFS) is an extension of the HFS and can reflect our intuition more objectively. In this paper we focus on IVHF information aggregation methods based on the Bonferroni mean (BM). We propose the IVHF geometric BM (IVHFGBM) operator and weighted IVHFGBM operators. Numerical examples are designed to show their effectiveness. The desirable properties of the weighted IVHFGBM operator are also discussed in detail. These operators can be applied in many areas, especially in decision making problems.

Introduction

There are various methods available for decision making. One common feature of decision making methods is the use of information aggregation techniques [1-7]. Using an information aggregation operator in decision making, we can obtain the comprehensive performance values of the alternatives, which are used to compare them; the alternative with the biggest comprehensive performance value is the best option. The Bonferroni mean (BM) [8-10] is a widely used technique in the information aggregation and decision making area. At present, it has been extended to the interval-valued uncertainty environment, the intuitionistic fuzzy (IF) environment, the interval-valued intuitionistic fuzzy (IVIF) environment, the fuzzy environment, the uncertain linguistic fuzzy environment, and the hesitant fuzzy environment.

However, we found that the BM operator cannot be used to aggregate interval-valued hesitant fuzzy information [11], which is the research focus of this paper. In the rest of this paper, we first review the basic concept of the interval-valued hesitant fuzzy set (IVHFS) and then extend the BM to the interval-valued hesitant fuzzy environment. Numerical examples are presented to better understand these BM-based interval-valued hesitant fuzzy information aggregation methods.

1

Corresponding Author: Xiao-Rong HE, School of Economics and Management, Southeast University,

Nanjing, China; E-mail: shelley526@126.com.

38 X.-R. He et al. / Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operator

1. Preliminaries

In this section, a briefly review of the interval-valued hesitant fuzzy set (IVHFS) is

presented.

Definition 1 [11]. Let X be a referenced set. An IVHFS on X can be represented

as the following mathematical form:

E { x, f E ( x) !| x X } (1)

element x to the set E.

The IVHFS has a strong practical value in situations where the membership degree is difficult to determine. For example, a patient with regular abdominal pain goes to the hospital and consults three doctors independently. After learning about the illness, the first doctor thinks the possibility that the patient has a stomachache is [0.6, 0.7]. The second doctor believes this possibility is [0.1, 0.2] and that the patient is likely to suffer from another disease. The opinion of the third doctor is similar to the first doctor's, with possibility [0.7, 0.8]. In this case, the possibility that the patient has a stomachache can be represented by the interval-valued hesitant fuzzy element (IVHFE) {[0.6, 0.7], [0.1, 0.2], [0.7, 0.8]}. Obviously, other kinds of extended fuzzy set theory cannot deal with this case effectively. Furthermore, the IVHFE is the basic element of the IVHFS.

For IVHFEs, Chen et al. [11] defined the following operations and comparison rules.

Definition 2. Let h = \bigcup_{\gamma \in h} \{[\gamma^L, \gamma^U]\}, h_1 = \bigcup_{\gamma_1 \in h_1} \{[\gamma_1^L, \gamma_1^U]\} and h_2 = \bigcup_{\gamma_2 \in h_2} \{[\gamma_2^L, \gamma_2^U]\} be three IVHFEs, and let \lambda be a real number bigger than 0. Then the operations are defined as follows.

1) h^\lambda = \bigcup_{\gamma \in h} \left\{ \left[ (\gamma^L)^\lambda, (\gamma^U)^\lambda \right] \right\}

2) \lambda h = \bigcup_{\gamma \in h} \left\{ \left[ 1 - (1 - \gamma^L)^\lambda, 1 - (1 - \gamma^U)^\lambda \right] \right\}

3) h_1 \oplus h_2 = \bigcup_{\gamma_1 \in h_1, \gamma_2 \in h_2} \left\{ \left[ \gamma_1^L + \gamma_2^L - \gamma_1^L \gamma_2^L, \; \gamma_1^U + \gamma_2^U - \gamma_1^U \gamma_2^U \right] \right\}

4) h_1 \otimes h_2 = \bigcup_{\gamma_1 \in h_1, \gamma_2 \in h_2} \left\{ \left[ \gamma_1^L \gamma_2^L, \; \gamma_1^U \gamma_2^U \right] \right\}
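The four operations of Definition 2 can be sketched directly, representing an IVHFE as a list of [lower, upper] pairs (an illustrative encoding):

```python
def ivhfe_power(h, lam):
    # h ** lambda, applied interval-wise (operation 1).
    return [[lo ** lam, hi ** lam] for lo, hi in h]

def ivhfe_scale(h, lam):
    # lambda * h (operation 2).
    return [[1 - (1 - lo) ** lam, 1 - (1 - hi) ** lam] for lo, hi in h]

def ivhfe_add(h1, h2):
    # h1 (+) h2 over all interval pairs (operation 3).
    return [[a + c - a * c, b + d - b * d] for a, b in h1 for c, d in h2]

def ivhfe_mul(h1, h2):
    # h1 (x) h2 over all interval pairs (operation 4).
    return [[a * c, b * d] for a, b in h1 for c, d in h2]
```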

`

, J U º¼ , ʶh is the number of the elements

in h .

1 1 § L JU J L · 1 §J L JU ·

S (h ) ¦ J

# h J h

¦ ¨J

# h J h © 2 ¹

¸ ¦ ¨

# h J h © 2 ¹

¸

(2)

X.-R. He et al. / Interval-Valued Hesitant Fuzzy Geometric BM Aggregation Operator 39

S (h1 ) ! S (h2 ) , then h1 h2 ; if S (h1 ) S (h2 ) , then h1 h2 .

Example 1. Suppose that h_1 = {[0.5, 0.6], [0.6, 0.7]}, h_2 = {[0.7, 0.8], [0.4, 0.6], [0.7, 0.9]} and h_3 = {[0.5, 0.6]} are three IVHFEs. According to the score function and comparison rules defined in Definition 3, we have

S(h_1) = \frac{1}{2} \left( \frac{0.5 + 0.6}{2} + \frac{0.6 + 0.7}{2} \right) = 0.6

S(h_2) = \frac{1}{3} \left( \frac{0.7 + 0.8}{2} + \frac{0.4 + 0.6}{2} + \frac{0.7 + 0.9}{2} \right) \approx 0.68

S(h_3) = \frac{0.5 + 0.6}{2} = 0.55

Since S(h_2) > S(h_1) > S(h_3), then h_2 \succ h_1 \succ h_3.
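Eq. (2) and the comparison of Example 1 can be checked with a few lines, representing an IVHFE as a list of [lower, upper] pairs:

```python
def score(h):
    """Score of an IVHFE (Eq. (2)): the mean of its interval midpoints."""
    return sum((lo + hi) / 2 for lo, hi in h) / len(h)

# The three IVHFEs of Example 1.
h1 = [[0.5, 0.6], [0.6, 0.7]]
h2 = [[0.7, 0.8], [0.4, 0.6], [0.7, 0.9]]
h3 = [[0.5, 0.6]]
```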

After the concepts of IVHFS and IVHFE were proposed, aggregation operators for IVHFEs were put forward correspondingly, such as the IVHFWA, IVHFWG, IVHFOWA, IVHFOWG, GIVHFWA, GIVHFWG, induced IVHFWA, and induced IVHFWG operators [12-13]. It should be noted that the above IVHF information aggregation operators cannot be used to fuse correlated arguments. On the other hand, the geometric mean (GM) is a common aggregation operator that has been widely used in the information fusion field. Based on the GM, the geometric BM (GBM) operator has been proposed and investigated by some researchers. However, it seems that researchers have not yet investigated the GBM for aggregating IVHFEs, which is the concern of the following studies.

Definition 4. Let $h_j=\bigcup_{\gamma_j\in h_j}\{[\gamma_j^{L},\gamma_j^{U}]\}$ $(j=1,2,\ldots,n)$ be a group of IVHFEs. If

$$\mathrm{IVHFGBM}^{p,q}(h_1,h_2,\ldots,h_n)=\frac{1}{p+q}\Bigg(\bigotimes_{\substack{i,j=1\\ i\neq j}}^{n}\big(p\,h_i\oplus q\,h_j\big)^{\frac{1}{n(n-1)}}\Bigg)\qquad(3)$$

then the function IVHFGBM is called an interval-valued hesitant fuzzy geometric Bonferroni mean operator (IVHFGBM).

Theorem 1. Let $p,q>0$ and $h_j=\bigcup_{\gamma_j\in h_j}\{[\gamma_j^{L},\gamma_j^{U}]\}$ $(j=1,2,\ldots,n)$ be a group of IVHFEs. After using the IVHFGBM operator, the aggregated IVHFE is obtained as follows.

$\mathrm{IVHFGBM}(h_1,h_2,\ldots,h_n)$


$$=\bigcup_{\gamma_i\in h_i,\gamma_j\in h_j}\left\{\left[1-\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-(1-\gamma_i^{L})^{p}(1-\gamma_j^{L})^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}},\;1-\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-(1-\gamma_i^{U})^{p}(1-\gamma_j^{U})^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}}\right]\right\}\qquad(4)$$

[0.37,0.66], [0.68,0.87]}, h2 ={[0.69,0.81]} and h3 ={[0.57,0.69], [0.63,0.77]}. Based

on the IVHFGBM operator, the aggregated IVHFE for h1 , h2 and h3 can be obtained.

Since there are two parameters p and q in the IVHFGBM, the values of p and q may

change the aggregated results to a certain extent. For example,

(1) When p=1, q=10, then

IVHFGBM h1 , h2 , h3

= {[0.5516, 0.6758], [0.5627, 0.6850], [0.5902, 0.7153], [0.6232, 0.7801], [0.4622,

0.6925], [0.4640, 0.7098], [0.6020, 0.7190], [0.6506, 0.7868]}

the score of the aggregated IVHFE is 0.6419.

(2) When p=3, q=7, then

IVHFGBM h1 , h2 , h3

= {[0.5534, 0.6777], [0.5655, 0.6889], [0.5958, 0.7277], [0.62326, 0.7823], [0.4678,

0.6949], [0.4711, 0.7126], [0.6163, 0.7337], [0.6568, 0.7945]}

the score of the aggregated IVHFE is 0.6476.

As can be seen from Definition 4, the IVHFGBM is symmetric with respect to the parameters p and q, the same as the IVHFBM operator. In order to describe this phenomenon figuratively, Figure 1 is provided as follows.
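The symmetry in p and q can also be checked numerically. Below is a sketch of the IVHFGBM aggregation following the closed form of Eq. (4); this is our own illustrative implementation on hypothetical IVHFEs, not code from the paper.

```python
# Sketch of IVHFGBM per the closed form of Eq. (4): for each choice of one
# interval from every IVHFE, combine all ordered pairs (i, j), i != j.
from itertools import product

def ivhfgbm(hs, p, q):
    n = len(hs)
    result = []
    for combo in product(*hs):            # one (gL, gU) interval per IVHFE
        endpoints = []
        for side in (0, 1):               # lower endpoint, then upper endpoint
            prod = 1.0
            for i in range(n):
                for j in range(n):
                    if i != j:
                        gi, gj = combo[i][side], combo[j][side]
                        prod *= (1 - (1 - gi) ** p * (1 - gj) ** q) ** (1 / (n * (n - 1)))
            endpoints.append(1 - (1 - prod) ** (1 / (p + q)))
        result.append(tuple(endpoints))
    return result

# Illustrative IVHFEs (hypothetical values, not from the paper's examples).
hs = [[(0.5, 0.6), (0.6, 0.7)], [(0.69, 0.81)], [(0.57, 0.69), (0.63, 0.77)]]
a = ivhfgbm(hs, 1, 10)
b = ivhfgbm(hs, 10, 1)   # identical to a: the GBM is symmetric in p and q
```

Swapping p and q maps each ordered pair (i, j) to (j, i), so the product over all ordered pairs, and hence the aggregated IVHFE, is unchanged.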

Figure 1. Scores for IVHFEs obtained by the IVHFGBM operator (p ∈ (0, 10), q ∈ (0, 10))


Figure 2 shows the changing trend of the scores for the aggregated IVHFEs based on the IVHFGBM operator when one of the two parameters is fixed.

Definition 5. Let $h_j=\bigcup_{\gamma_j\in h_j}\{[\gamma_j^{L},\gamma_j^{U}]\}$ $(j=1,2,\ldots,n)$ be a group of IVHFEs, and let $w=(w_1,w_2,\ldots,w_n)^{T}$ be the weight vector of $h_j$ $(j=1,2,\ldots,n)$, satisfying $w_i>0$ $(i=1,2,\ldots,n)$ and $\sum_{i=1}^{n}w_i=1$. If

$$\mathrm{IVHFWGBM}^{p,q}(h_1,h_2,\ldots,h_n)=\frac{1}{p+q}\Bigg(\bigotimes_{\substack{i,j=1\\ i\neq j}}^{n}\big(p\,h_i^{w_i}\oplus q\,h_j^{w_j}\big)^{\frac{1}{n(n-1)}}\Bigg)\qquad(5)$$

then IVHFWGBM is called an interval-valued hesitant fuzzy weighted geometric BM operator.

Theorem 2. Let $h_j=\bigcup_{\gamma_j\in h_j}\{[\gamma_j^{L},\gamma_j^{U}]\}$ $(j=1,2,\ldots,n)$ be a group of IVHFEs, and let $w=(w_1,w_2,\ldots,w_n)^{T}$ be the weight vector of $h_j$ $(j=1,2,\ldots,n)$, satisfying $w_i>0$ $(i=1,2,\ldots,n)$ and $\sum_{i=1}^{n}w_i=1$. Then the IVHFWBM and IVHFWGBM operators can be transformed as follows:

$\mathrm{IVHFWBM}(h_1,h_2,\ldots,h_n)$

$$=\bigcup_{\gamma_i\in h_i,\gamma_j\in h_j}\left\{\left[\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-\big(1-(1-\gamma_i^{L})^{w_i}\big)^{p}\big(1-(1-\gamma_j^{L})^{w_j}\big)^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}},\;\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-\big(1-(1-\gamma_i^{U})^{w_i}\big)^{p}\big(1-(1-\gamma_j^{U})^{w_j}\big)^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}}\right]\right\}\qquad(6)$$


$\mathrm{IVHFWGBM}(h_1,h_2,\ldots,h_n)$

$$=\bigcup_{\gamma_i\in h_i,\gamma_j\in h_j}\left\{\left[1-\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-\big(1-(\gamma_i^{L})^{w_i}\big)^{p}\big(1-(\gamma_j^{L})^{w_j}\big)^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}},\;1-\Bigg(1-\prod_{\substack{i,j=1\\ i\neq j}}^{n}\Big(1-\big(1-(\gamma_i^{U})^{w_i}\big)^{p}\big(1-(\gamma_j^{U})^{w_j}\big)^{q}\Big)^{\frac{1}{n(n-1)}}\Bigg)^{\frac{1}{p+q}}\right]\right\}\qquad(7)$$

Example 3. Suppose there are three IVHFEs, $h_1$ = {[0.31,0.45], [0.46,0.71]}, $h_2$ = {[0.34,0.47]} and $h_3$ = {[0.23,0.35], [0.46,0.58], [0.65,0.73]}, and the weight vector of the three IVHFEs is $(0.3, 0.4, 0.3)^{T}$. Based on the IVHFWGBM operator, the aggregated IVHFE can be obtained when the values of p and q are assigned specific numbers. For example, when p = 0.1 and q = 10, then

$\mathrm{IVHFWGBM}(h_1, h_2, h_3)$
= {[0.6512, 0.7377], [0.6829, 0.7648], [0.6831, 0.7650], [0.6528, 0.7390], [0.6861, 0.7672], [0.6864, 0.7673]}

and the score of the aggregated IVHFE is 0.7153.
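The weighted variant can be sketched following our reconstruction of the closed form in Eq. (7). This is an illustrative implementation only: the enumeration order of the output intervals, and hence any printed list, may differ from the paper's, and numeric outputs should be validated against the paper's examples.

```python
# Sketch of IVHFWGBM per the closed form reconstructed in Eq. (7).
from itertools import product

def ivhfwgbm(hs, w, p, q):
    n = len(hs)
    result = []
    for combo in product(*hs):            # one (gL, gU) interval per IVHFE
        endpoints = []
        for side in (0, 1):
            prod = 1.0
            for i in range(n):
                for j in range(n):
                    if i != j:
                        gi = combo[i][side] ** w[i]   # weighted power h_i^{w_i}
                        gj = combo[j][side] ** w[j]
                        prod *= (1 - (1 - gi) ** p * (1 - gj) ** q) ** (1 / (n * (n - 1)))
            endpoints.append(1 - (1 - prod) ** (1 / (p + q)))
        result.append(tuple(endpoints))
    return result

# Data of Example 3.
h1 = [(0.31, 0.45), (0.46, 0.71)]
h2 = [(0.34, 0.47)]
h3 = [(0.23, 0.35), (0.46, 0.58), (0.65, 0.73)]
agg = ivhfwgbm([h1, h2, h3], [0.3, 0.4, 0.3], 0.1, 10)
```

The aggregated IVHFE contains one interval per combination, i.e. 2 × 1 × 3 = 6 intervals, matching the size of the set reported in Example 3.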

Example 4. Suppose there are four IVHFEs, $h_1$ = {[0.2,0.4], [0.2,0.7]}, $h_2$ = {[0.5,0.6], [0.3,0.6]}, $h_3$ = {[0.3,0.5]} and $h_4$ = {[0.5,0.6], [0.3,0.6]}, and the weight vector of the four IVHFEs is supposed as $(0.2, 0.3, 0.3, 0.2)^{T}$. Based on the IVHFWGBM operator,

(1) when p = 0.001, q = 10, the score is 0.7766;
(2) when p = q = 5, the score is 0.7848;
(3) when p = 10, q = 0.001, the score is 0.7765.

When the parameters p and q change from 0 to 10 simultaneously, the scores are shown in Figure 3 in detail.


$h_2$ = {[0.3,0.4], [0.6,0.7]}, $h_3$ = {[0.4,0.6]} and $h_4$ = {[0.3,0.5], [0.4,0.4]}, and the weight vector of the four IVHFEs is supposed as $(0.2, 0.3, 0.3, 0.2)^{T}$. Based on the IVHFWGBM operator, the scores are shown in Figure 4 when the parameters p and q change from 0 to 10 simultaneously.

3. Conclusions

In this paper, we have extended the traditional BM and proposed the IVHFGBM and IVHFWGBM operators to aggregate IVHFEs. Some numerical examples for these operators are also presented to show their practicality and effectiveness. In future research, we intend to consider the extensions of some other BMs and study their relationships, and to pay attention to the application of the proposed operators in real application areas such as sustainable development evaluation, science and technology project review, group decision making [14-16], and so on.

References

[1] J. J. Peng, J. Q. Wang, J. Wang, et al. Simplified neutrosophic sets and their applications in multi-criteria

group decision-making problems. International Journal of Systems Science, 47(2016), 2342-2358.

[2] D. Yu, D. F. Li and J. M. Merigó, Dual hesitant fuzzy group decision making method and its application

to supplier selection. International Journal of Machine Learning and Cybernetics, In press. DOI:

10.1007/s13042-015-0400-3

[3] H. Zhao, Z. Xu and S. Liu, Dual hesitant fuzzy information aggregation with Einstein t-conorm and t-

norm. Journal of Systems Science and Systems Engineering, In press.DOI:10.1007/s11518-015-5289-6.

[4] X. F. Wang, J. Q. Wang and W. E. Yang. Group decision making approach based on interval-valued

intuitionistic linguistic geometric aggregation operators. International Journal of Intelligent

Information and Database Systems, 7(2013), 516-534.

[5] M. Xia, Z. Xu and N. Chen. Induced aggregation under confidence levels. International Journal of

Uncertainty, Fuzziness and Knowledge-Based Systems, 19(2011), 201-227.


[6] G. Wei. Interval valued hesitant fuzzy uncertain linguistic aggregation operators in multiple attribute

decision making. International Journal of Machine Learning and Cybernetics, In press. DOI:

10.1007/s13042-015-0433-7

[7] H. Liu, Z. Xu and H. Liao. The multiplicative consistency index of hesitant fuzzy preference relation.

IEEE Transactions on Fuzzy Systems, 24(2016), 82-93.

[8] C. Bonferroni, Sulle medie multiple di potenze, Bolletino Matematica Italiana, 5 (1950), 267-270.

[9] M. M. Xia, Z. S. Xu, and B. Zhu. Geometric Bonferroni means with their application in multi-criteria

decision making. Knowledge-Based Systems, 40 (2013), 88-100.

[10] W. Zhou and J. M. He. Intuitionistic fuzzy geometric Bonferroni means and their application in multi-

criteria decision making. International Journal of Intelligent Systems, 27(2012), 995-1019.

[11] N. Chen, Z. S. Xu, and M. M. Xia. Interval-valued hesitant preference relations and their applications

to group decision making. Knowledge-Based Systems, 37(2013), 528-540.

[12] R. M. Rodríguez, B. Bedregal, H. Bustince, et al. A position and perspective analysis of hesitant fuzzy

sets on information fusion in decision making. Towards high quality progress. Information Fusion,

29(2016), 89-97.

[13] R. Pérez-Fernández, P. Alonso, H. Bustince, et al. Applications of finite interval-valued hesitant fuzzy

preference relations in group decision making. Information Sciences, 326(2016), 89-101.

[14] D. Yu. Group decision making under interval-valued multiplicative intuitionistic fuzzy environment
based on Archimedean t-conorm and t-norm. International Journal of Intelligent Systems, 30(2015),

590-616.

[15] D. Yu, W. Zhang and G. Huang. Dual hesitant fuzzy aggregation operators. Technological and

Economic Development of Economy, 22(2016), 194-209.

[16] W. Zhou and Z. S. Xu. Generalized asymmetric linguistic term set and its application to qualitative

decision making involving risk appetites. European Journal of Operational Research, 254(2016), 610-

621.

Fuzzy Systems and Data Mining II 45

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-45

Interval Type-2 Fuzzy Sets for Decision

Making

Lazim ABDULLAH1 and CW Rabiatul Adawiyah CW KAMAL

School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu,

Malaysia

Abstract. Typical multi-attribute decision making methods (MADM) used type-1 fuzzy sets to represent uncertainties. Recent theory has

suggested that interval type-2 fuzzy sets (IT2 FS) could be used to enhance

representation of uncertainties in decision making problems. Differently from the

typical integrated MADM methods which directly used type-1 fuzzy sets, this

paper proposes an integrating simple additive weighting - technique for order

preference similar to ideal solution (SAW-TOPSIS) based on IT2 FS to enhance

judgment. The SAW with IT2 FS is used to determine the weight for each

criterion, while TOPSIS method with IT2 FS is used to obtain the final ranking for

the attributes. A numerical example is used to illustrate the proposed method. The

numerical results show that the proposed integrating method is feasible in solving

MADM problems under complicated fuzzy environments. In essence, the

integrating SAW-TOPSIS is equipped with IT2 FS in contrast to type-1 fuzzy sets

for solving MADM problems. The proposed method would make a great impact

and significance for the practical implementation. Finally, this paper provides

some recommendations for future research directions.

decision making, TOPSIS, preference order

Introduction

Decision making based on multi-criteria evaluation has been used with great success

for many applications. Most of these applications are characterized by high levels of

uncertainties and vague information. Fuzzy set theory has provided a useful way to

deal with vagueness and uncertainties in solving multi-criteria decision making

(MCDM) problem. During the last two decades, MCDM methods that integrated with

fuzzy sets have been one of the fastest growing research areas. Abdullah [1] presents a

brief review of category in the integration of fuzzy sets and MCDM. In general,

MCDM can be categorized into multi-attribute decision making (MADM) and multi-

objective decision making (MODM). Naturally, MADM problem is related to multiple

attributes. The attributes of MADM represent the different dimensions from which the

alternatives can be viewed by decision makers. There are many fuzzy MADM methods

that have been discussed in the literature, and fuzzy technique for order preference

1

Corresponding Author: Lazim ABDULLAH, School of Informatics and Applied Mathematics,

Universiti Malaysia Terengganu; E-mail: lazim_m@umt.edu.my.

46 L. Abdullah and C.W.R.A.C.W. Kamal / A New Integrating SAW-TOPSIS

decision derived from FTOPSIS is made by observing the degree of closeness to the ideal solution. In addition to this method, fuzzy simple additive weighting (FSAW) is another type of fuzzy MADM method. It is an extension of the SAW method, where it employs trapezoidal fuzzy numbers to represent imprecision in judgements.

Lately, the integration of MADM method has received considerable attention in

literature. Integrated method is simply defined as two or more methods that are

concurrently employed to solve decision making problems. For example, the TOPSIS

is integrated with fuzzy analytic hierarchy process (FAHP) model to propose a new

integrated model for selecting plastic recycling method [2]. Rezaie et al., [3] present an

integrating model based on FAHP and VIKOR method for evaluating cement firms.

Wang et al., [4] develop an integrating OWA–fuzzy TOPSIS to tackle fuzzy MADM

problems. Kharat et al., [5] applied an integrated fuzzy AHP–TOPSIS to municipal

solid waste landfill site selection problem. Pamučar and Ćirović [6] applied the new

integrated fuzzy DEMATEL–MABAC method in making investment decisions.

Tavana et al., [7] proposed an integrated fuzzy ANP-COPRAS-Grey method to

determine the selection of social media platform.

Most of these integrating methods employed type-1 fuzzy sets to represent

uncertainties in decision making. However, the type-1 fuzzy sets have some extent of

limitation in dealing with uncertainties. Recent theories suggest that interval type-2

fuzzy sets (IT2 FSs) are more flexible than type-1 fuzzy sets in representing

uncertainties. Therefore, in contrast to these methods, this paper introduces linguistic

terms based on IT2 FS for proposing a new integrating MADM method. The IT2 FS is

incorporated within the framework of FSAW and FTOPSIS to develop a new

integrating fuzzy MADM method. Specifically, Interval Type-2 Fuzzy Simple

Additive Weighting (IT2 FSAW) method is integrated with Interval Type-2 Technique

for Order Preference Similar to Ideal Solution (IT2 FTOPSIS) method for solving

MADM problems. In the proposed method, the judgements made by decision makers

over the relative importance of alternatives are determined using IT2 FSAW procedure

and the final preference is obtained using IT2 FTOPSIS. The ranking method of IT2

FTOPSIS approach preserves the characteristics of fuzzy numbers where the linguistic

terms can easily be converted to fuzzy numbers.

1. Proposed Method

This paper integrates the IT2 FSAW with IT2 FTOPSIS to establish a new MADM

method. In this proposed method, the IT2 FSAW is used to find weights of the criteria,

whereas IT2 FTOPSIS is used to establish preference of alternatives. The definitions

of IT2 FS [8], upper and lower memberships of IT2 FS [9], and ranking values of the

trapezoidal IT2 FS [10] are used in the proposed method. The detailed procedure of the

proposed method is described as follows.

Step 1: Construct the decision matrix Y p of the p-th decision maker and construct the

average decision matrix Y , respectively.


$$Y_p=\big(f_{ij}^{p}\big)_{m\times n}=\begin{pmatrix}f_{11}^{p}&f_{12}^{p}&\cdots&f_{1n}^{p}\\f_{21}^{p}&f_{22}^{p}&\cdots&f_{2n}^{p}\\\vdots&\vdots&\ddots&\vdots\\f_{m1}^{p}&f_{m2}^{p}&\cdots&f_{mn}^{p}\end{pmatrix},\qquad Y=\big(f_{ij}\big)_{m\times n},\qquad(1)$$

where $f_{ij}=\dfrac{f_{ij}^{1}\oplus f_{ij}^{2}\oplus\cdots\oplus f_{ij}^{k}}{k}$, the rows correspond to the criteria $f_1,f_2,\ldots,f_m$, and the columns correspond to the alternatives $x_1,x_2,\ldots,x_n$.

Step 2: Construct the aggregated fuzzy weight matrix $W$ from the weighting matrices $W_p$ of the attributes provided by the p-th decision maker.

Let $w_i^{p}=(a_i,b_i,c_i,d_i)$, $i=1,2,\ldots,m$, be the linguistic weight given to the subjective criteria $C_1,C_2,\ldots,C_h$ and the objective criteria $C_{h+1},\ldots,C_n$ by decision maker $D_t$.

$$W_p=\big(w_i^{p}\big)_{1\times m}=\big[\,w_1^{p}\;\;w_2^{p}\;\cdots\;w_m^{p}\,\big],\qquad(2)$$

$$W=\big(w_i\big)_{1\times m},\qquad(3)$$

where $w_i=\dfrac{w_i^{1}\oplus w_i^{2}\oplus\cdots\oplus w_i^{k}}{k}$ is an interval type-2 fuzzy set.

Defuzzification of $W$ is represented as:

$$d(W_j)=\frac{1}{4}\big(w_1^{j}+w_2^{j}+w_3^{j}+w_4^{j}\big),\quad j=1,2,\ldots,n\qquad(4)$$

The crisp value for criterion $W_j$ is given by:

$$W_j=\frac{d(W_j)}{\sum_{j=1}^{n}d(W_j)},\quad j=1,2,\ldots,n\qquad(5)$$

where $\sum_{j=1}^{n}W_j=1$. Therefore, the weight vector $W=[W_1,W_2,\ldots,W_n]$ is constructed.
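Steps (4)-(5), defuzzifying each trapezoidal weight by averaging its four reference points and then normalizing, can be sketched as follows. The function name and the three illustrative trapezoids are ours, not from the paper.

```python
# Defuzzify each trapezoidal fuzzy weight by the average of its four
# reference points (Eq. (4)), then normalize so the crisp weights sum to 1 (Eq. (5)).
def crisp_weights(trapezoids):
    d = [sum(t) / 4 for t in trapezoids]   # d(W_j) = (w1 + w2 + w3 + w4) / 4
    total = sum(d)
    return [x / total for x in d]

# Illustrative: upper-membership reference points of the linguistic terms
# "Fair", "Good" and "Medium Good" from Table 1.
weights = crisp_weights([(0.3, 0.5, 0.5, 0.7), (0.7, 0.9, 0.9, 1.0), (0.5, 0.7, 0.7, 0.9)])
```

Here the defuzzified values are 0.5, 0.875 and 0.7, so the crisp weights preserve the ordering Good > Medium Good > Fair and sum to 1.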


Step 3: Construct the weighted decision matrix

$$Y_w=\big(v_{ij}\big)_{m\times n}=\begin{pmatrix}v_{11}&v_{12}&\cdots&v_{1n}\\v_{21}&v_{22}&\cdots&v_{2n}\\\vdots&\vdots&\ddots&\vdots\\v_{m1}&v_{m2}&\cdots&v_{mn}\end{pmatrix},\qquad(6)$$

where the rows correspond to the criteria $f_1,\ldots,f_m$ and the columns to the alternatives $x_1,\ldots,x_n$.

Step 4: Calculate the ranking value of each element $v_{ij}$, and create the ranking for the weighted decision matrix $Y_w^{*}$ using Eq. (7):

$$Y_w^{*}=\big(\mathrm{Rank}(v_{ij})\big)_{m\times n}.\qquad(7)$$

Step 5: Determine the positive-ideal solution $x^{+}=(v_1^{+},v_2^{+},\ldots,v_m^{+})$ and the negative-ideal solution $x^{-}=(v_1^{-},v_2^{-},\ldots,v_m^{-})$, where

$$v_i^{+}=\begin{cases}\max_{1\le j\le n}\{\mathrm{Rank}(v_{ij})\},&\text{if }f_i\in F_1\\[2pt]\min_{1\le j\le n}\{\mathrm{Rank}(v_{ij})\},&\text{if }f_i\in F_2\end{cases}\qquad(8)$$

and

$$v_i^{-}=\begin{cases}\min_{1\le j\le n}\{\mathrm{Rank}(v_{ij})\},&\text{if }f_i\in F_1\\[2pt]\max_{1\le j\le n}\{\mathrm{Rank}(v_{ij})\},&\text{if }f_i\in F_2\end{cases}\qquad(9)$$

where $F_1$ denotes the set of benefit attributes and $F_2$ denotes the set of cost attributes.

Step 6: Calculate the distance $d^{+}(x_j)$ between each alternative $x_j$ and the positive-ideal solution $x^{+}$ using Eq. (10):

$$d^{+}(x_j)=\sqrt{\sum_{i=1}^{m}\big(\mathrm{Rank}(v_{ij})-v_i^{+}\big)^{2}},\qquad(10)$$

and calculate the distance $d^{-}(x_j)$ between each alternative $x_j$ and the negative-ideal solution $x^{-}$ using the following equation:

$$d^{-}(x_j)=\sqrt{\sum_{i=1}^{m}\big(\mathrm{Rank}(v_{ij})-v_i^{-}\big)^{2}}.\qquad(11)$$

Step 7: Calculate the relative degree of closeness $C(x_j)$ of $x_j$ with respect to the positive-ideal solution $x^{+}$ using the following equation:

$$C(x_j)=\frac{d^{-}(x_j)}{d^{+}(x_j)+d^{-}(x_j)},\qquad(12)$$

where $1\le j\le n$.

Step 8: Arrange the values of $C(x_j)$ in descending order; a larger value of $C(x_j)$ indicates a higher preference of the alternative $x_j$.
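Steps 7-8 reduce to a one-line closeness formula and a sort; the sketch below (our own function names) verifies the preference order implied by the closeness values later reported in Table 2.

```python
def closeness(d_plus, d_minus):
    """Relative degree of closeness, Eq. (12)."""
    return d_minus / (d_plus + d_minus)

def rank_alternatives(c_values):
    """Step 8: indices of alternatives sorted by descending closeness."""
    return sorted(range(len(c_values)), key=lambda j: c_values[j], reverse=True)

# Closeness values from Table 2 for A1, A2, A3.
c = [0.4112, 0.4605, 0.4778]
order = rank_alternatives(c)   # [2, 1, 0], i.e. A3 preferred, then A2, then A1
```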

2. Numerical Example

For the purpose of illustration and to show the feasibility of the proposed method, an

example is presented. This example is retrieved from Chou et al. [5].

Researchers intend to identify the facility location alternatives to build a new plant.

The team has identified three alternatives which are alternative 1 ( A1 ) , alternative 2

( A2 ) , and alternative 3 ( A3 ) . To determine the best alternative site, a committee of

four decision makers is created; decision maker 1 ( D1 ) , decision maker 2 ( D2 ) ,

decision maker 3 ( D3 ) and decision maker 4 ( D4 ) . Three selection criteria are

deliberated: transportation availability (C1 ) , availability of skilled workers (C2 ) and

climatic conditions (C3 ) . Table 1 shows the linguistic terms used to rate criteria with

respect to alternatives and also the weights for criteria.

Table 1. Linguistic terms and IT2 FS

Linguistic Terms Interval Type-2 Fuzzy Sets

Very Poor (VP) ((0,0,0,0.1;1,1),(0,0,0,0.05;0.9,0.9))

Poor (P) ((0.0,0.1,0.1,0.3;1,1),(0.05,0.1,0.1,0.2;0.9,0.9))

Medium Poor (MP) ((0.1,0.3,0.3,0.5;1,1),(0.2,0.3,0.3,0.4;0.9,0.9))

Fair (F) ((0.3,0.5,0.5,0.7;1,1),(0.4,0.5,0.5,0.6;0.9,0.9))

Medium Good (MG) ((0.5,0.7,0.7,0.9;1,1),(0.6,0.7,0.7,0.8;0.9,0.9))

Good (G) ((0.7,0.9,0.9,1;1,1),(0.8,0.9,0.9,0.95;0.9,0.9))

Very Good (VG) ((0.9,1,1,1;1,1),(0.95,1,1,1;0.9,0.9))

Based on the ratings given by decision makers, the example is solved using the

proposed method. The final degree of closeness and preference are shown in Table 2.

Table 2. Degree of closeness and preference

Degree of closeness Preference order

C ( A1 ) 0.4112 3

C ( A2 ) 0.4605 2

C( A3 ) 0.4778 1

It can be seen that the preference order of the alternatives is $A_3 \succ A_2 \succ A_1$. The proposed method therefore decided that the best alternative is $A_3$. This preference is slightly inconsistent with the result obtained using the FSAW, where the preference is $A_2 \succ A_3 \succ A_1$.


3. Conclusions

This paper proposed a novel method, which integrates IT2 FSAW and IT2 FTOPSIS to

solve MADM problems. Decision makers used interval type-2 linguistic variables to

assess the importance of the criterion. The ranking weighted decision matrix obtained

from IT2 FSAW was then used as an input to the IT2 FTOPSIS where ideal solutions

could be computed. Finally, preference of alternatives was obtained as a result of the

implementation using the integration method. To illustrate the feasibility of the

proposed method, a numerical example, that formerly solved using the FSAW method

was considered. The results showed that A3 is the most preferred alternative. Detailed

comparative analysis between the results obtained using the integrated method and

other decision making methods is left for future research. Future research may also

include sensitivity analysis where the uncertainty of the final preference of the

integrating model can be investigated.

Acknowledgments

This work is part of the research grant project FRGS 59389. We acknowledge the financial support provided by the Malaysian Ministry of Education and Universiti Malaysia Terengganu.

References

[1] L. Abdullah, Fuzzy Multi Criteria Decision Making and its Application: A Brief Review of Category.

Procedia-Social and Behavioral Sciences, 97 (2013), 131-136.

[2] S. Vinodh, M. Prasanna, N. Hari Prakash, Integrated fuzzy AHP-TOPSIS for selecting the best plastic

recycling method: A case study. Applied Mathematical Modelling, 39 (2014),4662-4672.

[3] K. Rezaie, S. S. Ramiyani, S Nazari-Shirkouhi, A. Badizadeh, Evaluating performance of Iranian

cement firms using an integrated fuzzy AHP-VIKOR method. Applied Mathematical Modelling, 38

(2014), 5033-5046.

[4] T. Wang, J. Liu, J. Li, C. Niu, An integrating OWA–TOPSIS framework in intuitionistic fuzzy settings

for multiple attribute decision making, Computers & Industrial Engineering, 98(2016), 185-194.

[5] M. G. Kharat, S. J. Kamble, R. D. Raut, S. S Kamble, S. M. Dhume, Modeling landfill site selection

using an integrated fuzzy MCDM approach . Earth Systems and Environment, 2(2016), 53.

[6] D. Pamučar, G. Ćirović, The selection of transport and handling resources in logistics centers using

Multi-Attributive Border Approximation area Comparison (MABAC), Expert Systems with

Applications, 42(2015), 3016-3028.

[7] M. Tavana, E. Momeni, N. Rezaeiniya, S. M. Mirhedayatian, H. Rezaeiniya, A novel hybrid social media

platform selection model using fuzzy ANP and COPRAS-G, Expert Systems with Applications,

40(2013), 5694-5702.

[8] Y. C. Chang, S. M. Chen, A new fuzzy interpolative reasoning method based on interval type-2 fuzzy

sets. IEEE International Conference on Systems, Man and Cybernetics, (2008), 82-87.

[9] J. M. Mendel, R. I., John, F. Liu, Interval Type-2 Fuzzy Logic Systems Made Simple. IEEE

Transactions of Fuzzy Systems, 14 (2006), 808-821.

[10] L. Lee, S. Chen, Fuzzy multiple attributes group decision-making based on the extension of
TOPSIS method and interval type-2 fuzzy sets. Proceedings of the Seventh International Conference on

Machine Learning and Cybernetics, (2008), 3260-3265.

[11] J. S.Yao, K. Wu, Ranking fuzzy number based on decomposition principle and signed distance. Fuzzy

Sets and Systems, 116(2000), 275-288.

Fuzzy Systems and Data Mining II 51

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-51

Algorithms for Finding Oscillation Period of Fuzzy Tensors

Ling CHEN a,b,1 Lin-Zhang LU a,c

a

School of Mathematical Sciences, Guizhou Normal University, GuiYang, Guizhou, P.

R.China 550001

b

School of Science, Shandong Jianzhu University, Jinan, Shandong, P.R.China 250101

c

School of Mathematical Sciences, Xiamen University, Xiamen, Fujian, P.R.China

361005.

Abstract. In this paper, we focus on describing the oscillation period and index

of fuzzy tensor. The deﬁnition of the induced third-order fuzzy tensor is proposed.

By using this notion, ﬁrstly, the oscillation period and index of fuzzy tensor are

obtained on the basis of the Power Method with max-min operation. Secondly, we rely on the Minimal Strong Component to find the oscillation period of the fuzzy tensor. It is a more practical graph-theoretic method when the number of nonzero elements is less than half of the total number of fuzzy tensor elements. Furthermore, numerical results demonstrate that the two algorithms, the Power Method and the Minimal Strong Component method, for solving the period and index of a fuzzy tensor are effective and promising.

Introduction

In fuzzy mathematics, the study of the fuzzy matrix is very complex but quite important, since it has a wide range of applications, especially in fuzzy control and fuzzy decision making. The object of fuzzy control is the fuzzy system; an important question for a fuzzy control system is whether it can reach a stable state in limited time, and its stability can be studied by using the periodicity of the fuzzy matrix. In order to study multi-objective fuzzy decision making and dynamic multiple-objective fuzzy control, it is necessary to investigate the higher order forms of the fuzzy matrix.

The periodicity is one of the most important characteristics of fuzzy matrices.

Thomason [1] ﬁrst proposed the powers of fuzzy matrix with convergence period or os-

cillation period. Fan and Liu [2] got the conclusion that the period of fuzzy matrix is

equal to the least common multiple of the period of its cutting matrix. Li [3] discussed

the periodicity of fuzzy matrices in the general case. Liu and Ji [4] described the periodicity of square fuzzy matrices. Furthermore, in the same paper [4], they perfected the conclusion on the upper bound of the powers convergence index of a fuzzy matrix and obtained the greatest periodicity value of any square fuzzy matrix. So they solved the problem of estimating the period of the general fuzzy matrix.

1 Corresponding Author: Ling CHEN, School of Mathematical Sciences, Guizhou Normal University,


52 L. Chen and L.-Z. Lu / Algorithms for Finding Oscillation Period of Fuzzy Tensors

According to the literature [5,6,7], many practical problems can nowadays be modeled as tensor problems. For example, a vector is a first-order tensor, a matrix is a second-order tensor, and a tensor of order three or higher is called a higher order tensor. [8] explains a fast algorithm for exponential data fitting and its applications. [9] considered infinite and finite dimensional Hilbert tensors and researched their periodicity. So generalizing the tensor to the fuzzy tensor is practical and meaningful.

In this paper, we deal with the oscillation period and index of fuzzy tensor with max-

min operation. We ﬁnd the oscillation period and index of fuzzy tensor by Power Method

and Minimal Strong Component in section 1 and in section 2, respectively. Our numerical

examples show the feasibility of the two proposed algorithms. Finally, in section 3, we

will give Conclusions.

In this section, we first describe some concepts and results about fuzzy matrices from the literature [1,2,3,4,10,11,12,13], which will be used in this section. We give the definition of the fuzzy tensor, and analyze the periodicity and index of the fuzzy tensor.

Let A = (aij) and B = (bij) be n × n fuzzy matrices. We have the following product definition: A × B = C = (cij) = (∨ₖ₌₁ⁿ (aik ∧ bkj)), where aij ∧ bij = min{aij, bij} and aij ∨ bij = max{aij, bij}. And Aᵏ⁺¹ = Aᵏ × A, k = 1, 2, · · · . A = B if aij = bij for all i, j ∈ {1, 2, · · · , n}.

Consider a finite number of fuzzy matrices A1, A2, · · · , An with each Ai ∈ F^{n×n}, where F^{n×n} denotes the set of all n × n fuzzy matrices. We write F = {A1, A2, · · · , An}.

Let Z⁺ = {x | x is a positive integer} and [n] be the least common multiple of 1, 2, · · · , n.

Referring to the relevant literature [1,2,3,4,11], for convenience in application, we propose an equivalent definition of the period of oscillation and the index of a fuzzy matrix.

Definition 1. Let A be an n × n fuzzy matrix. If there exist s, t ∈ Z⁺ such that A^{s+t} = A^{s}, then we call d = min{t | A^{s+t} = A^{s}} the period of oscillation of A, and k = min{s | A^{s+d} = A^{s}} the index of A.

Remark 1. The possible range of the period of a fuzzy matrix is from 1 to [n], that is, 1 ≤ d ≤ [n] and d | [n]. If d = 1, we say A is convergent.
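Definition 1 suggests a direct computation: iterate the max-min powers of A until a power repeats, then read off d and k. A minimal Python sketch (our own; since every power only contains entries of A, the sequence of powers is finite and a repeat must occur):

```python
def maxmin_product(A, B):
    """Fuzzy matrix product: c_ij = max_k min(a_ik, b_kj)."""
    n = len(A)
    return [[max(min(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def period_and_index(A):
    """Return (d, k) per Definition 1 by tracking the power sequence A, A^2, ..."""
    seen = {}                        # matrix (as tuple) -> first exponent it appeared at
    power, exp = A, 1
    while True:
        key = tuple(map(tuple, power))
        if key in seen:
            k = seen[key]            # index: first exponent of the repeated power
            d = exp - k              # period: distance between the repeats
            return d, k
        seen[key] = exp
        power = maxmin_product(power, A)
        exp += 1

A = [[0.0, 1.0], [1.0, 0.0]]   # oscillates: A^2 = I, A^3 = A, ...
d, k = period_and_index(A)     # d = 2, k = 1
```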

Next, we will present the definition of the fuzzy tensor as follows.

Definition 2. An order m dimension n fuzzy tensor A = (a_{i1···im}) is a tensor whose entries satisfy 0 ≤ a_{i1···im} ≤ 1, where ij = 1, · · · , n for j = 1, · · · , m.

For our purposes, throughout this paper, we always consider i1, · · · , im to be of the same dimension.

From the above definition of the fuzzy tensor, clearly, a fuzzy tensor is a higher order generalization of a fuzzy matrix, and is also a tensor extension of a characteristic function.

Next, we discuss the third-order clustering of the fuzzy tensor by using the slice-by-slice method. For a fuzzy tensor, we obtain two-dimensional sections by fixing all indices except for two indices. Each slice is a fuzzy matrix. Fixing all indices but three indices, we will define the induced third-order fuzzy tensor.

Definition 3. Let A = (a_{i1···im}) be an order m dimension n fuzzy tensor. Multiple third-order fuzzy tensor clusterings (A_{ij ik ih}, A) of A are constructed by fixing all but three indices. We call A_{ij ik ih} the induced third-order fuzzy tensor of A, where ij, ik, ih ∈ {i1, · · · , im}.

By the third-order clustering theory, we shall explore the period and index of a higher order fuzzy tensor, which is converted into the study of third-order fuzzy tensors. A third-order fuzzy tensor has horizontal, lateral and frontal slices. Each direction contains a set of fuzzy matrices. We obtain C(m,3)·n^{m−3} induced third-order fuzzy tensors and 3·C(m,3)·n^{m−3} sets of fuzzy matrix sequences from an order m dimension n fuzzy tensor. Figure 1 shows the horizontal, lateral and frontal slices of the third-order fuzzy tensor A_{ij ik ih}, denoted by A_{ij::}, A_{:ik:} and A_{::ih}, respectively.

On the whole, it is far more intuitive and simpler to investigate higher order fuzzy

tensor with the help of geometric signiﬁcance of third-order fuzzy tensor. Furthermore,

it is convenient to apply them to various ﬁelds.

Now, we introduce the period of the induced third-order fuzzy tensor and the given fuzzy tensor. The following result follows immediately from Definition 1.

Theorem 1. The period of oscillation of F is the least common multiple (l.c.m) of the periods of A1, A2, · · · , An, and the index of F is the largest of the indices of A1, A2, · · · , An. That is, suppose d1, · · · , dn, dF and k1, · · · , kn, kF are the periods of oscillation and indices of A1, A2, · · · , An, F, respectively. We get

dF = l.c.m[d1, · · · , dn], kF = max{k1, · · · , kn}.

Proof. Regarding F = {A1, A2, · · · , An} as a block fuzzy matrix, we have the block diagonal matrix F = diag(A1, A2, · · · , An); then by Definition 1, dF = l.c.m[d1, · · · , dn] and kF = max{k1, · · · , kn}.
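Theorem 1 reduces the period and index of a family F to an elementwise lcm and max over the members' periods and indices; the combination step can be sketched as (function name is ours):

```python
from math import gcd
from functools import reduce

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def family_period_index(periods, indices):
    """d_F = l.c.m of the d_i; k_F = max of the k_i (Theorem 1)."""
    return reduce(lcm, periods), max(indices)

dF, kF = family_period_index([2, 3, 4], [1, 2, 1])   # dF = 12, kF = 2
```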

From the geometric significance of the third-order fuzzy tensor, we easily state the main conclusion as follows.

Theorem 2. Let A_{ij ik ih} be the induced third-order fuzzy tensor of an order m dimension n fuzzy tensor A. Suppose d, d_{ij}, d_{ik}, d_{ih} and k, k_{ij}, k_{ik}, k_{ih} are the oscillation periods and indices of A, A_{ij::}, A_{:ik:} and A_{::ih}, respectively. Then

d = l.c.m[d_{ij}, d_{ik}, d_{ih}], k = max{k_{ij}, k_{ik}, k_{ih}}.


Table 1. The 4-order dimension-4 fuzzy tensor A of Example 1; each cell (i3, i4) is a 4 × 4 fuzzy matrix indexed by (i1, i2).

i3 = 1:
  i4 = 1: (0.3 0.1 0.8 0.9; 0.1 0.9 0.2 0.8; 0.5 0.1 0.2 0.7; 0.1 0.3 0.4 0.6)
  i4 = 2: (0.1 0.2 0.1 0.4; 0.8 0.1 0.1 0.4; 0.9 0.2 0.4 0.3; 0.3 0.6 0.8 0.4)
  i4 = 3: (0.7 0.9 0.8 0.9; 0.3 0.1 0.1 0.6; 0.4 0.6 0.6 0.6; 0.1 0.7 0.8 0.4)
  i4 = 4: (0.3 0.9 0.5 0.5; 0.6 0.5 0.4 0.8; 0.9 0.9 0.9 0.8; 0.5 0.4 0.2 0.7)

i3 = 2:
  i4 = 1: (0.8 0.9 0.9 0.1; 0.3 0.9 0.8 0.6; 0.7 0.1 0.7 0.2; 0.2 0.2 0.9 0.1)
  i4 = 2: (0.3 0.8 0.4 0.5; 0.5 0.7 0.4 0.8; 0.9 0.6 0.9 0.4; 0.5 0.7 0.6 0.1)
  i4 = 3: (0.6 0.8 0.8 0.4; 0.3 0.4 0.9 0.3; 0.3 0.9 0.8 0.8; 0.7 0.9 0.7 0.2)
  i4 = 4: (0.3 0.6 0.8 0.7; 0.3 0.9 0.3 0.7; 0.7 0.3 0.6 0.3; 0.5 0.7 0.4 0.9)

i3 = 3:
  i4 = 1: (0.5 0.9 0.9 0.6; 0.9 0.5 0.3 0.1; 0.4 0.8 0.3 0.8; 0.4 0.2 0.8 0.2)
  i4 = 2: (0.5 0.2 0.7 0.6; 0.6 0.9 0.4 0.7; 0.8 0.7 0.6 0.3; 0.7 0.8 0.3 0.7)
  i4 = 3: (0.5 0.8 0.1 0.3; 0.4 0.5 0.2 0.4; 0.5 0.4 0.7 0.8; 0.9 0.4 0.3 0.8)
  i4 = 4: (0.5 0.3 0.4 0.7; 0.9 0.8 0.6 0.1; 0.8 0.3 0.3 0.2; 0.3 0.7 0.4 0.8)

i3 = 4:
  i4 = 1: (0.2 0.8 0.8 0.7; 0.9 0.9 0.7 0.3; 0.6 0.9 0.7 0.1; 0.2 0.5 0.7 0.8)
  i4 = 2: (0.9 0.6 0.6 0.9; 0.9 0.5 0.7 0.3; 0.3 0.7 0.5 0.5; 0.8 0.2 0.7 0.9)
  i4 = 3: (0.5 0.3 0.2 0.6; 0.9 0.5 0.8 0.3; 0.1 0.7 0.4 0.5; 0.7 0.5 0.2 0.7)
  i4 = 4: (0.7 0.1 0.2 0.8; 0.3 0.2 0.5 0.2; 0.5 0.5 0.1 0.4; 0.6 0.1 0.1 0.2)

Proof. The theorem can be proved by block fuzzy matrix theory, as in Theorem 1.

Clearly, based on Deﬁnition 3 and Theorem 2, we have the following result.

Theorem 3. Let A = (ai1 ···im ) be an order m dimension n fuzzy tensor and let (Aij ik ih , A) be the multiple third-order fuzzy tensor clustering of A, where Aij ik ih is an induced third-order fuzzy tensor of A. Then the oscillation period D of A is the least common multiple of the oscillation periods of all the induced third-order fuzzy tensors, and the index K of A is the maximum of the indices of all the induced third-order fuzzy tensors.

Following the algorithm in [3] for finding the oscillation period and index of a fuzzy matrix, we give here a power method for the oscillation period and index of a fuzzy tensor. From the above discussion, the following algorithm can be given naturally.

Algorithm 1 (A power method for finding the oscillation period and index of a fuzzy tensor).

Input: An order m dimension n fuzzy tensor A = (ai1 ···im ).

Output: The oscillation period D and index K of the fuzzy tensor.

Step 1. Choose ij , ik , ih ∈ {i1 , · · · , im } and let Aij ik ih = (aij ik ih ).

Step 2. Using Definition 1 and Theorem 1, compute dij , dik , dih and kij , kik , kih .

Step 3. By Theorem 2, compute d and k.

Step 4. Repeat Steps 1–3 until ij , ik , ih have run through all of i1 , · · · , im .

Step 5. Based on Theorem 3, calculate D and K from all the d and k obtained above.

Next, to demonstrate that Algorithm 1 works for fuzzy tensors, we test the following example, whose code is written in the R language.

Example 1. Let A be a 4-order fuzzy tensor with dimension four, defined by Table 1. For m = 4, we have the induced 3-order fuzzy tensors Ai1 i2 i3 , Ai1 i2 i4 , Ai1 i3 i4 and Ai2 i3 i4 . For Ai1 i2 i3 , if i4 = 1, the induced 3-order fuzzy tensor Ai1 i2 i3 1 contains the data Ai1 i2 i3 1 = (A(:, :, 1, 1), A(:, :, 2, 1), A(:, :, 3, 1), A(:, :, 4, 1)), and we obtain three sets of fuzzy matrices F11 , F21 , F31 by fixing one of the indices i1 , i2 , i3 in turn, where Fi1 = {A1 , A2 , A3 , A4 }, i = 1, 2, 3.

Consider all the fuzzy matrices Fi1 (i = 1, 2, 3) by Definition 1 and Theorem 1: dF11 = l.c.m[1, 1, 1, 1] = 1, kF11 = max{4, 4, 4, 4} = 4; dF21 = l.c.m[2, 1, 1, 1] = 2, kF21 = 4; dF31 = 2, kF31 = 5. Hence the oscillation period d^1 and index k^1 of fuzzy tensor Ai1 i2 i3 1 are as follows: d^1 = l.c.m[dF11 , dF21 , dF31 ] = l.c.m[1, 2, 2] = 2, k^1 = max{kF11 , kF21 , kF31 } = max{4, 4, 5} = 5.

If i4 = 2, i4 = 3, i4 = 4 we have: d^2 = l.c.m[2, 1, 1] = 2, k^2 = max{3, 6, 5} = 6; d^3 = l.c.m[2, 1, 2] = 2, k^3 = max{3, 4, 4} = 4; d^4 = l.c.m[2, 3, 2] = 6, k^4 = max{5, 5, 4} = 5.

Hence, the oscillation period d1 and index k1 of fuzzy tensor Ai1 i2 i3 are: d1 = l.c.m[d^1 , d^2 , d^3 , d^4 ] = l.c.m[2, 2, 2, 6] = 6, k1 = max{k^1 , k^2 , k^3 , k^4 } = max{5, 6, 4, 5} = 6.

Similarly, for Ai1 i2 i4 , Ai1 i3 i4 , Ai2 i3 i4 we obtain d2 = l.c.m[2, 2, 2, 2] = 2, k2 = max{4, 4, 6, 5} = 6; d3 = l.c.m[2, 2, 2, 6] = 6, k3 = max{6, 5, 6, 4} = 6; d4 = l.c.m[2, 2, 2, 2] = 2, k4 = max{5, 5, 5, 5} = 5. Based on Theorem 3 we get the oscillation period D and index K of fuzzy tensor A: D = l.c.m[d1 , d2 , d3 , d4 ] = l.c.m[6, 2, 6, 2] = 6, K = max{k1 , k2 , k3 , k4 } = max{6, 6, 6, 5} = 6.

This example verifies the feasibility and correctness of Algorithm 1.
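The kernel of Steps 2 and 3 is finding the period and index of a single fuzzy matrix from its power sequence under max-min composition; since every power's entries come from the finite set of entries of A, the sequence is eventually periodic and the repetition can be detected by hashing. A minimal Python sketch (illustrative only; the authors' implementation is in R):

```python
def maxmin(A, B):
    """Max-min composition of fuzzy matrices stored as tuples of tuples:
    (A o B)[i][k] = max_j min(A[i][j], B[j][k])."""
    n = len(A)
    return tuple(
        tuple(max(min(A[i][j], B[j][k]) for j in range(n)) for k in range(n))
        for i in range(n)
    )

def period_and_index(A, max_power=1000):
    """Smallest d, k with A^(k+d) = A^k in the max-min power sequence of A."""
    seen = {A: 1}     # matrix -> first power at which it appeared
    P = A
    for t in range(2, max_power + 1):
        P = maxmin(P, A)
        if P in seen:
            k = seen[P]
            return t - k, k   # oscillation period d and index k
        seen[P] = t
    raise RuntimeError("power sequence did not repeat within max_power")
```

For the Boolean matrix ((0, 1), (1, 0)) the powers alternate between it and the identity, giving period 2 and index 1.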

In this section, by using graph-theory tools, we give a method to find the oscillation period of a fuzzy tensor. When m and n are not large and the number of nonzero elements is less than half of the total number of fuzzy tensor elements, the minimal strong component method is simpler than the power method for the oscillation period and does not need much calculation. The following definitions are from [11].

Let ΦA denote the set of all nonzero elements of fuzzy matrix A. For any λ ∈ ΦA , we call Aλ = ((aλ )ij ) the cut matrix of A, where (aλ )ij = 1 if aij ≥ λ, and (aλ )ij = 0 otherwise.
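The cut-matrix definition translates directly into code; a small illustrative sketch (function names are ours):

```python
def cut_matrix(A, lam):
    """Boolean cut matrix A_lambda: entry 1 where a_ij >= lambda, else 0."""
    return tuple(tuple(1 if a >= lam else 0 for a in row) for row in A)

def nonzero_levels(A):
    """Phi_A: the set of all nonzero elements of fuzzy matrix A."""
    return {a for row in A for a in row if a > 0}
```

Iterating `cut_matrix(A, lam)` over `sorted(nonzero_levels(A), reverse=True)` yields the digraphs analyzed in Example 2.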

We follow [14,15,4] to express the period of a Boolean matrix by strong components and the period of a fuzzy matrix by minimal strong components. Furthermore, we shall find the period of a fuzzy tensor based on minimal strong components.

S is called a strong component of fuzzy matrix A if there is a λ ∈ ΦA such that S is a strong component of the cut matrix Aλ . Let d(S) denote the period of S and let Ω denote the set of minimal strong components of fuzzy matrix A; then d(A) = l.c.m[d(si )], where si ∈ Ω.

According to the above discussion, we can develop the following algorithm for the oscillation period of a fuzzy tensor by minimal strong components.

Algorithm 2 (A minimal strong component method for finding the oscillation period of a fuzzy tensor).

Input: An order m dimension n fuzzy tensor A = (ai1 ···im ).

Step 1. Establish the induced 3-order fuzzy tensor Aij ik ih by Definition 3.

Step 2. Obtain the sets of fuzzy matrices Fij , Fik , Fih by fixing one of the indices in turn.


Table 2. The third-order six-dimensional fuzzy tensor A = (A(:, :, 1), A(:, :, 2), A(:, :, 3)) of Example 2: three 6 × 6 sparse fuzzy matrices.

Figure 2. (a) Digraph of D0.5 ; (b) Digraph of D0.4 ; (c) Digraph of D0.3 .

Step 3. Compute the periods of the fuzzy matrices in Fij , Fik , Fih by minimal strong components.

Step 4. According to Theorem 1, obtain the periods of Fij , Fik , Fih .

Step 5. Calculate the period of the induced 3-order fuzzy tensor Aij ik ih by Theorem 2.

Step 6. Figure out the oscillation period of fuzzy tensor A by Theorem 3.

To illustrate that Algorithm 2 works for fuzzy tensors, we test the following example.

Example 2. Let A be a third-order six-dimensional fuzzy tensor A = (A(:, :, 1), A(:, :, 2), A(:, :, 3)) defined by Table 2.

For A(:, :, 1) (see Figure 2) we have λ1 = 0.3, λ2 = 0.4, λ3 = 0.5; then the digraphs Dλi (i = 1, 2, 3) can be represented as follows.

In D0.5 there is only one strong component S1 = {a1 }. In D0.4 there is one

strong component S2 = {a1 , a3 }. In D0.3 there are two strong components S3 =

{a1 , a2 , a3 }, S4 = {a4 , a5 }.

Notice that S4 is a strong component which has no common vertices with S1 , S2 , S3 . Hence, we say that S4 is a newly appeared strong component. Moreover, the set of minimal strong components of fuzzy matrix A(:, :, 1) is Ω = {S1 , S4 }. Then d(A(:, :, 1)) = l.c.m[d(s1 ), d(s4 )] = l.c.m[1, 2] = 2.

Considering A(:, :, 2) and A(:, :, 3), we have d(A(:, :, 2)) = l.c.m[2, 3] = 6 and d(A(:, :, 3)) = l.c.m[1, 2, 2] = 2. Then d(A) = l.c.m[d(A(:, :, 1)), d(A(:, :, 2)), d(A(:, :, 3))] = l.c.m[2, 6, 2] = 6.

This example illustrates a great advantage of Algorithm 2: using only the directed graph of a sparse fuzzy matrix, it can find the oscillation period without troublesome calculations.
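The two graph computations behind Algorithm 2 — the strong components of a cut matrix's digraph, and the period of each component (the greatest common divisor of its cycle lengths) — can be sketched as follows (an illustrative Python implementation, not the authors' code; the period uses the standard BFS-depth gcd argument):

```python
from collections import deque
from math import gcd

def strong_components(adj):
    """Kosaraju's algorithm on a digraph {node: set of successor nodes}.
    Every node must appear as a key of adj."""
    def dfs(u, graph, visited, out):
        stack = [(u, iter(graph.get(u, ())))]
        visited.add(u)
        while stack:
            node, it = stack[-1]
            advanced = False
            for v in it:
                if v not in visited:
                    visited.add(v)
                    stack.append((v, iter(graph.get(v, ()))))
                    advanced = True
                    break
            if not advanced:
                stack.pop()
                out.append(node)     # post-order finish
    order, visited = [], set()
    for u in adj:
        if u not in visited:
            dfs(u, adj, visited, order)
    rev = {u: set() for u in adj}
    for u, succs in adj.items():
        for v in succs:
            rev.setdefault(v, set()).add(u)
    comps, visited = [], set()
    for u in reversed(order):
        if u not in visited:
            comp = []
            dfs(u, rev, visited, comp)
            comps.append(set(comp))
    return comps

def component_period(adj, comp):
    """Period of a strongly connected component = gcd of its cycle lengths,
    computed as gcd over edges (u, v) inside comp of depth[u] + 1 - depth[v]."""
    root = next(iter(comp))
    depth, queue = {root: 0}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v in comp and v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    g = 0
    for u in comp:
        for v in adj.get(u, ()):
            if v in comp:
                g = gcd(g, depth[u] + 1 - depth[v])
    return g  # 0 means the component contains no cycle
```

A vertex with a self-loop has period 1 and a pure 2-cycle has period 2, matching d(s1 ) = 1 and d(s4 ) = 2 in the example; the matrix period is then the l.c.m. of the minimal-component periods.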


3. Conclusions

In this paper, we proposed the fuzzy tensor, a new class of nonnegative tensor that is a higher-order form of the fuzzy matrix. We gave the definition of the induced third-order fuzzy tensor, which has the advantage of an intuitive geometric significance. Based on these concepts, we investigated the oscillation period and index of a fuzzy tensor with the help of the power method and the minimal strong component method, respectively. Our numerical results showed that both methods are feasible and effective. Hence, it is worthwhile to research further properties of fuzzy tensors; in the future, we will continue to study all aspects of the fuzzy tensor.

Acknowledgements

The work of the first author was supported by the Innovation Foundation of Guizhou Normal University for Graduate Students (201529, 201528) and the Shandong Province College's Outstanding Young Teachers Domestic Visiting Scholar Program (2013). The work of the second author was supported by the National Science Foundation of China (Grant No. 11261012).

References

[1] M.G. Thomason, Convergence of powers of a fuzzy matrix, Journal of Mathematical Analysis and Applications, 57 (1977), 476–480.

[2] Z.T. Fan, D.F. Liu, On the oscillating power sequence of a fuzzy matrix, Fuzzy Sets and Systems, 93 (1998), 75–85.

[3] J.X. Li, Periodicity of powers of fuzzy matrices, Fuzzy Sets and Systems, 48 (1992), 365–369.

[4] W.B. Liu, Z.J. Ji, The periodicity of square fuzzy matrices based on minimal strong components, Fuzzy Sets and Systems, 126 (2002), 233–240.

[5] L.Q. Qi, Eigenvalues of a real supersymmetric tensor, Journal of Symbolic Computation, 40 (2005), 1302–1324.

[6] L.H. Lim, Singular values and eigenvalues of tensors: A variational approach, Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 1 (2005), 129–132.

[7] T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009), 455–500.

[8] W.Y. Ding, L.Q. Qi, Y.M. Wei, Fast Hankel tensor–vector product and its application to exponential data fitting, Numerical Linear Algebra with Applications, 22 (2015), 814–832.

[9] Y. Song, L.Q. Qi, Infinite and finite dimensional Hilbert tensors, Linear Algebra and its Applications, 451 (2014), 1–14.

[10] C.Z. Luo, Introduction to Fuzzy Sets (Vol. 1), Beijing Normal University Press, Beijing (in Chinese), 1989.

[11] Z.T. Fan, D.F. Liu, On the power sequence of a fuzzy matrix — convergent power sequences, Journal of Computational and Applied Mathematics, 4 (1997), 147–165.

[12] L.A. Zadeh, Fuzzy sets, Information and Control, 8 (1965), 338–353.

[13] S.G. Guu, Y.Y. Lur, C.T. Pang, On infinite products of fuzzy matrices, SIAM Journal on Matrix Analysis and Applications, 22 (2001), 1190–1203.

[14] B. De Schutter, B. De Moor, On the sequence of consecutive powers of a matrix in a Boolean algebra, SIAM Journal on Matrix Analysis and Applications, 21 (1999), 328–354.

[15] K.H. Kim, Boolean Matrix Theory and Applications, Marcel Dekker, New York, 1982.

58 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-58

Toward a Fuzzy Minimum Cost Flow Problem for Damageable Items Transportation

Si-Chao LU1 and Xi-Fu WANG

School of Traffic and Transportation, Beijing Jiaotong University, Beijing, China

This paper presents a fuzzy minimum cost flow problem for damageable items transportation. In the imprecise model, the capacity, cost and percentage of unit damage of each route are considered as triangular fuzzy numbers. The problem is solved using the k-preference integration method, the area compensation method, and the signed distance method. Finally, to show the validity of the proposed model, a numerical example is provided and solved with Wolfram Mathematica 9.

signed distance

Introduction

As a classic combinatorial problem, the minimum cost flow problem has a wide range

of applications and ramifications. In the logistics industry, it is common for decision

makers to generate a plan to optimally transport damageable items from multiple

sources to multiple destinations through transshipment stations. Furthermore,

impreciseness in defining parameters such as the cost per unit on a route is another commonly encountered problem in realistic environments. Therefore, this paper is devoted to solving this problem.

With respect to the fuzzy minimum cost flow problem [1], there are many fruitful results. In the fuzzy minimum cost flow problem proposed by Shih and Lee [2], the cost parameter and capacity constraints are taken as fuzzy numbers. In addition, they proposed a fuzzy multiple objective minimum cost flow problem and used minimization of the total passing time as the second objective in an example. Ding proposed an α-minimum cost flow problem to deal with uncertain capacities [3].

However, few studies refer to the adaptation of this problem to damageable items transportation. A closely related problem is the multi-objective, multi-item intuitionistic fuzzy solid transportation problem for damageable items proposed by Chakraborty et al. [4]. To defuzzify the imprecise parameters, we use the k-preference

integration method, the area compensation method, and the signed distance method

respectively. Computations to solve the problem are done using Wolfram Mathematica 9.

1 Corresponding Author. Si-Chao LU, School of Traffic and Transportation, Beijing Jiaotong University, Beijing, China; E-mail: lusichao@163.com.

S.-C. Lu and X.-F. Wang / Toward a Fuzzy Minimum Cost Flow Problem 59

The remainder of the paper is organized as follows. The next section offers a brief

introduction to fuzzy numbers and three defuzzification methods. The mathematical

model of fuzzy minimum cost flow problem for damageable items is proposed in

Section 2. A simulated problem instance is given and solved in Section 3. Finally, the

paper is concluded in Section 4.

1. Fuzzy Preliminaries

A fuzzy set Ã in a universe X is defined as

Ã = {(x, μÃ (x)) | x ∈ X} (1)

where μÃ : X → [0, 1] is a mapping called the membership function of x ∈ X in Ã [5]. A triangular fuzzy number (TFN) Ã = (a1 , a2 , a3 ) is shown in Figure 1. The membership function of Ã is determined in Eq. (2) [5].

μÃ (x) = { 0,                       x ≤ a1
        { (x − a1 )/(a2 − a1 ),   a1 ≤ x ≤ a2      (2)
        { (a3 − x)/(a3 − a2 ),    a2 ≤ x ≤ a3
        { 0,                       a3 ≤ x

Figure 1. A triangular fuzzy number.

The k-preference integration method was introduced by Chen and Hsieh [6]. Based on this method, the k-preference integration representation of a general TFN Ã = (a1 , a2 , a3 ) is defined as:

Pk (Ã) = ∫01 h[kL−1 (h) + (1 − k)R−1 (h)] dh ⁄ ∫01 h dh = [ka1 + 2a2 + (1 − k)a3 ]⁄3 (3)

From Eq. (3), it can be seen that the k-preference integration is fairly flexible compared with other defuzzification methods, because the value of k is determined by the decision maker. It has been used to handle the fuzzy cold storage problem [7] and the constrained knapsack problem in a fuzzy environment [8].

If k = 0.5, the result generated by the k-preference integration method is the same as that obtained by the graded mean integration (GMI) method introduced by Chen and Hsieh [9].

Based on the area compensation method [10], the TFN Ã = (a1 , a2 , a3 ) can be defuzzified as:


ΦA (Ã) = [∫a1a2 xμÃ (x) dx + ∫a2a3 xμÃ (x) dx] ⁄ [∫a1a2 μÃ (x) dx + ∫a2a3 μÃ (x) dx] = [(a3 − a1 )(a1 + a2 + a3 )⁄6] ⁄ [(a3 − a1 )⁄2] = (a1 + a2 + a3 )⁄3 (4)

1

The left and right α-cuts of the TFN Ã = (a1 , a2 , a3 ) are L−1 (α) = a1 + (a2 − a1 )α and R−1 (α) = a3 − (a3 − a2 )α [11]. Based on the ranking system for fuzzy numbers proposed by Yao and Wu [12], the signed distance of Ã is defined as follows:

d(Ã, 0̃) = (1⁄2) ∫01 [L−1 (α) + R−1 (α)] dα = (1⁄2) ∫01 [a1 + (a2 − a1 )α + a3 − (a3 − a2 )α] dα = (a1 + 2a2 + a3 )⁄4 (5)

Shekarian et al. [11] combined this method with existing economic production

quantity models to find optimal production quantities.
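The closed forms in Eqs. (3)–(5) are one-liners; an illustrative Python sketch (function names are ours):

```python
def k_preference(a1, a2, a3, k):
    """Eq. (3): k-preference integration of the TFN (a1, a2, a3)."""
    return (k * a1 + 2 * a2 + (1 - k) * a3) / 3

def area_compensation(a1, a2, a3):
    """Eq. (4): area compensation defuzzification."""
    return (a1 + a2 + a3) / 3

def signed_distance(a1, a2, a3):
    """Eq. (5): signed distance of the TFN from zero."""
    return (a1 + 2 * a2 + a3) / 4
```

With k = 0.5, `k_preference` reproduces the GMI value (a1 + 4a2 + a3)/6.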

2. Mathematical Formulation

The fuzzy minimum cost flow problem for damageable items transportation blends the

fuzzy set theory and the minimum cost flow problem. The objective of the proposed

problem is to minimize the total cost of sending the available supply through

transshipment nodes to satisfy the demand. It is also necessary to introduce constraints

that guarantee the feasibility of flows.

Let G = (N, A) be a directed network with node set N = {1, 2, 3, …, n} and arc set A. Each arc aij ∈ A stands for a route and has a positive upper-bound capacity ũij and a positive cost c̃ij . Both ũij = (ulij , uij , urij ) and c̃ij = (clij , cij , crij ) are taken as triangular fuzzy numbers, because some vehicles may provide a small degree of leeway in capacity [5] and the transportation cost of each route tends to vary. Each node i ∈ N has a bi , which represents the nature of node i: if node i is a supply node then bi > 0, if node i is a demand node then bi < 0, and if node i is a transshipment node then bi = 0. We use a TFN α̃ij = (αlij , αij , αrij ) to denote the percentage of unit damaged products for the route aij due to physical vibration caused by bad road conditions or improper driving behavior, etc. xij is the decision variable which denotes the flow quantity through route aij .

Based on the above descriptions, the mathematical formulation can be developed

as follows.

min Z = ∑ni=1 ∑nj=1 c̃ij xij (6)

s.t.

∑nj=1 xij − ∑nj=1 (1 − α̃ji )xji = bi , ∀i ∈ {i | bi ≥ 0} (7)

∑nj=1 xij − ∑nj=1 (1 − α̃ji )xji ≤ bi , ∀i ∈ {i | bi < 0} (8)

0 ≤ xij ≤ ũij , ∀i, ∀j (9)

∑ni=1 bi − ∑ni=1 ∑nj=1 α̃ij xij ≥ 0 (10)

Here (6) indicates the cost minimization objective function. Constraints (7) and (8) represent the net flow of node i under two different situations, respectively; in addition, constraint (8) implies that demand nodes can be satisfied with excessive items. Constraint (9) ensures that the total amount of transported damageable items is less than or equal to the capacity of route aij . Constraint (10) guarantees that the


total amount of items provided by the supply nodes is no less than the amount of

damaged items plus the total amount of items required by the demand nodes.
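For concreteness, once the TFNs have been defuzzified to crisp scalars, constraints (7)–(10) can be checked for a candidate flow; a hypothetical sketch (the data layout — dicts keyed by nodes and arcs — is our choice for illustration):

```python
def feasible(b, x, alpha, u, tol=1e-9):
    """Check the crisp constraints (7)-(10).
    b: node -> supply (>0), demand (<0) or transshipment (0)
    x, alpha, u: arc (i, j) -> flow, unit damage rate, capacity."""
    for i, bi in b.items():
        out_flow = sum(f for (s, t), f in x.items() if s == i)
        in_flow = sum((1 - alpha[a]) * f for a, f in x.items() if a[1] == i)
        net = out_flow - in_flow
        if bi >= 0 and abs(net - bi) > tol:        # constraint (7)
            return False
        if bi < 0 and net > bi + tol:              # constraint (8)
            return False
    if any(f < -tol or f > u[a] + tol for a, f in x.items()):  # constraint (9)
        return False
    damaged = sum(alpha[a] * f for a, f in x.items())
    return sum(b.values()) - damaged >= -tol       # constraint (10)
```

For example, shipping 10 units on a single arc with damage rate 0.1 delivers 9 units, which can exactly satisfy a demand of 9.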

Based on the k-preference integration method, Eq. (6)-Eq. (10) can be redefined as

follows, where kc , kα , ku can be determined differently under the decision maker’s

preference.

min Z = ∑ni=1 ∑nj=1 [kc clij + 2cij + (1 − kc )crij ]xij ⁄3 (11)

s.t.

∑nj=1 xij − ∑nj=1 {1 − [kα αlji + 2αji + (1 − kα )αrji ]⁄3}xji = bi , ∀i ∈ {i | bi ≥ 0} (12)

∑nj=1 xij − ∑nj=1 {1 − [kα αlji + 2αji + (1 − kα )αrji ]⁄3}xji ≤ bi , ∀i ∈ {i | bi < 0} (13)

0 ≤ xij ≤ [ku ulij + 2uij + (1 − ku )urij ]⁄3, ∀i, ∀j (14)

∑ni=1 bi − ∑ni=1 ∑nj=1 [kα αlij + 2αij + (1 − kα )αrij ]xij ⁄3 ≥ 0 (15)

Applying the area compensation method, Eq. (6)–Eq. (10) can be written in the following form:

min Z = ∑ni=1 ∑nj=1 (clij + cij + crij )xij ⁄3 (16)

s.t.

∑nj=1 xij − ∑nj=1 [1 − (αlji + αji + αrji )⁄3]xji = bi , ∀i ∈ {i | bi ≥ 0} (17)

∑nj=1 xij − ∑nj=1 [1 − (αlji + αji + αrji )⁄3]xji ≤ bi , ∀i ∈ {i | bi < 0} (18)

0 ≤ xij ≤ (ulij + uij + urij )⁄3, ∀i, ∀j (19)

∑ni=1 bi − ∑ni=1 ∑nj=1 (αlij + αij + αrij )xij ⁄3 ≥ 0 (20)

Similarly, with the help of the signed distance method, Eq. (6)–Eq. (10) can be expressed as:

min Z = ∑ni=1 ∑nj=1 (clij + 2cij + crij )xij ⁄4 (21)

s.t.

∑nj=1 xij − ∑nj=1 [1 − (αlji + 2αji + αrji )⁄4]xji = bi , ∀i ∈ {i | bi ≥ 0} (22)

∑nj=1 xij − ∑nj=1 [1 − (αlji + 2αji + αrji )⁄4]xji ≤ bi , ∀i ∈ {i | bi < 0} (23)

0 ≤ xij ≤ (ulij + 2uij + urij )⁄4, ∀i, ∀j (24)

∑ni=1 bi − ∑ni=1 ∑nj=1 (αlij + 2αij + αrij )xij ⁄4 ≥ 0 (25)
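Each of the three reformulations is simply the crisp model (6)–(10) with the TFNs replaced by their defuzzified values; that substitution step can be sketched as follows (an illustrative helper, with names and data layout of our choosing):

```python
DEFUZZ = {
    "k_pref": lambda t, k=0.5: (k * t[0] + 2 * t[1] + (1 - k) * t[2]) / 3,  # Eqs. (11)-(15)
    "area":   lambda t: (t[0] + t[1] + t[2]) / 3,                           # Eqs. (16)-(20)
    "signed": lambda t: (t[0] + 2 * t[1] + t[2]) / 4,                       # Eqs. (21)-(25)
}

def crisp_arc_data(fuzzy_arcs, method, **kw):
    """fuzzy_arcs maps arc -> dict of TFNs (e.g. 'c', 'u', 'alpha');
    returns the crisp coefficients used by the chosen reformulation."""
    f = DEFUZZ[method]
    return {
        arc: {name: f(tfn, **kw) for name, tfn in params.items()}
        for arc, params in fuzzy_arcs.items()
    }
```

The resulting crisp linear program can then be handed to any LP solver, which is what the Mathematica computations in Section 3 do.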

3. Numerical Experiment

The case in this section is adapted from an example in [1], which copes with the crisp

model of the minimum cost flow problem. Assume 60 units and 40 units of damageable

items are supplied by node A and node B, whereas no less than 30 units and 60 units of

damageable items are required by node D and node E respectively. Node C is a

transshipment node. Capacities and costs of the routes cannot be determined precisely

in advance. If the route aij has no specified capacity, then uij can be regarded as a large

number and hence be ignored in the mathematical model. Critical parameters of this

problem instance are shown in Figure 2.

Given that this problem is small-scale and hence can be solved by exact algorithms, we use Wolfram Mathematica 9 to generate optimal solutions. The imprecise parameters are defuzzified using three methods: the k-preference integration method, the area compensation method, and the signed distance method. To simplify the problem, we let k = kc = kα = ku . Mathematical formulations and results obtained using the


GMI method and the area compensation method are shown in Figure 3 and Figure 4.

Computational results are shown in Table 1.

Figure 2. Network representation of a fuzzy minimum cost flow problem for damageable items

transportation.

Figure 3. Mathematical formulation and results by using the GMI method with Mathematica.

Figure 4. Mathematical formulation and results by using the area compensation method with Mathematica.

From Table 1, it can be clearly seen that the GMI method, the area compensation method, and the signed distance method generate similar results. Furthermore, the total cost decreases as k increases, which supports the correctness of the defuzzification by the k-preference integration method.


Table 1. Solutions obtained using the k-preference integration method, the signed distance method, and the area compensation method

Variable  k=0     k=0.2   k=0.5 (GMI)  k=0.8   k=1     Signed distance  Area compensation
xAB       0.00    0.00    0.00         0.00    0.00    0.00             0.00
xAC       40.50   40.11   39.52        38.93   38.53   39.28            39.03
xAD       19.50   19.89   20.48        21.07   21.47   20.72            20.97
xBC       40.00   40.00   40.00        40.00   40.00   40.00            40.00
xCE       80.50   80.11   79.52        78.93   78.53   79.28            79.03
xDE       0.00    0.00    0.00         0.00    0.00    0.00             0.00
xED       11.22   10.92   10.48        10.04   9.75    10.21            9.94
Z         573.17  569.32  563.29       557.28  553.20  564.35           563.96

4. Conclusion

In this paper, we have presented a minimum cost flow problem for damageable items transportation in an imprecise environment. After defuzzifying the fuzzy parameters with the k-preference integration method, the area compensation method, and the signed distance method, the optimal flow can be obtained with Wolfram Mathematica.

There are three major avenues for future work. First, more defuzzification methods, such as the credibility measure method [8] or using the violation tolerance level [13], could be applied and the results compared in a further step. Second, more objective functions could be added and more item properties could be considered. Finally, given that Das et al. successfully solved a multi-objective solid transportation problem with type-2 fuzzy variables [14], some parameters in this model could also be taken as type-2 fuzzy numbers to better describe the problem and be defuzzified to generate optimal solutions.

References

[1] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research (Ninth Edition), McGraw-Hill,

New York, 2010.

[2] H. S. Shih and E. S. Lee, Fuzzy multi-level minimum cost flow problems, Fuzzy Sets & Systems,

107(1999), 159-176.

[3] S. Ding, Uncertain minimum cost flow problem, Soft Computing, 18 (2014), 2201-2207.

[4] D. Chakraborty, D. K. Jana, T. K. Roy, Expected value of intuitionistic fuzzy number and its application

to solve multi-objective multi-item solid transportation problem for damageable items in intuitionistic

fuzzy environment, Journal of Intelligent & Fuzzy Systems, 30 (2016), 1109-1122.

[5] H. J. Zimmermann, Fuzzy Set Theory and Its Applications, Fourth Edition, Kluwer Academic Publishers, Norwell, 2001.

[6] S. H. Chen and C. H. Hsieh, A new method of representing generalized fuzzy number, Tamsui Oxford

Journal of Management Sciences, 13-14 (1998), 133-143.

[7] S. Lu and X. Wang, Modeling the Fuzzy Cold Storage Problem and Its Solution by a Discrete Firefly

Algorithm, Journal of Intelligent and Fuzzy Systems, 31(2016), 2431-2440.

[8] C. Changdar, G. S. Mahapatra, and R.K. Pal, An improved genetic algorithm based approach to solve

constrained knapsack problem in fuzzy environment, Expert Systems with Applications 42 (2015),

2276-2286.


[9] S. H. Chen and C. C. Wang, Representation, ranking, distance, and similarity of fuzzy numbers with step

form membership function using k-preference integration method, Joint 9th. IFSA World Congress and

20th NAFIPS International Conference, 2 (2001). IEEE, 801-806.

[10] S. K. De and I. Beg, Triangular dense fuzzy sets and new defuzzification methods, Journal of

Intelligent and Fuzzy Systems, 31(1) (2016), 469-477.

[11] E. Shekarian, C. H. Glock, S.M.P. Amiri, K. Schwindl, Optimal manufacturing lot size for a single-

stage production system with rework in a fuzzy environment, Journal of Intelligent and Fuzzy Systems

27 (2014), 3067-3080.

[12] J. S. Yao, K. Wu, Ranking fuzzy numbers based on decomposition principle and signed distance, Fuzzy

Sets and Systems, 116 (2000), 275-288.

[13] J. Brito, F. J. Martinez, J. A. Moreno, J. L. Verdegay, Fuzzy optimization for distribution of frozen

food with imprecise times, Fuzzy Optimization and Decision Making, 11 (2012), 337-349.

[14] A. Das, U. K. Bera, M. Maiti. Defuzzification of trapezoidal type-2 fuzzy variables and its application

to solid transportation problem, Journal of Intelligent and Fuzzy Systems, 30 (2016), 2431-2445.

Fuzzy Systems and Data Mining II 65

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-65

Research on the Application of Data Mining in the Field of Electronic Commerce

Xia SONG 1 and Fang HUANG

Shandong Agricultural Engineering Institute, Jinan, Shandong, China

Data mining techniques are widely used in E-Commerce for digging out patterns and retrieving information from large-scale noisy datasets. The booming of E-Commerce enables businesses to collect large amounts of data which can be analyzed to enhance revenues; the abundant data collected online is the foundation of big data analysis. How to employ data mining models in strategizing and making business decisions has been an important topic in recent years. This paper discusses data mining and its application in E-Commerce. An E-Commerce system developed on the basis of data mining technology strengthens the ability to analyze business information: it uncovers the intrinsic relationships within the data, extracts useful information, and provides managers with the information they expect, ensuring the effective operation of E-Commerce. Data mining techniques can be used for automated data analysis, pattern identification, information retrieval and business strategizing, as well as for providing personalized services.

Introduction

Electronic commerce is a new mode of commerce that refers to the use of digital electronic technology to carry out business activities, with the Internet as its main body and information technology as its core. Electronic commerce brings new opportunities and challenges to businesses and individuals, promotes the networking of the traditional business model, changes the business activities of enterprises and personal consumption patterns, and makes business activities digital and intelligent.

The development of electronic commerce has led enterprises to collect a great deal of data internally, and there is an urgent need to turn these data into useful information and knowledge so as to create more potential profit for enterprises. The massive data obtained from the Internet give data mining a rich data base. Data mining technology can effectively help an enterprise analyze data in a highly automated way, make inductive inferences, discover hidden regularities and extract effective information, guiding the enterprise to adjust its marketing strategies and make the right business decisions; at the same time, it can provide

1 Corresponding Author: Xia SONG, Shandong Agricultural Engineering Institute, Jinan, Shandong, China; E-mail: 643549139@qq.com

66 X. Song and F. Huang / Research on the Application of DM in the Field of Electronic Commerce

dynamic, personalized and efficient services for customers and improve the core competitiveness of enterprises.

E-Commerce has changed the traditional business mode and individuals' consumption patterns as more trades and deals are carried out online. The booming of E-Commerce enables businesses to collect large amounts of data which can be analyzed to enhance revenues. The abundant data collected online is the foundation of big data analysis. Data mining techniques can be used for automated data analysis, pattern identification, information retrieval and business strategizing, as well as for providing personalized services. An E-Commerce business can develop an online business system which uses data mining techniques to analyze online business data, identifying correlations within the data and making predictions about the market.

1.1. E-commerce

E-Commerce refers to business activities in which business data are exchanged by digital electronic means [1]. E-Commerce attracts users by its low cost, convenience, high reliability and freedom from time and space constraints. There are many E-Commerce activities in China nowadays, including online advertising, electronic business note exchange, online shopping and online payment, as well as the B2B, B2C and C2C business modes.

With the rapid development of network and database technology, electronic commerce shows ever stronger vitality, and the amount of online transactions rises year by year; however, this development also brings many new problems to traditional enterprises. As enterprises move into electronic commerce, a large number of e-commerce platforms and shopping websites have emerged, carrying all kinds of business information, and this "big data" holds huge commercial value. Faced with a huge amount of structurally diverse information of different types, how should an enterprise organize and utilize it to obtain the information that is valuable to it or related to its own demands? The application of data mining technology in electronic business has become an inevitable choice. Data mining technology extracts potentially unknown and useful knowledge from noisy, disordered data, and gives logical inference and visual interpretation, helping business decision-makers grasp market dynamics in a timely manner and make reasonable decisions in real time.

An electronic commerce system developed on the basis of data mining technology strengthens the ability to analyze business information: it uncovers the intrinsic relationships within the data, extracts useful information, and provides managers with the expected information to ensure the effective operation of electronic commerce. Many large e-commerce enterprises (such as Taobao and Jingdong Mall) provide a variety of data mining tools for managers to use in order to increase sales, and these tools are also very helpful for customer relationship management.


Data mining (DM), also known as knowledge discovery in databases (KDD), is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large, incomplete, noisy, fuzzy and random data [2]. Data mining is a cross-discipline that gathers knowledge from multiple fields, including database technology, artificial intelligence, machine learning, data visualization, pattern recognition and parallel computing.

As a new business information processing technology, data mining extracts, transforms, analyzes and models the large volume of business data in enterprise databases according to the enterprise's established business objectives, extracting the key data that support business decisions; it is an advanced and effective method for revealing hidden, unknown regularities or validating known ones.

In electronic commerce data mining, Web mining uses data mining technology to automatically discover and extract interesting and useful patterns and information from WWW resources (Web documents) and behavior (Web services) [3]. Web data are of three types: HTML-marked Web data, link-structure data of Web documents, and user access data. According to the corresponding data type, Web mining can be divided into three categories: Web content mining, the process of selecting knowledge from Web documents or their descriptions; Web structure mining, which derives knowledge from the organizational structure and link structure of the Web and whose purpose is, through clustering and analysis of Web links and page structures, to find useful patterns and authoritative web pages; and Web usage mining, the process of mining the access logs stored on Web servers to discover the access patterns and potential customers of Web users, among other information.

Correlation analysis digs out hidden correlations within a dataset. For example, it can analyze the correlation between different items in a single online purchase: if a customer buys item A, the model can predict the probability that the customer also buys item B based on the correlation of A and B. The Apriori algorithm is the most commonly used method for correlation analysis [4].
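The Apriori idea can be sketched in a few lines of Python. This is a minimal illustration (the basket data and support threshold are invented), not a production implementation: it counts candidate itemsets level by level and keeps only those meeting the minimum support.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori: return frequent itemsets (frozensets) with their support."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    current = [frozenset([i]) for i in sorted(items)]  # 1-itemset candidates
    frequent = {}
    while current:
        counts = {c: sum(1 for t in transactions if c <= set(t)) for c in current}
        survivors = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(survivors)
        # Join surviving k-itemsets into (k+1)-itemset candidates.
        keys = list(survivors)
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == len(a) + 1})
    return frequent

baskets = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "C"}]
freq = apriori(baskets, min_support=0.5)
# Each single item appears in 3/4 baskets; each pair in 2/4; {A,B,C} only in 1/4.
```

The pruning step is implicit here: a (k+1)-itemset can only be generated from itemsets that were themselves frequent, which is exactly the Apriori property.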

Cluster analysis is a technique that groups objects into different clusters. It can be used to cluster customers with similar interests or items with common characteristics. The most widely used clustering algorithms include hierarchical clustering, centroid-based clustering, distribution-based clustering and density-based clustering [5].

Cluster analysis is commonly used in e-commerce for sub-dividing client groups. The algorithm clusters clients into different subgroups by analyzing the similarities of their consumption patterns. The business owner can then devise different strategies and provide personalized services for each target group.
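As a sketch of the centroid-based family mentioned above, the following tiny 1-D k-means segments customers by a single numeric attribute (the monthly-spend figures are invented for illustration):

```python
def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means, e.g. for segmenting customers by monthly spend."""
    # Spread initial centroids across the sorted values.
    centroids = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for v in values:
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

spend = [12, 15, 14, 300, 310, 295, 55, 60]
centroids, clusters = kmeans_1d(spend, k=3)
# Converges to three natural segments: low, medium and high spenders.
```

Real segmentation would use several attributes and a library implementation; the point here is only the assign-then-recompute loop that all centroid-based methods share.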


Data categorization classifies items into predefined classes according to their properties [6]. It solves for the optimal categorizing rules based on training data and uses those rules to categorize data outside the training set. The most popular categorization algorithms include genetic algorithms, Bayesian classification and neural networks.

The goal of data categorization is to classify an item into a specific class. It can be used both for analyzing existing data and for making predictions: the algorithm builds a classification model from existing training data and uses the model to predict the likely reactions of customers with different characteristics, so that the business can provide personalized service for each category of customer.
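The Bayesian classification mentioned above can be sketched with a tiny categorical naive Bayes model. The customer records (age band, device, bought-or-not) are invented, and the Laplace smoothing is deliberately crude; this shows the train-then-predict flow, not a recommended implementation:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Fit categorical naive Bayes: class counts and per-class feature-value counts."""
    prior = Counter(labels)
    cond = defaultdict(Counter)  # key: (feature index, class) -> value counts
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            cond[(j, y)][v] += 1
    return prior, cond

def predict_nb(prior, cond, row):
    total = sum(prior.values())
    best, best_p = None, -1.0
    for y, cnt in prior.items():
        p = cnt / total
        for j, v in enumerate(row):
            c = cond[(j, y)]
            p *= (c[v] + 1) / (sum(c.values()) + len(c) + 1)  # crude Laplace smoothing
        if p > best_p:
            best, best_p = y, p
    return best

# Hypothetical records: (age band, device) -> did the customer buy?
X = [("young", "mobile"), ("young", "desktop"), ("senior", "desktop"), ("senior", "desktop")]
y = ["yes", "yes", "no", "no"]
prior, cond = train_nb(X, y)
print(predict_nb(prior, cond, ("young", "mobile")))  # -> yes
```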

Serial pattern analysis focuses on the order in which customers purchase different items: it analyzes time-series data and makes predictions based on time-series models. For example, it may discover that within a certain time period the purchase pattern of buying A, then B, then C occurs with high frequency [7]. It can dig out such high-frequency bundles by analyzing the purchasing data.

Serial pattern analysis enables the business to predict the inquiry patterns of its customers and then push advertisements and services that may meet customer demand accordingly.
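The core measurement behind serial pattern analysis, counting how often an ordered pattern such as A-then-B-then-C occurs across purchase histories, can be sketched as follows (the histories are invented):

```python
def contains_in_order(history, pattern):
    """True if `pattern` occurs in `history` as an ordered (not necessarily
    contiguous) subsequence."""
    it = iter(history)
    return all(needle in it for needle in pattern)  # membership consumes the iterator

def pattern_frequency(histories, pattern):
    hits = sum(contains_in_order(h, pattern) for h in histories)
    return hits / len(histories)

histories = [
    ["A", "X", "B", "C"],
    ["A", "B", "Y", "C"],
    ["B", "A", "C"],   # A does not precede B here, so the pattern fails
]
print(pattern_frequency(histories, ["A", "B", "C"]))  # -> 2/3
```

Full sequential-pattern miners (e.g. the GSP family) generalize this counting with candidate generation over time windows, but the subsequence test is the same.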

Data mining is a powerful tool that provides informed guidance in the decision-making process of e-commerce. It seeks patterns in the sea of unorganized internet traffic and discovers valuable information to support decision making and strategy development.

Data mining is widely used in product positioning and purchasing-behavior analysis to formulate marketing strategy, and it can also be applied to forecast the sales market by analyzing purchasing patterns. Currently, the major data companies have all started to embed data mining functions into their own products; giants such as IBM and Microsoft, for example, incorporate online analysis functions into their corresponding products. By mining customer information, including customers' visit behavior, visit content and visit frequency, an e-commerce recommendation system based on data mining can analyze customer features and infer their visiting patterns in order to offer tailor-made services and product recommendations catering to customer needs.

Data mining can discover the correlations among products by analyzing the portfolios in shopping carts and relating them to customers' purchasing behaviors, thereby generating marketing strategies for commodity display, bundled sales and promotion. The major task of correlation analysis is to dig out the hidden correlations within the dataset. One example is that the purchase of bread and butter implies the purchase of milk: over 90% of customers who buy bread and butter purchase milk as well. The business owner can design better item bundles by analyzing the correlations among different goods.
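The two numbers behind a rule like "bread and butter implies milk" are its support and confidence. A minimal sketch (the baskets are invented, so the confidence here is 2/3 rather than the 90% figure the text cites):

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence for the rule antecedent -> consequent."""
    ante = set(antecedent)
    both = ante | set(consequent)
    n_ante = sum(1 for t in transactions if ante <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]
support, confidence = rule_metrics(baskets, {"bread", "butter"}, {"milk"})
# support = 0.5 (milk+bread+butter in 2 of 4 baskets), confidence = 2/3
```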


A sales manager at Wal-Mart found the surprising fact that beer and nappies, two apparently unrelated products, are often purchased together [8]. The phenomenon is frequently observed among young fathers who pick up beer when they are asked to buy nappies at the supermarket. This motivated the store to move the beer aisle closer to the nappies, instantly increasing sales of both.

In e-commerce, by discovering all similar association rules through data mining, online vendors can recommend commodities to customers based on the products already in their shopping carts, thus enhancing cross-selling. Furthermore, by offering personalized commodity information and advertisements, customer interest and loyalty are also expected to increase.

4. Application Cases

Data mining techniques are used in e-commerce to analyze online inquiries, online trades and registration information. The process usually takes steps such as defining the business scope, data collection, data preprocessing, model construction and evaluation, and output analysis and evaluation [9]. These steps are usually repeated and iterated to obtain more accurate results.

Data mining plays an increasingly important role in e-commerce, and there are successful cases applying data mining theory and technology to it [10]. This section discusses the application of data mining to customer segmentation on Taobao.com, a platform on which purchase behavior and sales behavior coexist. Experts suggest using the following 15 key factors and weights for classifying customers and predicting their behaviors, as shown in Table 1.

Table 1. Influence factors and weights for classifying customers

Influence Factors                                                             Weight
Purchase Behavior (69%)
  Voluntary phone inquiry or onsite help                                        11.2
  Shows interest in the product and inquires about promotions                   10.3
  Budget for web promotion                                                       8.5
  Has hired or is in the process of hiring trade specialists                     8.1
  Used to e-commerce                                                             7.5
  Responds to EFAX/EDM/phone promotions                                          6.5
  Participated in Alibaba conferences such as marketing, training
  and business development                                                       6.1
  Experience with third-party B2B web platforms                                  5.4
  Experience with overseas trade exhibitions or domestic export exhibitions      5.3
Sales Behavior (31%)
  Attempts to sign a sales contract in the coming month                          8.1
  In direct competition with competitors                                         7.5
  Presence of a director, manager or colleague in the sales process              5.4
  Made a proposal to clients                                                     4.4
  Client visit within one year                                                   2.9
  Open house within one year                                                     2.6

Summing the weighted factors gives a score S that divides customers into four tiers: (1) S >= 50, 90% potential client; (2) 23 <= S < 50, 50% potential client; (3) 11 <= S < 23, 25% potential client; (4) 0 <= S < 11, first-time visit client.

By segmenting current customers and studying their responses to existing

marketing and promotion strategies, companies could design more targeted strategies

on how to communicate to each segment of customers.
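The scoring scheme above amounts to a weighted checklist followed by threshold tests. A minimal sketch, using an illustrative subset of the Table 1 weights (the factor key names are invented; the weights and tier thresholds come from the text):

```python
# Illustrative subset of the factor weights from Table 1.
WEIGHTS = {
    "phone_inquiry": 11.2,        # voluntary phone inquiry or onsite help
    "promo_interest": 10.3,       # shows interest and inquires about promotions
    "web_promo_budget": 8.5,      # budget for web promotion
    "contract_next_month": 8.1,   # attempts to sign a contract in the coming month
    "made_proposal": 4.4,         # made a proposal to clients
}

def score(client_factors):
    """Sum the weights of the factors a client exhibits."""
    return sum(WEIGHTS[f] for f in client_factors)

def tier(s):
    """Map a score S to the four tiers defined in the text."""
    if s >= 50:
        return "90% potential client"
    if s >= 23:
        return "50% potential client"
    if s >= 11:
        return "25% potential client"
    return "first-time visit client"

s = score(["phone_inquiry", "promo_interest", "web_promo_budget"])  # 30.0
print(tier(s))  # -> 50% potential client
```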


5. Conclusion

E-commerce is developing rapidly and generating huge volumes of data to analyze. Data mining enables businesses to predict market trends and customer behavior; it also helps to provide personalized services and push personalized advertisements. Businesses can increase revenue by forming better strategies with the help of data mining analysis. Data mining in e-commerce will enjoy further development with progress in hardware technology and algorithm research and the accumulation of application experience.

References

[1] S. Z. Zhang, X. K. Qu, L. Zhang, Research on the Web data mining based on Electronic-Commerce, Modern Computer, 03 (2015), 12–17.
[2] H. M. Wu, Sales data mining technology and e-commerce application research, Guangdong University of Technology, 2014.
[3] Y. N. Zhang, Application of web data mining in e-commerce, Fujian Computer, 05 (2013), 138–140.
[4] J. X. Wu, Research on web data mining and its application in E-Commerce, Information System Engineering, 01 (2010), 15–18.
[5] X. J. Chen, Research on data mining in electronic commerce, Information and Computer, 05 (2014), 135.
[6] H. Y. Lu, Application of data mining techniques in e-commerce, Network and Information Engineering (2014), 73–75.
[7] L. Huang, Research on the application of Web data mining in e-commerce, Hunan University, 2014.
[8] Y. Gao, Beer and diapers, Tsinghua University Press, 2008.
[9] S. Liu, Application of Web data mining technology for e-commerce analysis, Electronic Technology and Software Engineering, 07 (2014), 216–217.
[10] China statistics web, Application of data mining in e-commerce, http://www.itongji.cn/datamining/hangye/dianzishangwuzhongshujuwajuefangfadeyingyong/, 2010.

Fuzzy Systems and Data Mining II
S.-L. Sun et al. (Eds.)
IOS Press, 2016
© 2016 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-722-1-71

A Fuzzy MEBN Ontology Language Based on OWL2

Zhi-Yun ZHENG, Zhuo-Yun LIU, Lun LI, Dun LI 1 and Zhen-Fei WANG

School of Information Engineering, Zhengzhou University,
Zhengzhou 450001, China

Abstract. With the rapid development of Semantic Web research, the demand for representing and reasoning with uncertain information increases. Although ontology is capable of modeling the semantics and knowledge in knowledge-based systems, classical ontology languages are not appropriate for dealing with uncertainty in knowledge, which is inherent in most real-world application domains. In this paper, we address this issue by extending the expressive power of current ontology languages: we propose a Fuzzy Multi-Entity Bayesian Networks ontology language that extends PR-OWL based on the combination of Fuzzy MEBN and ontology, define and study its syntax and semantics, and show the representation of domain knowledge by RDF graphs. The proposed language, Fuzzy PR-OWL, moves beyond the current limitation of PR-OWL in modeling knowledge with fuzzy semantics or fuzzy relations. By providing a principled means of uncertainty representation and reasoning, Fuzzy PR-OWL can serve many applications with fuzzy and probabilistic knowledge.

Introduction

In recent years, data collection, data storage, and high-performance computing have gained significant improvement. According to some recent surveys, the amount of data around the world doubles every 20 months. The mountainous amounts and various types of data complicate the data relations. To enable computers to automatically process and integrate valuable data from the Internet, the Semantic Web was put forward, aiming at seamless interoperability and information exchange among web applications, and rapid and accurate identification and invocation of appropriate web services [1].

Nevertheless, several immature aspects of this area need further improvement. Specifically, as semantic services become more ambitious, there is increasing demand for principled approaches to formal representation under uncertainty, such as incompleteness, randomness, vagueness, ambiguity and inconsistency [2]. All of these require reasonable semantic expression and enhanced semantic inference, but existing theory and practice do not yet solve these problems well.

1 Corresponding Author: Dun LI, School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China; E-mail: ielidun@zzu.edu.cn; iedli@zzu.edu.cn.

72 Z.-Y. Zheng et al. / A Fuzzy MEBN Ontology Language Based on OWL2

MEBN (Multi-Entity Bayesian Networks) expressively handles semantic analysis and effectively models uncertainty. Despite its practical usefulness in many respects, MEBN lacks the capability of modeling fuzzy knowledge and concepts. To address this problem, Fuzzy MEBN (Fuzzy Multi-Entity Bayesian Networks) [4-5] has been proposed in recent years; it is able to deal with ambiguous semantics and uncertain causal relationships between knowledge entities [5]. In this paper, we present an ontology-based Fuzzy MEBN solution termed Fuzzy PR-OWL (Fuzzy Probability Web Ontology Language), an extension of OWL2 [6]. This is an attempt to model both probabilistic and fuzzy information by ontology.

The remainder of this paper is structured as follows. Section 1 comparatively analyzes BN and MEBN and illustrates the advantages of MEBN as well as Fuzzy MEBN. The Fuzzy MEBN ontology (Fuzzy PR-OWL) is then illustrated in Section 2. Section 3 presents the representation of a domain ontology using Fuzzy PR-OWL in RDF graph form. Finally, we set out some conclusions along with future work in Section 4.

1. Related Research

The main models for uncertainty representation and reasoning on the Semantic Web are probabilistic and Dempster-Shafer models, and fuzzy and possibilistic models [1]. The representative probabilistic models are mainly BN and MEBN, and the ontology languages based on them are BayesOWL [7] and PR-OWL2 [8-9].

BN can deal with uncertain and probabilistic events and incomplete data sets according to causality or other types of relationships among events. However, standard BNs are limited in representing relational information. Figure 1a shows a BN that represents probabilistic knowledge about bronchitis: smoking may cause bronchitis, and colds, which may be incurred by factors like bad weather, can also lead to airway inflammation. The BN clearly shows the causation of the patient's illness, but it cannot represent relational information such as the effect on the patient of harmful gas produced by others' smoking. MEBN, by contrast, takes advantage of first-order logic, which lets it overcome these limitations of BN. Figure 1b, where ovals represent resident nodes, trapezoids represent input nodes, and pentagons represent context nodes, shows that person and other are entities of the class Person, and the context rule other=peopleAround(person), which may link to another MFrag, defines other as the people around person. MEBN can thus represent the relationship between entities and take the effect of others' smoking on the probability of the patient having bronchitis into account via the parent node getCold(other).

In reality, however, human experience and knowledge are characterized by a fuzziness that cannot be handled by MEBN. In the example above, the impact of a slight cold must differ from that of a bad cold. Though MEBN can represent the possibility of getting a cold, for instance getCold{true 1, false 0} where 1 and 0 are probabilities, it cannot represent the degree of the cold. Another situation concerns resident nodes whose values are states. For example, suppose the weather has two states {sunny, cloudy}. MEBN assigns probabilities to these states, say {sunny 0.5, cloudy 0.5}, but situations like partly cloudy cannot be handled by MEBN.


Fuzzy MEBN is built on concepts of First-Order Fuzzy Logic (FOFL) [10]. The contextual constraints of MEBN are therefore generalized so as to represent the ambiguity usually delivered with imperfect semantic information. Moreover, Fuzzy MEBN upgrades the regular BN of MEBN to Fuzzy Bayesian Networks (FBN). Hence the fuzzy or ambiguous information of Section 1.1, which MEBN cannot process, can be dealt with by Fuzzy MEBN. For example, a slight cold can be represented as {true_0.3 1, false_0.7 0}, where the subscripts denote truth values, and partly cloudy can be set as {sunny_0.6 0.5, cloudy_0.4 0.5}, where the subscripts denote membership degrees.
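The notation above pairs each state with both a probability and a membership degree. A minimal sketch of such a fuzzy-state assignment as a data structure, with a consistency check (the dictionary layout is our own illustration, not Fuzzy MEBN syntax):

```python
# A fuzzy-state assignment maps each state to (probability, membership degree),
# mirroring notation such as {sunny_0.6 0.5, cloudy_0.4 0.5}.
partly_cloudy = {
    "sunny":  {"prob": 0.5, "membership": 0.6},
    "cloudy": {"prob": 0.5, "membership": 0.4},
}

def check_fuzzy_state(assignment, tol=1e-9):
    """Probabilities must sum to 1; memberships must lie in [0, 1]."""
    total = sum(s["prob"] for s in assignment.values())
    ok_prob = abs(total - 1.0) < tol
    ok_memb = all(0.0 <= s["membership"] <= 1.0 for s in assignment.values())
    return ok_prob and ok_memb

print(check_fuzzy_state(partly_cloudy))  # -> True
```

Note that, unlike the probabilities, the membership degrees need not sum to one; they grade how well each state label applies.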

The major differences between Fuzzy MEBN and MEBN are that, in Fuzzy MEBN, phenomenal (non-logical) constant symbols and entity identifier symbols are followed by a real-valued membership-degree subscript in [0,1], such as Vehicle_0.85 and !V428_0.75, and truth-value symbols or logical findings are assigned a truth value either real-valued or from a predefined finite chain of truth values ⟨a_1, a_2, …, a_n⟩.

The building blocks of a MEBN Theory (MTheory) are MEBN Fragments (MFrags), which semantically and causally represent a specific notion of the knowledge. The basic model of Fuzzy MEBN is similar to that of regular MEBN. An FMFrag can define a probability distribution and some fuzzy rules for a resident node given its input/parent and context nodes.

A Fuzzy MFrag (FMFrag) [5] F = (C, I, R, G, D, S) consists of three kinds of nodes: context nodes C, input nodes I and resident nodes R. Context nodes use FOFL sentences to represent semantic structures of knowledge; input nodes connect to resident nodes in other FMFrags; and resident nodes are random variables conditional on the values of the context and input nodes. Besides, G is the FMFrag graph, D contains the local distributions, one for each resident node, and S is the set of fuzzy if-then rules used by the Fuzzy Inference System (FIS). It is worth noting that the sets C, R and I are pairwise disjoint, and that G is a directed acyclic graph whose nodes belong to I∪R and whose root nodes correspond to members of I only.


2. Fuzzy PR-OWL

2.1. Elements

Fuzzy PR-OWL was modeled in protégé-4.1 [11]. It extends PR-OWL with properties and classes such as fuzzy random variables, fuzzy states, membership degrees, and fuzzy rule sets (FRS) to increase its expressive power.

[Figure 2: Main elements of Fuzzy PR-OWL. The diagram groups the ontology into (1) main classes/elements: FMFrag, Node, FRS, FMTheory, Probability Distribution, FRandom Variable and Fuzzy state; (2) subclasses such as Domain/Finding/Generative FMFrag, Context/Input/Resident nodes, If-Then Rules with If-Part and Then-Part, State Assignment, Declarative Distribution, FExemplar/OVArgument/FConstant/FMapping arguments, FPR-OWL table, Simple and TrueValue FMExpressions, and TrueValue Random Variable; (3) Ordinary Variable, Fuzzy LogicalOperator and FExemplar; and (4) reified relationships: Membership, Probability Assignment, FArgument and conditioning.]

Table 1 presents the corresponding relationships between the elements of Fuzzy MEBN, FOFL [12] and Fuzzy PR-OWL. As shown in Table 1, the ontology proposed in this paper can represent sentences of Fuzzy MEBN based on FOFL.

Table 1. Corresponding relationships between FOFL, Fuzzy MEBN and FuzzyPR-OWL

FOFL                                      | Fuzzy MEBN                                  | Fuzzy PR-OWL
Symbols for the universal and existential | Quantifiers                                 | Class: Quantifier
quantifiers ∀, ∃                          |                                             |
Variables x, y, …                         | Ordinary variable symbols                   | Class: OrdinaryVariable
Constants c, d, …                         | Phenomenal constant symbols                 | Class: ConstantArgument
Symbols for truth values a                | Truth value symbols                         | Class: TrueValueRandomVariable
                                          | Entity identifier symbols                   | Data Property: hasUID (Range: Thing, Domain: string)
Binary connectives ∨*, ∧*, &*,            | Logical connectives                         | Class: FLogicalOperator
equality operator =                       |                                             |
                                          | Findings                                    | Class: FindingFMFrag, FindingResidentNode
n-ary predicate symbols p, q, …           | Domain-specific logical random variables    | Class: TrueValueRandomVariable
n-ary functional symbols f, g, …          | Domain-specific phenomenal random variables | Class: FRandomVariable

2.2. Syntax

An overview of the basic model of Fuzzy PR-OWL is given in Figure 3. In this diagram, ovals and arrows represent general classes and major relationships, respectively. The root is the class FMTheory, which contains a group of FMFrags; in the syntax of Fuzzy PR-OWL this link is expressed via the object property hasFMFrag.

Individuals of the class FMFrag are comprised of nodes, and each individual of the class Node is a random variable. Compared with PR-OWL, the major difference of this ontology is the use of FRS to define the membership degrees of fuzzy states; the object property hasFRS links one node to one or many FRS. Represented by the class ProbabilityDistribution, the unconditional or conditional probability distributions of the random variables are linked to their respective nodes via the object property hasProbDist. Finally, logical expressions based on FOFL, or simple expressions that describe random variables whose arguments may refer to entities, are represented by the class FMExpression and linked to nodes via the object property hasFMExpression.

[Figure 3: Basic model of Fuzzy PR-OWL. An FMTheory includes FMFrags (hasFMFrag); an FMFrag is built from Nodes (hasNode); a Node is described by an FMExpression (hasFMExpression), has a probability distribution (hasProbDist) and has fuzzy rules (hasFRS).]

The syntax of Fuzzy PR-OWL extends the abstract syntax of OWL. The syntax rules are defined in Extended Backus-Naur Form (EBNF), where the definition symbol is ::=, terminal symbols are enclosed in quotation marks, each rule ends with a semicolon as terminating character, alternatives are separated by the vertical bar |, optional expressions that may appear at most once are enclosed in square brackets [...], and expressions that may be omitted or repeated are enclosed in curly braces {...}. In this paper, expressions that occur one or more times are written with curly braces {...}+, and an FMTheory is identified by a URI reference. The fundamental structure of Fuzzy PR-OWL is as follows:

FMTheory ::= 'FMTheory(' [URI reference] [annotation] {FMFrag}+ ')';
FMFrag ::= 'FMFrag(' FMFrag_id ',' {Node}+ ',' {ParentRel} ')';
Node ::= 'Node(' Node_id ',' FMExpression [',' ProbabilityDistribution ',' {If-ThenRule}] ')';
ParentRel ::= 'hasParent(' Node ',' Node ')';
*_id ::= 'UID(' letter {letter | digit} ')';
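The grammar above can be transcribed almost mechanically into data structures. A sketch in Python dataclasses, with names following the EBNF (fields not shown in the grammar fragment are omitted; the instance data is an invented example):

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    node_id: str
    fm_expression: str                       # an FMExpression, simplified to a string here
    probability_distribution: Optional[str] = None
    if_then_rules: List[str] = field(default_factory=list)

@dataclass
class FMFrag:
    fmfrag_id: str
    nodes: List[Node]
    parent_rels: List[Tuple[str, str]] = field(default_factory=list)  # (child, parent)

@dataclass
class FMTheory:
    uri: Optional[str]
    fmfrags: List[FMFrag]

engine = Node("EngineStatus_m", "EngineStatus(m)")
frag = FMFrag("EngineStatus_FMFrag", [engine])
theory = FMTheory("file:/Engine.owl", [frag])
```

Each {X}+ repetition in the EBNF becomes a list, and each [X] option becomes an Optional field, which is the whole translation rule.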

In the literature [2], an FRS can be defined in the form of if-then rules. For example, the conditional probability of a variable with two parents can be written P(V = v_1 | P_1 = p, P_2 = q). In if-then form, this relation reads: 'If P_1 is p and P_2 is q, then V is v_1', where all the states are fuzzy states, each carrying a probability p_i and a membership degree μ_i for the i-th state. According to the FBN formula of MEBN in the literature [4], however, the FRS defines only membership degrees. Therefore, in this paper probabilities and membership degrees are defined separately, through conditional distributions and if-then rules, respectively. Next we present the models of conditional probability distributions, FRS and fuzzy expressions to show the syntactic structure of Fuzzy PR-OWL.


A node's probability distribution depends on the state configuration of its parents. PR-OWL2 uses strings to present probability distributions, but this approach needs a syntactic parser to analyze the declarative syntax. To embed more probability information in the ontology, this paper describes probability distributions by ontology.

In Fuzzy PR-OWL, the class ProbabilityAssignment indicates the assignment of probabilities conditioned on the parents' states, which are presented by the class ConditioningState. The class StateAssignment indicates the assignment of a state, such as its name and probability, as illustrated in Figure 4. The basic structure is defined below:

ProbabilityDistribution ::= FPR-OWLTable;
FPR-OWLTable ::= 'FPR-OWLTable(' PRTable_id ',' {ProbabilityAssignment}+ ')';
ProbabilityAssignment ::= 'ProbabilityAssignment(' ProbabilityAssignment_id ',' {StateAssignment}+ [',' {ConditioningState}+] ')';
StateAssignment ::= 'State(stateName(' string ')' [', stateProbability(' float ')'] ')';
ConditioningState ::= 'CondState(' Node_id ',' StateAssignment ')';
string ::= letter {letter | digit};

[Figures 4 and 5: A ResidentNode/InputNode has one FPR-OWL Table as its probability distribution; the table holds ProbabilityAssignments, each composed of StateAssignments and ConditioningStates. A resident node's FRS consists of If-Then Rules, each with one or more If-Parts and a Then-Part built from StateAssignments.]

• FRS

Fuzzy PR-OWL adopts If-Then Rules to define the FRS and constrain the membership degrees of fuzzy states. As shown in Figure 5, an If-Then Rule of a resident node may include one or more If-Parts and a Then-Part. Every instance of If-Part corresponds to an assumption about a parent node, and the instance of Then-Part corresponds to the assignment of fuzzy states in the resident node. The structure of an If-Then Rule is defined below:

FRS ::= If-ThenRule;
If-ThenRule ::= 'If-ThenRule(' If-ThenRule_id ',' {If-Part}+ ',' Then-Part ')';
If-Part ::= 'if(' Node_id ',' StateAssignment ')';
Then-Part ::= 'then(' {StateAssignment}+ ')';
StateAssignment ::= 'State(stateName(' string ')' [',' MembershipDegree] ')';
MembershipDegree ::= 'Membership(degree(' float ')' [', descript(' string ')'] ')';
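Evaluating such a rule amounts to matching the If-Parts against the current parent states and, on a match, emitting the Then-Part's membership assignments. A minimal sketch (the rule content is taken from the Figure 9 example later in the paper; the tuple encoding is our own):

```python
def apply_rules(rules, parent_states):
    """Return membership degrees for a resident node's states, given parent states.

    Each rule is a pair: ({parent_node: state, ...}, {resident_state: degree, ...}).
    """
    degrees = {}
    for if_part, then_part in rules:
        # All If-Part conditions must hold for the Then-Part to fire.
        if all(parent_states.get(node) == state for node, state in if_part.items()):
            degrees.update(then_part)
    return degrees

# The rule from Figure 9: if BeltStatus is "Normal OK",
# then degreeOf(EngineStatus) includes {Overheated: 0.5}.
rules = [({"BeltStatus": "Normal OK"}, {"Overheated": 0.5})]
print(apply_rules(rules, {"BeltStatus": "Normal OK"}))  # -> {'Overheated': 0.5}
```

A real Fuzzy Inference System would also combine partially matching rules; here a rule either fires completely or not at all, which is enough to show the If-Part/Then-Part structure.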

[Figure 6: Model of fuzzy expressions. An FMFrag owns Nodes and OrdinaryVariables; a Node has an FMExpression, which may carry Exemplars and FArguments and refers to a RandomVariable.]


• Fuzzy Expression

As shown in Figure 6, this part proposes the model of fuzzy expressions, which can represent constraints or fuzzy relationships between entities. An expression represents a relationship between entities in Fuzzy PR-OWL: the class FMExpression presents either the truth-value expression of a context node or the simple expression of other kinds of nodes. The former indicates logical expressions based on FOFL, and the latter can be seen as random variables of input or resident nodes with some arguments. The class Exemplar indicates the universal or existential quantifiers in a fuzzy expression, in Skolem form. The structure of fuzzy expressions is defined below:

structure of fuzzy expression is defined below:

FMExpression::= ‘FMExpression(’FMExpression_id‘,’[ exits|forAll Exemplar_id ’,’] Expression ‘)’;

Expression::= Term [“and”|”or” Term] [“=”Term] [”implies”|’iff’ Term];

Term::=[“not”] RandomVariable_id [’(’Argument_id{,Argument_id}+’)’] | FMExpression_id|

OrdinaryVariable_id;

RandomVariable::=‘RandomVariable(’ RandomVariable_id ‘, hasPossibleValues(’ {URI

reference}+’)’ [‘,defineUcertaintyOf(’URI’)’] [‘,probDistr(‘PrTable_id’)’]

[‘,trueValue(‘float’)’]’)’;

OrdinaryVariable::=‘OrdinaryVariable(’ OrdinaryVariable_id ‘, ( class(’ DomainClass_URI ’))’;

Argument ::= ‘Argument(’Argument_id‘,’[‘type(’Thing‘)’][‘,typeOfData(’ Literal’)’ ]

[‘,‘MembershipDegree ‘])’;

Exemplar ::= ‘Exemplar (‘Exemplar_id ‘,’[‘type(’ Thing ‘)’] [‘,typeOfData(’ Literal’)’ ] )’;

[‘,’MembershipDegree ])’;

2.3. Semantics

The semantics of Fuzzy PR-OWL is defined through the interpretations of FOFL, TF [11].

The structure D = ⟨D_I; P̄_D, …; f̄_D, …; ū, v̄, …⟩ is a 4-tuple with the following components:

• D_I is a nonempty set called the domain of the structure;
• {P̄_i} are the n-ary fuzzy relations adjoined to the n-ary predicate symbols {p_i};
• {f̄_i} are the n-ary functions adjoined to the n-ary functional symbols {f_i};
• ū, v̄, … ∈ D_I are the elements assigned to the constants u, v, … of the language LF.

Assume that LF contains one constant d associated with each element d̄ ∈ D_I (a name of d̄). Let u be a constant; then its interpretation is an element I(u) ∈ D_I. Let f̄^n be the function assigned to f^n and let t_1, t_2, …, t_n be terms without variables; then I(f^n(t_1, t_2, …, t_n)) = f̄^n(t_1, t_2, …, t_n).

Fuzzy functions, as defined in fuzzy set theory, can be regarded as special fuzzy relations. Note that functional symbols are introduced for the sake of completeness, since they can be replaced by special predicates [11]. Considering the corresponding relationships between the elements of LF and those of TF shown in Table 2, the definition of D can be further illustrated as follows.

• LF identifies entities by entity identifier symbols E, i.e., the elements assigned to the constants.
• The phenomenal random variables and the logical random variables of LF represent the fuzzy functions and predicates, respectively. The possible values of the former are E ∪ {⊥}, and those of the latter are either a real number in [0,1] or a member of the chain ⟨a_1, a_2, …, a_n⟩.

The random variables mentioned here can be represented as the expressions of Section 2.2. The probabilities and the membership degrees of the possible values of a function are assigned by the joint probability distribution and by the If-Then rules, respectively. LF uses a phenomenal random variable with n-ary arguments to represent a function. The function f̄ : Δ → Θ maps a vector of entity identifier symbols Δ = ⟨e_1, e_2, …, e_n⟩, such as input arguments, into a vector of identifier symbols Θ = ⟨s_1, s_2, …, s_m⟩, such as a fuzzy-state or fuzzy-value assignment, where the values for the various arrangements of arguments and possible values are predefined in the language by the fuzzy interpretation of f̄. This can also be represented as a fuzzy relation [12] that yields the truth value of the relation over the input set, that is, R : ⟨Δ, Θ⟩ → a ∈ {a_1, a_2, …, a_n}. By matching domain entity identifier symbols with domain entities, the function or relation maps an n-ary vector of domain entities into entities (for phenomenal random variables) or truth values of domain assertions (for logical random variables).

3. Use Case

In the Equipment Diagnosis problem, the belt status and the room temperature can affect the engine status. The problem is represented by the EngineStatus FMFrag shown in Figure 7. In the figure, Isa(Machine, m) states that m is an instance of Machine, while EngineStatus(m), BeltStatus(b) and RoomTemp(r) represent the engine status of machine m, the status of belt b and the temperature of room r, respectively. Suppose that the engine status node has the local distribution shown in Table 2, where superscripts denote membership degrees.

Table 2. Local distribution of the EngineStatus FMFrag

RoomTemp(r)              | BeltStatus(b)       | EngineStatus(m)
(Normal^α1; Hot^α2)      | (OK^β1; Broken^β2)  | Satisfactory^α1 | Overheated^α2 | …
Normal                   | OK                  | 0.8             | 0.2           | 0
Normal                   | Broken              | 0.6             | 0.4           | 0
…                        | …                   | …               | …             | …

[Figure 7: The EquipmentDiagnosis_FMTheory, containing the MachineLocation_FMFrag and the EngineStatus_FMFrag. Context nodes include isA(m,Machine), isA(r,Room), Isa(Belt,b), m=BeltLocation(b) and r=MachineLocation(m); BeltStatus(b) and RoomTemp(r) are parents of the resident node EngineStatus(m).]

The representation of the probability distribution of EngineStatus in Fuzzy PR-OWL is shown in Figure 8. The upper and lower large parallelograms represent the Fuzzy PR-OWL ontology and the domain ontology, respectively. The figure shows part of the information of Table 2, namely the probability distribution of the node EngineStatus, which includes the probability assignment of states such as Overheated when the conditioning state of the parent BeltStatus is OK.
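The RDF graph of Figure 8 boils down to a handful of triples. A sketch using plain (subject, predicate, object) tuples; the exact property and individual names are read off the figure's labels, so the precise IRIs are assumptions, and a real system would serialize these with an RDF library:

```python
# Triples for the Figure 8 fragment: EngineStatus is Overheated with
# probability 0.2 when the parent BeltStatus is in state OK.
triples = [
    ("es:EngineStatusTable", "fpr:hasProbabilityAssignment", "es:EngineStatusPA1"),
    ("es:EngineStatusPA1",   "fpr:hasStateAssign",           "es:EngineStatusSA1"),
    ("es:EngineStatusSA1",   "fpr:hasStateName",             "Overheated"),
    ("es:EngineStatusSA1",   "fpr:hasStateProb",             "0.2"),
    ("es:EngineStatusPA1",   "fpr:hasCondState",             "es:EngineStatusCS1"),
    ("es:EngineStatusCS1",   "fpr:hasCondNode",              "es:BeltStatusInputNode"),
    ("es:EngineStatusCS1",   "fpr:hasStateAssign",           "es:BeltStatusSA1"),
    ("es:BeltStatusSA1",     "fpr:hasStateName",             "OK"),
]

def objects(triples, subject, predicate):
    """All objects reachable from `subject` via `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects(triples, "es:EngineStatusSA1", "fpr:hasStateName"))  # -> ['Overheated']
```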

[Figure 8: RDF graph of the probability distribution of EngineStatus. In the Fuzzy PR-OWL ontology, a ProbabilityDistribution has ProbabilityAssignments (fpr:hasProbabilityAssignment), each linked to StateAssignments (fpr:hasStateAssign, with state name and state probability) and to a ConditioningState (fpr:hasCondState, fpr:hasCondNode). In the domain ontology, es:EngineStatusPA1 assigns probability 0.2 to the state Overheated when the parent node BeltStatus is in state OK.]

The FRS of EngineStatus, which denotes the membership degrees of its states conditioned on the state assignments of its parent nodes, is shown in Figure 9. The RDF graph shows that if the state of the parent node BeltStatus is Normal OK (supposing the words that describe degree include Very, Normal and A little), then the membership degree of the state Overheated for EngineStatus is 0.5.

Figure 9. The FRS of EngineStatus in Fuzzy PR-OWL. In the Fuzzy PR-OWL ontology part, an fpr:FRS has an fpr:If-Part and an fpr:Then-Part built from fpr:StateAssignment individuals, the resident and input nodes, and an fpr:Membership carrying a float degree. In the domain ontology part, es:IfThenRule1 states: if the state of es:BeltStatus is OK with degree description Normal (es:BeltStatusSA1), then the state Overheated of es:EngineStatus (es:EngineStatusSA1, es:EngineStatusMembership1) has membership degree 0.5; that is, "If BeltStatus is Normal OK, then degreeOf(EngineStatus) is {Overheated 0.5, ...}".
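The rule read off from Figure 9 can also be represented as plain data; in the sketch below the triple encoding and property names are illustrative simplifications, not the actual Fuzzy PR-OWL serialization:

```python
# Illustrative encoding of the FRS rule from Figure 9: the if-part is a set
# of (subject, property, value) triples, the then-part a membership degree.
rule = {
    "if": [("es:BeltStatus", "hasState", "OK"),
           ("es:BeltStatus", "hasDegreeDescription", "Normal")],
    "then": {"node": "es:EngineStatus", "state": "Overheated",
             "membershipDegree": 0.5},
}

def consequent_degree(rule, facts):
    """Return the then-part degree if every antecedent triple is in facts."""
    if all(t in facts for t in rule["if"]):
        return rule["then"]["membershipDegree"]
    return None

facts = {("es:BeltStatus", "hasState", "OK"),
         ("es:BeltStatus", "hasDegreeDescription", "Normal")}
print(consequent_degree(rule, facts))  # 0.5
```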

Figure 10. The fuzzy expression r=MachineLocation(m) of the context node in Fuzzy PR-OWL. es:EquipmentDiagnosis_FMTheory has the FMFrags es:DomainFMFrag.Enginestate and es:DomainFMFrag.BeltLocation. The context node es:ContextNode_CX holds es:FMExpression_CX1, whose main parts are the logical connective es:equalTo with possible (truth) value 0.9 and the arguments es:CX1_1 (argument number 1, substituted by the ordinary variable es:Enginestate_Mfrag.room, of type File:/...#Room) and es:CX1_2 (argument number 2, whose inner expression es:CX1_2_inner_FMExp is the random variable es:MachineLoc_FMExp, of type File:/...Engine.owl#Machine).

The fuzzy expression r=MachineLocation(m) of the context node is shown in Figure 10. The expression defines the relation between room r and machine m, which connects this FMFrag to another FMFrag, MachineLocation. The dark ovals constitute the main parts of the expression: the logical connective equalTo with a truth value, and the arguments CX1_1 and CX1_2, which correspond respectively to the ordinary variable room in the EngineStatus FMFrag and the random variable MachineLocation(m) in the BeltLocation FMFrag.

4. Conclusion

… semantic web area. A probability ontology language based on OWL2 is envisioned as an important approach to achieving this goal. In view of the weakness of such ontology languages in synchronously modeling probabilistic and fuzzy knowledge, this paper proposed the Fuzzy PR-OWL ontology language, based on Fuzzy MEBN, which adds the expressive power of widespread fuzzy knowledge to the PR-OWL2 related domain. The domain cases in the last part show that Fuzzy PR-OWL can represent probabilistic and fuzzy information in a specific domain well.

As for future work, we intend to construct a reasoning framework for Fuzzy PR-OWL by studying FOFL and fuzzy BN theory further, and to improve it continuously.

Acknowledgment

This work was funded by the Key Scientific and Technological Project of Henan Province (162102310616).

References

[1] P. Michael, Uncertainty Reasoning for the Semantic Web III, Springer International Publishing, 2013.

[2] K. J. Laskey and K. B. Laskey, Uncertainty Reasoning for the World Wide Web: Report on the

URW3-XG Incubator Group, International Workshop on Uncertainty Reasoning for the Semantic Web,

Karlsruhe, Germany, 2008.

[3] K. B. Laskey, MEBN: A language for first-order Bayesian knowledge bases, Artificial

Intelligence, 172(2008):140-178.

[4] K. Golestan, F. Karray, and M. S. Kamel, High level information fusion through a fuzzy extension to

Multi-Entity Bayesian Networks in Vehicular Ad-hoc Networks, International Conference on

Information Fusion, (2013):1180-1187.

[5] K. Golestan, F. Karray, and M. S. Kamel, Fuzzy Multi Entity Bayesian Networks: A Model for Imprecise

Knowledge Representation and Reasoning in High-Level Information Fusion, IEEE International

Conference on Fuzzy Systems, (2014):1678-1685.

[6] P. Hitzler, et al, OWL2 Web Ontology Language Primer(Second edition) (2015).

[7] Z. L. Ding, and Y. Peng, A Probabilistic Extension to Ontology Language OWL, Hawaii International

Conference on System Sciences, 4(2004):40111a-40111a.

[8] P. C. G. da Costa, K. B. Laskey and K. J. Laskey, PR-OWL: A Bayesian Ontology Language for the Semantic Web, Uncertainty Reasoning for the Semantic Web I: ISWC International Workshops, URSW 2005-2007, Revised Selected and Invited Papers, (2008):88-107.

[9] N. C. Rommel, K. B. Laskey, and P. C. G. Costa, PR-OWL2.0 – Bridging the Gap to OWL

Semantics, Uncertainty Reasoning for the Semantic Web II, Springer, Berlin Heidelberg, (2013):1-18.

[10] V. Novák, On the syntactico-semantical completeness of first-order fuzzy logic, Kybernetika

-Praha- 2(1990):47-66.

[11] N. F. Noy, et al, Creating semantic web contents with protégé-2000, IEEE Intelligent Systems, 16

(2001): 60–71.

[12] W. Gueaieb, Soft computing and intelligent systems design: Theory, tools and applications, Neural

Networks IEEE Transactions on, 17(2004):825-825.

Fuzzy Systems and Data Mining II 81

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-81

State Assessment of Oil-Paper Insulation Based on Fuzzy Rough Sets

De-Hua HE1, Jin-Ding CAI, Song XIE, Qing-Mei ZENG

College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, China

Abstract. The return voltage method (RVM) is a good method for studying the aging state of transformer insulation, but it is difficult to accurately assess the insulation aging state from a single characteristic quantity. In this paper, fuzzy rough sets theory combined with RVM is proposed to assess the oil-paper insulation state of transformers, and an assessment system for transformer oil-paper insulation is constructed from a large amount of test data. First, the evaluation indices of the oil-paper insulation status of a transformer are established from the return voltage characteristic parameters. Then, the fuzzy c-means clustering algorithm is used to obtain the membership functions of the transformer test data along with a fuzzy partition of the characteristics. Moreover, the fuzzy attributes of the assessment table of oil-paper insulation status are simplified according to the discernibility matrix, and the evaluation rules for the oil-paper insulation condition are extracted. Finally, the examples in this paper demonstrate that the assessment system is effective and feasible, which provides a new idea for the assessment of the transformer oil-paper insulation state. The research has practical value in engineering applications.

Introduction

Transformers play a vital role in the whole electrical power system. Because a large number of transformers within electric utilities are approaching the end of their design life, there has been growing interest in the condition assessment of transformer insulation. The degradation of the main insulation system in a transformer is recognized to be one of the major causes of transformer breakdown [1-3].

Methods based on the analysis of electrical polarization in dielectrics are often used in the diagnostics of the oil-paper insulation state. Three parameters were customarily selected to assess the oil-paper insulation [4-5]. However, because insulation aging is affected by a variety of factors, it is difficult to accurately assess the insulation aging state from a single feature. The grey correlation method was introduced for insulation condition assessment [6], but it did not consider the redundant characteristics in the condition assessment of oil-paper insulation, and the assessment process is complicated.

In this paper, fuzzy rough set theory is introduced and multiple characteristics are considered to comprehensively assess the condition of oil-paper insulation. The method handles the problem that part of the information is incomplete or unknown. The fuzzy C-means clustering algorithm (FCM) has been used to discretize important data categories to form classification attributes [7]. The characteristic fuzzy rules and the insulation assessment system are established based on a historical database.

1 Corresponding Author: De-Hua HE, College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, China; E-mail: 153367542@qq.com.

82 D.-H. He et al. / State Assessment of Oil-Paper Insulation Based on Fuzzy Rough Sets

Rough set theory is a powerful tool in dealing with vague and uncertain information.

The basic idea of the fuzzy rough model is that a fuzzy similarity relation is used to

construct the fuzzy lower and upper approximations of a decision. The sizes of the

lower and upper approximations reflect the discriminating capability of a feature subset.

The union of the fuzzy lower approximations forms the fuzzy positive region of the decision.

Let a universe U be a finite nonempty set of objects. Each object in U is described by a set of attributes, denoted by A. The pair (U, A) is an information system (IS), where for every subset P ⊆ A there exists an associated similarity relation. Let $\mu_{R_P}(x,y)$ denote the similarity of objects x and y induced by the subset of features P. Given X ⊆ U, X can be approximated by the information contained in P through the construction of the P-lower and P-upper approximations of X, as defined in Eq. (1):

$$\mu_{\underline{R_P}X}(x)=\inf_{y\in U} I\big(\mu_{R_P}(x,y),\,\mu_X(y)\big),\qquad \mu_{\overline{R_P}X}(x)=\sup_{y\in U} T\big(\mu_{R_P}(x,y),\,\mu_X(y)\big) \tag{1}$$

where I represents the fuzzy implicator, T is the t-norm, and $R_P$ is the fuzzy similarity relation induced by the subset of features P. The degree of similarity of objects with respect to a subset of features can be constructed using Eq. (2):

$$\mu_{R_P}(x,y)=\mathop{T}_{a\in P}\,\mu_{R_a}(x,y) \tag{2}$$

where $\mu_{R_a}(x,y)$ is the degree to which objects x and y are similar for feature a. The method employs a quality measure termed the fuzzy-rough dependency function $\gamma_P(Q)$, which measures the dependency between two sets of attributes P and Q and is defined by:

$$\gamma_P(Q)=\frac{\big|\mu_{POS_{R_P}(Q)}(x)\big|}{|U|}=\frac{\sum_{x\in U}\mu_{POS_{R_P}(Q)}(x)}{|U|} \tag{3}$$

where the fuzzy positive region, which contains all objects of U that can be classified into classes of U/Q using the information in P, is defined as:

$$\mu_{POS_{R_P}(Q)}(x)=\sup_{X\in U/Q}\mu_{\underline{R_P}X}(x) \tag{4}$$
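As a concrete illustration of Eqs. (1), (3) and (4), the sketch below computes the fuzzy lower and upper approximations, the positive region and the dependency degree for a toy three-object universe. The Lukasiewicz implicator and t-norm and all numerical values are assumptions for illustration; the paper only requires I to be a fuzzy implicator and T a t-norm:

```python
import numpy as np

# Lukasiewicz implicator and t-norm (an assumed, common choice).
I = lambda a, b: np.minimum(1.0, 1.0 - a + b)
T = lambda a, b: np.maximum(0.0, a + b - 1.0)

R = np.array([[1.0, 0.8, 0.1],    # mu_RP(x, y): fuzzy similarity of objects
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
X = np.array([1.0, 0.9, 0.0])     # mu_X(y): one fuzzy decision class

# Eq. (1): inf / sup over y for each object x.
lower = np.array([I(R[x], X).min() for x in range(3)])
upper = np.array([T(R[x], X).max() for x in range(3)])

# Eq. (4): positive region over the partition {X, complement of X},
# then Eq. (3): dependency degree as the normalized positive region.
classes = [X, 1.0 - X]
pos = np.array([max(I(R[x], c).min() for c in classes) for x in range(3)])
gamma = pos.sum() / len(pos)
print(lower, upper, round(gamma, 3))
```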


Not all attributes are necessary for the assessment of the oil-paper insulation system; removing these extra features and the vague-language attribute entries does not affect the original oil-paper insulation diagnostic effect. The discernibility matrix can be used to reduce the condition attributes and attribute values. The specific reduction steps are as follows. 1: Calculate the similarity relation of the fuzzy attribute $C_k$:

$$R_k(x_i,x_j)=\begin{cases}\min\{C_k(x_i),\,C_k(x_j)\} & C_k(x_i)\neq C_k(x_j)\\[2pt] 1 & C_k(x_i)=C_k(x_j)\end{cases} \tag{5}$$

2: Calculate the matrix of the evaluation system $M(U,R)=(c_{ij})_{n\times n}$:

$$c_{ij}=\begin{cases}\{R_k:\,1-R_k(x_i,x_j)\ge\lambda_i\} & \lambda_i\ge\lambda_j\\[2pt] \varnothing & \lambda_i<\lambda_j\end{cases} \tag{6}$$

4: $D(U,R)=\bigwedge\{\bigvee(c_{ij}):\,c_{ij}\neq\varnothing\}$; 5: $g_D(U,R)=(\bigwedge R_1)\vee\cdots\vee(\bigwedge R_l)$; 6: output $Red_D(R)=\{R_1,\ldots,R_l\}$; 7: build the assessment rules table, delete duplicate evaluation rules, and extract the oil-paper insulation condition assessment rules.
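Step 1 of the reduction, Eq. (5), can be sketched directly; the membership values below are illustrative, not taken from the paper's tables:

```python
import numpy as np

def similarity(ck):
    """Eq. (5): similarity relation of a fuzzy attribute C_k.

    R_k(x_i, x_j) = min(C_k(x_i), C_k(x_j)) when the memberships differ,
    and 1 when they coincide.
    """
    n = len(ck)
    R = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            R[i, j] = 1.0 if ck[i] == ck[j] else min(ck[i], ck[j])
    return R

ck = np.array([0.05, 0.99, 0.05, 0.81])   # illustrative memberships
print(similarity(ck))
```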

3. Membership of Characteristic

In this paper, FCM is used to calculate the cluster center of each cluster and the memberships of the transformer test data. Let (U, P∪Q) be a fuzzy decision system with U = {x_1, x_2, …, x_n}. The fuzzy condition attributes P are divided into three categories, with cluster centers V = {v_1, v_2, v_3}. The relationship between a sample and the cluster centers can be expressed by membership degrees. The membership function is obtained by the algorithm, and then the membership degree matrix μ is obtained:

$$\mu=\begin{bmatrix}\mu_{11}&\cdots&\mu_{1j}&\cdots&\mu_{1n}\\ \mu_{21}&\cdots&\mu_{2j}&\cdots&\mu_{2n}\\ \mu_{31}&\cdots&\mu_{3j}&\cdots&\mu_{3n}\end{bmatrix},\qquad j=1,\ldots,n \tag{7}$$

$$\mu_{ij}=\frac{\big(1/\|x_j-v_i\|^2\big)^{1/(m-1)}}{\sum_{c=1}^{3}\big(1/\|x_j-v_c\|^2\big)^{1/(m-1)}} \tag{8}$$

$$\min J(\mu_{ij},v_i)=\sum_{i=1}^{3}\sum_{j=1}^{n}(\mu_{ij})^m\,\|x_j-v_i\|^2 \tag{9}$$

$$v_i=\frac{1}{\sum_{j=1}^{n}(\mu_{ij})^m}\sum_{j=1}^{n}(\mu_{ij})^m\,x_j,\qquad i=1,2,3 \tag{10}$$
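The FCM updates in Eqs. (8) and (10) can be sketched in a few lines; the data, the fuzzifier m = 2, the seed and the iteration count below are illustrative, not the paper's test set:

```python
import numpy as np

def fcm(X, c=3, m=2.0, iters=50, seed=0):
    """Minimal fuzzy c-means: memberships from inverse squared distances
    (Eq. (8)) and centers as membership-weighted means (Eq. (10))."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), c, replace=False)]          # initial centers
    for _ in range(iters):
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1) + 1e-12
        U = (1.0 / d2) ** (1.0 / (m - 1.0))              # Eq. (8), numerator
        U /= U.sum(axis=0, keepdims=True)                # Eq. (8), normalize
        V = (U ** m) @ X / (U ** m).sum(axis=1, keepdims=True)  # Eq. (10)
    return U, V

X = np.array([[0.0], [0.1], [5.0], [5.2], [9.9], [10.0]])
U, V = fcm(X, c=3)
print(np.round(U.sum(axis=0), 6))   # each column sums to 1
```

Iterating the two updates decreases the objective in Eq. (9); in practice one stops when the centers change by less than a tolerance rather than after a fixed number of iterations.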

The test data and ageing information of the transformers are shown in Table 1. P1, P2, P3, P4 and P5 are the condition attributes corresponding to t_cdom, U_rmp, S_rmax, R_g and C_g, respectively, and Q is the fuzzy decision attribute of the oil-paper insulation. According to the relevant regulations for power equipment, the transformer insulation is divided into good (G) and bad (B). The characteristics are divided into 15 fuzzy attributes C_k (k = 1, 2, …, 15). The memberships of the fuzzy attributes C_k of the test data are obtained by the FCM algorithm and are listed in Table 2.

Table 1. Return voltage test sample data of transformers

No. t_cdom U_rmp S_rmax R_g/GΩ C_g/nF Q

x1 2518 183.5 31.20 12.26 92.17 G

x2 546.6 353.4 257.1 1.96 186.9 B

x3 1214 256.0 96.4 4.899 109.3 G

x4 667.4 385.5 293.2 1.440 190.1 B

x5 2415 175.0 32.11 13.35 70.36 G

x6 1226 248.2 87.66 4.026 106.8 G

x7 649.5 363.4 179.2 2.743 169.9 B

x8 3613 269.4 80.10 11.00 45.23 G

x9 333.7 32.60 74.02 2.830 64.38 B

x10 3540 223.4 23.70 3.682 99.51 B

x11 1265 236.1 44.50 1.537 235.3 B

x12 2655 218.5 67.72 11.77 80.40 G

x13 896.9 169.7 120.5 2.885 149.8 B

x14 1524 320.3 79.24 2.832 183.0 G

x15 3289 239.7 19.71 13.05 47.88 G

x16 700.6 83.45 46.40 2.339 125.7 G

x17 189.1 313.8 54.50 1.253 168.8 B

x18 2706 110.7 32.43 12.40 127.1 G

According to Eqs. (1) and (4), the most important attribute for assessment is P4, followed by P3, P5, P1 and P2. The reduct is calculated by attribute reduction, and the decision rules are listed in Table 3; the elements in the table are membership intervals. Taking 3 transformers not in the historical database as examples, their basic information is shown in Table 4. According to the insulation assessment process, the memberships of the transformers are obtained, and the results are shown in Table 5. The membership degrees of transformer T1 match rule 1; based on the assessment rules, the insulation of T1 is good and does not need maintenance. The membership degrees of T2 match assessment rule 6; according to the rules, the insulation of T2 is seriously aged and needs maintenance. The membership degrees of T3 match assessment rule 9: the insulation of T3 is seriously aged. The diagnosis result for T3 is judged as good by the method proposed in reference [4], which differs from the actual condition. The three diagnosis results of the proposed method are in perfect agreement with the actual condition; the results verify that the method based on fuzzy rough sets theory is effective and accurate.
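The rule-matching step described above can be sketched as follows. The interval bounds mirror rules 2 and 9 of Table 3, while the sample membership vector is illustrative (it is not one of the paper's transformers):

```python
# Each assessment rule constrains selected fuzzy attributes to an interval,
# (0, 0.5) or (0.5, 1); a transformer matches the rule whose intervals all
# contain its membership degrees. Values mirror rules 2 and 9 of Table 3.
RULES = {
    2: {"C3": (0, .5), "C4": (0, .5), "C8": (.5, 1), "C9": (0, .5),
        "C10": (.5, 1), "C12": (0, .5), "C15": (.5, 1), "Q": "G"},
    9: {"C3": (.5, 1), "C4": (0, .5), "C8": (0, .5), "C9": (0, .5),
        "C10": (0, .5), "C12": (0, .5), "C15": (0, .5), "Q": "B"},
}

def assess(membership):
    """Return (rule id, decision) for the first rule whose intervals match."""
    for rid, rule in RULES.items():
        conditions = [(a, iv) for a, iv in rule.items() if a != "Q"]
        if all(lo < membership[a] <= hi for a, (lo, hi) in conditions):
            return rid, rule["Q"]
    return None

sample = {"C3": 0.98, "C4": 0.03, "C8": 0.02, "C9": 0.01,
          "C10": 0.01, "C12": 0.01, "C15": 0.02}
print(assess(sample))   # (9, 'B')
```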

Table 2. Membership functions of partial fuzzy attributes

No. C1(L) C2(M) C3(H) C4(L) C5(M) C6(H) C7(L) C8(M) C9(H) C10(L) C11(M) C12(H) C13(L) C14(M) C15(H) Q

1 5 13 81 8 87 4 99 0 0 0 0 99 37 59 2 G

2 99 0 0 0 0 99 0 0 99 98 1 0 0 1 97 B

3 1 98 0 3 2 13 2 96 0 8 90 1 1 98 0 G

4 95 3 0 1 5 93 1 1 97 91 8 0 0 0 98 B

5 7 20 72 14 80 5 99 0 0 0 1 97 97 2 0 G

6 1 98 0 2 89 8 0 99 0 0 99 0 3 96 0 G

7 97 2 0 0 1 98 16 38 45 88 11 0 4 18 76 B

8 3 6 90 4 66 28 1 98 0 2 3 94 92 6 1 G

9 94 4 0 94 3 1 8 90 0 82 17 0 99 0 0 B

10 2 5 92 0 99 0 96 4 0 7 91 0 14 83 2 B

11 0 99 0 1 96 2 95 4 0 92 7 0 4 8 86 B

12 2 6 91 0 99 0 24 75 0 0 0 99 79 18 1 G

13 55 43 1 18 75 5 13 81 4 78 21 0 9 57 32 B

14 5 92 2 0 6 93 2 97 0 82 17 0 1 3 94 G

15 1 1 97 1 94 3 94 5 0 0 0 98 93 5 1 G

16 92 6 0 98 1 0 92 7 0 99 0 0 3 94 2 G

17 89 9 1 1 10 88 72 26 0 84 11 1 5 20 74 B

18 1 4 93 85 12 2 99 0 0 0 0 99 3 93 3 G

Table 3. Decision rules (elements are membership intervals)

Rule C3 C4 C8 C9 C10 C12 C15 Q

2 (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0.5 , 1) G


4 (0 , 0.5) (0, 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) G

5 (0.5 , 1) (0 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0.5 , 1) (0 , 0.5) G

6 (0 , 0.5) (0, 0.5) (0 , 0.5) (0 , 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) B

7 (0 , 0.5) (0, 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 0.5) B

8 (0 , 0.5) (0.5 , 1) (0.5 , 1) (0 , 0.5) (0.5 , 1) (0 , 0.5) (0 , 1) B

9 (0.5 , 1) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) (0 , 0.5) B

Table 4. Basic information of the three transformers to be assessed

Traf Model Years t_cdom U_rmp S_rmax R_g/GΩ C_g/nF Furfural State

T1 SFSE-220 1 2314 230 27 15.74 90.35 0.06 Good

T2 SFP-220 14 1449 243 45 1.795 364.2 0.74 Bad

T3 cub-/220 22 3328 289 24 4.027 95.48 0.99 Bad

Table 5. Membership degrees and assessment results of the three transformers

Traf P1(H) P2(L) P3(M) P3(H) P4(L) P4(H) P5(H) Rule Result

T1 0.6684 0.0034 0.0123 0.0007 0.0001 0.9980 0.0236 1 G

T2 0.0052 0.0132 0.0418 0.0014 0.9883 0.0050 0.9932 6 B

T3 0.9787 0.0341 0.0245 0.0016 0.0007 0.0000 0.0170 9 B

5. Conclusion

To prevent a single characteristic from impairing the correctness of the insulation condition assessment, fuzzy rough sets theory combined with RVM is proposed and used to assess the oil-paper insulation of transformers. The results demonstrate that the assessment system is effective and feasible, and it provides a new idea for the assessment of the transformer oil-paper insulation state.

References

[1] T. K. Saha, Review of modern diagnostic techniques for assessing insulation condition in aged

transformers, IEEE Trans. Dielectr. Electr. Insul. 10(2003), 903-917.

[2] M. de Nigris, R. Passaglia, R. Berti, L. Bergonzi and R. Maggi, Application of modern techniques for the condition assessment of power transformers, CIGRE Session 2004, France, Paper A2-207, 2004.

[3] W. G. Chen, J. Du, Y. Ling, et al. Air-gap discharge process partition in oil-paper insulation based on

energy-wavelet moment feature analysis. Chinese Journal of Scientific Instrument, 34(2013):1062-1069.

[4] Y. Zou, J. D. Cai. Study on the relationship between polarization spectrum characteristic quantity and

insulation condition of oil-paper transformer. Chinese Journal of Scientific Instrument, 36(2015): 608-

614.

[5] R. J. Liao, H. G. Sun, Q. Yuan, et al. Analysis of oil-paper insulation aging characteristics using Return

voltage method. High Voltage Engineering, 37(2011): 136-142.

[6] J. D. Cai and Y. Huang. Study on Insulation Aging of Power Transformer Based on Gray Relational

Diagnostic Model. High Voltage Engineering, 41(2015): 3296- 3301.

[7] S. H. Gao, L. Dong, Y. Gao, et al. Mid-long term wind speed prediction based on rough set theory.

Proceedings of the CSEE, 32(2012): 32-37.

Fuzzy Systems and Data Mining II 87

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-87

Finite-Time Stabilization for T-S Fuzzy Networked Systems with State and Communication Delay

He-Jun YAO1 , Fu-Shun YUAN and Yue QIAO

School of Mathematics and Statistics, Anyang Normal University, 455000, Anyang,

Henan, China

Abstract. The finite-time stabilization problem for T-S fuzzy networked systems with state and communication delay has been considered. The T-S approach has been used to model the controlled nonlinear systems. By using the Lyapunov functional method, a sufficient condition for finite-time stabilization has been given. Then, a state feedback fuzzy controller has been designed to make the closed-loop networked control system finite-time stable. Finally, the proposed design method has been applied to the temperature control system of a polymerization reactor.

Introduction

Networked control systems (NCSs) are feedback control systems closed over a network. As is well known, NCSs have many advantages, such as ease of maintenance, low cost and greater flexibility. In recent years, a number of papers have reported on the analysis and control of NCSs [2-4]. In order to design networked-based control, Gao obtained a new delay-system approach by using the LMI approach [5]. In [6], Walsh et al. considered the asymptotic stability of nonlinear NCSs. For NCSs with long communication delay, a networked-based optimal controller has been designed in [7]. Yue et al. considered the H∞ control problem of NCSs with uncertainty [8].

As a useful approach, the fuzzy control approach is often used to design robust control for nonlinear systems. With the well-known T-S approach, many papers have been published on the stabilization and control problem for nonlinear delay systems [9-10]. In [11], considering the insertion of the network, a new two-step approach has been introduced to ensure system properties. For nonlinear NCSs, the input-to-state stability problem has been considered in [12]. However, the results of the above papers have focused only on the asymptotic stability of dynamic systems; few papers have considered the finite-time stability of nonlinear NCSs. Therefore, the finite-time control problem of nonlinear NCSs is worth investigating, which motivates this paper.

1

Corresponding Author. He-Jun YAO, School of Mathematics and Statistics, Anyang Normal

University, 455000, Anyang, Henan, China; E-mail addresses: yaohejun@126.com.

88 H.-J. Yao et al. / Finite-Time Stabilization for T-S Fuzzy Networked Systems

In this paper, using the LMI approach and based on the Lyapunov functional approach, we obtain a finite-time stability condition and the corresponding fuzzy controller design method.

1. Problem formulation

Rule i:

IF $z_1(t)$ is $M_1^i$, $z_2(t)$ is $M_2^i$, …, $z_n(t)$ is $M_n^i$

THEN $\dot{x}(t)=A_i x(t)+A_{di}x(t-d)+B_i u(t)+G_i\omega(t)$, $\ x(t)=\phi(t),\ t\in[-d,0]$ (1)

where $z_1(t), z_2(t), \ldots, z_n(t)$ are the premise variables, $x(t)\in R^n$ is the system state vector, $u(t)\in R^m$ is the control input vector, and $M_k^i$ ($i=1,2,\ldots,r$; $k=1,2,\ldots,n$) are fuzzy sets. $r$ is the number of IF-THEN rules and $n$ is the number of fuzzy sets. $A_i, A_{di}, B_i, G_i$ are known constant matrices, $d$ is the state delay, and $\phi(t)\in R^n$ is the initial state on $[-d,0]$. $\omega(t)\in R^l$ is the exogenous disturbance and satisfies

$$\int_{0}^{T}\omega^{T}(t)\,\omega(t)\,dt\le d,\qquad d\ge 0 \tag{2}$$

Figure 1. Structure of the networked control system: the sensor, controller, actuator and plant are connected through a network.

By using the T-S approach, without considering the communication delay, the networked system is described by [13]

$$\dot{x}(t)=\sum_{i=1}^{r}\mu_i(z(t))\big[A_i x(t)+A_{di}x(t-d)+B_i u(t)+G_i\omega(t)\big],\qquad x(t)=\phi(t),\ t\in[-d,0] \tag{3}$$

where $\mu_i(z(t))$ satisfies

$$\mu_i(z(t))\ge 0,\qquad \sum_{i=1}^{r}\mu_i(z(t))>0,\qquad i=1,2,\ldots,r$$
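The blending in Eq. (3) can be illustrated in isolation (disturbance, delay and input terms omitted for brevity); the matrices and membership values below are illustrative, not the paper's example:

```python
import numpy as np

# Two illustrative local linear models of a T-S system.
A = [np.array([[-1.0, 0.0], [0.0, -2.0]]),
     np.array([[-3.0, 1.0], [0.0, -1.0]])]

def blended_dynamics(x, mu):
    """Normalized membership-weighted combination of the local vector
    fields, as in the summation of Eq. (3) (without delay and input)."""
    mu = np.asarray(mu, dtype=float)
    mu = mu / mu.sum()                      # normalize the weights
    return sum(m * (Ai @ x) for m, Ai in zip(mu, A))

x = np.array([1.0, 1.0])
print(blended_dynamics(x, [0.3, 0.7]))      # [-1.7 -1.3]
```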

Assumption 1 [14]. The controller and actuator are event-driven and the sensor is time-driven. The sensor-to-controller delay is $\tau_{sc}$ and the controller-to-actuator delay is $\tau_{ca}$; therefore, the communication delay is $\tau=\tau_{sc}+\tau_{ca}$.

With the insertion of the network, and considering the communication delay $\tau$, the control system of Figure 1 becomes

$$\dot{x}(t)=\sum_{i=1}^{r}\mu_i(z(t))\big[A_i x(t)+A_{di}x(t-d)+B_i u(t-\tau)+G_i\omega(t)\big],\qquad x(t)=\phi(t),\ t\in[-d,0] \tag{4}$$

In this paper, we design the following controller:

$$u(t)=\sum_{i=1}^{r}\mu_i(z(t))\,K_i\,x(t) \tag{5}$$

Inserting the above controller (5) into the networked system (4), we obtain the closed-loop system:

$$\dot{x}(t)=\sum_{i=1}^{r}\sum_{j=1}^{r}\mu_i(z(t))\mu_j(z(t))\big[A_i x(t)+A_{di}x(t-d)+B_i K_j x(t-\tau)+G_i\omega(t)\big],\qquad x(t)=\psi(t),\ t\in[-\bar{d},0] \tag{6}$$

We suppose the initial state $x(t)=\psi(t)$ is a smooth function on $[-\bar{d},0]$, where $\bar{d}=\max\{\tau,d\}$. Thus $\|\psi(t)\|\le\bar{\psi}$ for $t\in[-\bar{d},0]$, where $\bar{\psi}$ is a positive constant.

Definition 1 [15]. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the time-delay NCS (6) (setting $\omega(t)\equiv 0$) is finite-time stable if

$$x^{T}(0)Rx(0)\le c_1\ \Longrightarrow\ x^{T}(t)Rx(t)\le c_2,\qquad \forall t\in[0,T] \tag{7}$$

Definition 2 [16]. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the time-delay NCS (6) is finite-time stabilizable by the state feedback controller if the following condition holds:

$$x^{T}(0)Rx(0)\le c_1\ \Longrightarrow\ x^{T}(t)Rx(t)\le c_2,\qquad \forall t\in[0,T] \tag{8}$$
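Definition 1 can be checked numerically along a simulated trajectory; a minimal sketch with an assumed (illustrative) system matrix and scalars, using a forward-Euler step:

```python
import numpy as np

# Illustrative linear system and finite-time parameters (not the paper's).
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
R = np.eye(2)
c1, c2, T, dt = 1.0, 4.0, 5.0, 0.001

x = np.array([0.6, -0.6])                 # x(0)^T R x(0) = 0.72 <= c1
assert x @ R @ x <= c1

finite_time_stable = True
for _ in range(int(T / dt)):
    x = x + dt * (A @ x)                  # forward-Euler integration step
    if x @ R @ x > c2:                    # violation of Definition 1's bound
        finite_time_stable = False
        break
print(finite_time_stable)
```

This is only a pointwise check on one trajectory; the theorems below instead give LMI conditions guaranteeing the bound for all admissible initial states.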

2. Main Results

Theorem 1. For given positive scalars $c_1, c_2, T$ and a positive matrix $R$, the NCS (6) is finite-time stabilizable if there exist a scalar $\alpha\ge 0$, matrices $K_i\in R^{m\times n}$ and positive matrices $P, Q, T\in R^{n\times n}$, $S\in R^{l\times l}$ such that the following matrix inequalities hold:

$$\begin{bmatrix}\Xi & PA_{di} & PB_iK_j & PG_i\\ * & -Q & 0 & 0\\ * & * & -T & 0\\ * & * & * & -\alpha S\end{bmatrix}<0 \tag{9}$$

$$\frac{c_1\big(\lambda_{\max}(\bar{P})+h\lambda_{\max}(\bar{Q})+\tau\lambda_{\max}(\bar{T})\big)+d\,\lambda_{\max}(S)\big(1-e^{-\alpha T}\big)}{\lambda_{\min}(\bar{P})}<c_2 e^{-\alpha T} \tag{10}$$

where $\Xi=PA_i+A_i^{T}P+Q+T-\alpha P$, $\bar{P}=R^{-1/2}PR^{-1/2}$, $\bar{Q}=R^{-1/2}QR^{-1/2}$, $\bar{T}=R^{-1/2}TR^{-1/2}$, and $\lambda_{\max}(\cdot)$ and $\lambda_{\min}(\cdot)$ are the maximum and minimum eigenvalues.

Proof. For the positive matrices $P, Q, T$ in Theorem 1, choose the Lyapunov function [13]:

$$V(x(t))=x^{T}(t)Px(t)+\int_{t-h}^{t}x^{T}(\theta)Qx(\theta)\,d\theta+\int_{t-\tau}^{t}x^{T}(\theta)Tx(\theta)\,d\theta \tag{11}$$

Along the trajectories of (6),

$$\dot{V}(x(t))\le\sum_{i=1}^{r}\sum_{j=1}^{r}\mu_i(z(t))\mu_j(z(t))\,\xi^{T}(t)\begin{bmatrix}PA_i+A_i^{T}P+Q+T & PA_{di} & PB_iK_j & PG_i\\ * & -Q & 0 & 0\\ * & * & -T & 0\\ * & * & * & 0\end{bmatrix}\xi(t)$$

where $\xi(t)=\big[x^{T}(t),\ x^{T}(t-d),\ x^{T}(t-\tau),\ \omega^{T}(t)\big]^{T}$.

From condition (9), we have

$$\dot{V}(x(t))<\alpha V(x(t))+\alpha\,\omega^{T}(t)S\omega(t) \tag{12}$$

Multiplying (12) by $e^{-\alpha t}$, we obtain

$$e^{-\alpha t}\dot{V}(x(t))-\alpha e^{-\alpha t}V(x(t))<\alpha e^{-\alpha t}\omega^{T}(t)S\omega(t)$$

and furthermore

$$\frac{d}{dt}\big(e^{-\alpha t}V(x(t))\big)<\alpha e^{-\alpha t}\omega^{T}(t)S\omega(t)$$

Integrating the above inequality from 0 to t, with $t\in[0,T]$,

$$e^{-\alpha t}V(x(t))<V(x(0))+\int_{0}^{t}\alpha e^{-\alpha\theta}\omega^{T}(\theta)S\omega(\theta)\,d\theta \tag{13}$$

With $\bar{P}=R^{-1/2}PR^{-1/2}$, $\bar{Q}=R^{-1/2}QR^{-1/2}$ and $\bar{T}=R^{-1/2}TR^{-1/2}$, we can obtain the following relation:

$$x^{T}(t)Px(t)\le V(x(t))<e^{\alpha T}\Big[c_1\big(\lambda_{\max}(\bar{P})+h\lambda_{\max}(\bar{Q})+\tau\lambda_{\max}(\bar{T})\big)+d\,\lambda_{\max}(S)\big(1-e^{-\alpha t}\big)\Big] \tag{14}$$

On the other hand, it yields

$$x^{T}(t)Px(t)=x^{T}(t)R^{1/2}\bar{P}R^{1/2}x(t)\ge\lambda_{\min}(\bar{P})\,x^{T}(t)Rx(t) \tag{15}$$

Putting together (14) and (15), we have

$$x^{T}(t)Rx(t)<\frac{e^{\alpha T}\Big[c_1\big(\lambda_{\max}(\bar{P})+h\lambda_{\max}(\bar{Q})+\tau\lambda_{\max}(\bar{T})\big)+d\,\lambda_{\max}(S)\big(1-e^{-\alpha T}\big)\Big]}{\lambda_{\min}(\bar{P})} \tag{16}$$

Condition (10) and inequality (16) imply $x^{T}(t)Rx(t)\le c_2$, $t\in[0,T]$.
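The disturbance term $d\,\lambda_{\max}(S)(1-e^{-\alpha t})$ in (14) comes from bounding the integral in (13); assuming the bound in (2) is applied pointwise ($\omega^{T}(t)\omega(t)\le d$), that step can be written out as:

```latex
\int_{0}^{t}\alpha e^{-\alpha\theta}\,\omega^{T}(\theta)S\,\omega(\theta)\,d\theta
\;\le\; \lambda_{\max}(S)\int_{0}^{t}\alpha e^{-\alpha\theta}\,\omega^{T}(\theta)\omega(\theta)\,d\theta
\;\le\; d\,\lambda_{\max}(S)\int_{0}^{t}\alpha e^{-\alpha\theta}\,d\theta
\;=\; d\,\lambda_{\max}(S)\big(1-e^{-\alpha t}\big)
```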

Theorem 2. For given positive scalars $c_1, c_2, T$ and a positive definite matrix $R$, with the fuzzy controller (5), the NCS (6) is finite-time stabilizable if there exist scalars $\alpha\ge 0$ and $\lambda_i>0$ ($i=1,2,3,4$), matrices $\bar{K}_j\in R^{m\times n}$ and positive matrices $X, Q, T\in R^{n\times n}$, $S\in R^{l\times l}$ such that the following matrix inequalities hold:

$$\begin{bmatrix}\Theta & A_{di}X & B_i\bar{K}_j & G_i\\ * & -Q & 0 & 0\\ * & * & -T & 0\\ * & * & * & -\alpha S\end{bmatrix}<0 \tag{17}$$

$$\lambda_1 R^{-1}<X<R^{-1} \tag{18}$$

$$\lambda_2\,Q<\lambda_1 X \tag{19}$$

$$\lambda_3\,T<\lambda_1 X \tag{20}$$

$$0<S<\lambda_4 I \tag{21}$$

$$\begin{bmatrix}d\lambda_4(1-e^{-\alpha T})-c_2e^{-\alpha T} & \sqrt{c_1} & \sqrt{h} & \sqrt{\tau}\\ * & -\lambda_1 & 0 & 0\\ * & * & -\lambda_2 & 0\\ * & * & * & -\lambda_3\end{bmatrix}<0 \tag{22}$$

where $\Theta=A_iX+XA_i^{T}+Q+T-\alpha X$.

Proof. Left- and right-multiplying inequality (9) by diag{$P^{-1}, P^{-1}, P^{-1}, I$}, inequality (9) is equivalent to

$$\begin{bmatrix}\Sigma & A_{di}P^{-1} & B_iK_jP^{-1} & G_i\\ * & -P^{-1}QP^{-1} & 0 & 0\\ * & * & -P^{-1}TP^{-1} & 0\\ * & * & * & -\alpha S\end{bmatrix}<0 \tag{23}$$

where $\Sigma=A_iP^{-1}+P^{-1}A_i^{T}+P^{-1}QP^{-1}+P^{-1}TP^{-1}-\alpha P^{-1}$. Setting $X=P^{-1}$, $\bar{K}_j=K_jP^{-1}$ and renaming $P^{-1}QP^{-1}$ and $P^{-1}TP^{-1}$ as $Q$ and $T$, inequality (23) is equivalent to inequality (17).

On the other hand, we denote $\bar{X}=R^{-1/2}XR^{-1/2}$, $\bar{Q}=R^{-1/2}QR^{-1/2}$, $\bar{T}=R^{-1/2}TR^{-1/2}$. Conditions (18)-(21) imply that

$$1<\lambda_{\min}(\bar{P}),\quad \lambda_{\max}(\bar{P})<\frac{1}{\lambda_1},\quad \lambda_{\max}(\bar{Q})<\frac{\lambda_1}{\lambda_2},\quad \lambda_{\max}(\bar{T})<\frac{\lambda_1}{\lambda_3},\quad \lambda_{\max}(S)<\lambda_4 \tag{24}$$

By the Schur Lemma, inequality (22) is equivalent to

$$d\lambda_4(1-e^{-\alpha T})-c_2e^{-\alpha T}+\frac{c_1}{\lambda_1}+\frac{h}{\lambda_2}+\frac{\tau}{\lambda_3}<0 \tag{25}$$

With (24), it follows that

$$\frac{c_1\big(\lambda_{\max}(\bar{P})+h\lambda_{\max}(\bar{Q})+\tau\lambda_{\max}(\bar{T})\big)+d\,\lambda_{\max}(S)\big(1-e^{-\alpha T}\big)}{\lambda_{\min}(\bar{P})}<d\lambda_4(1-e^{-\alpha T})+\frac{c_1}{\lambda_1}+\frac{h}{\lambda_2}+\frac{\tau}{\lambda_3} \tag{26}$$

Combining inequalities (25) and (26), condition (10) is satisfied.

3. Numerical Example

The temperature control system of a polymerization reactor is an inertia link with time delay. The state space model of the polymerization reactor is usually written as [6]

$$\dot{x}_1(t)=x_2(t),\qquad \dot{x}_2(t)=a_1x_1(t)+a_2x_2(t)+bu(t),\qquad y(t)=x_1(t)$$

It is impossible to avoid external disturbance and time delay. We consider the following nonlinear delay system with norm-bounded uncertainties:

$$\dot{x}(t)=A_i x(t)+A_{di}x(t-d)+B_i u(t),\qquad x(t)=\psi(t),\ -d\le t\le 0$$

where

$$A_1=\begin{bmatrix}30&0\\0&20\end{bmatrix},\ A_2=\begin{bmatrix}3&12\\1&0\end{bmatrix},\ A_{d1}=\begin{bmatrix}2&0.5\\0.5&2\end{bmatrix},\ A_{d2}=\begin{bmatrix}3&1\\0.1&1\end{bmatrix},\ B_1=\begin{bmatrix}1\\2\end{bmatrix},\ B_2=\begin{bmatrix}0\\1\end{bmatrix},\ \psi(t)=\begin{bmatrix}1\\1\end{bmatrix}$$

with $d=0.2$ and $\tau=0.5$. Solving the LMIs (17), the gain matrices can be obtained:

$$K_1=\bar{K}_1P^{-1}=[3.4529\ \ 1.6837],\qquad K_2=\bar{K}_2P^{-1}=[8.6183\ \ 4.3602]$$

With the state feedback controller (5) in Theorem 2, and choosing the initial condition $\psi(t)=[2\ \ 0.5]^{T}$, the simulation results are shown in Figures 2-3.

Figure 2. $x_1(t)$ of the system.

Figure 3. $x_2(t)$ of the system.

From the above figures, one can see that the system is finite-time stable.

4. Conclusion

In this paper, by introducing the Lyapunov approach and a new finite-time stability analysis, a finite-time stabilization condition is obtained. Based on this condition, a state feedback fuzzy controller has been designed by using LMIs.

Acknowledgments

This work was supported by Anyang Normal University Innovation Foundation Project

under Grant ASCX/2016-Z113.


References

[1] Y. Xia, Y. Gao, Recent progress in networked control systems-a survey, International Journal of

Automation and Computing, 12(2015), 343-367.

[2] G. Chen, Q. Lin, Finite-time observer based cooperative tracking control of networked large range

systems, Abstract and Applied Analysis, 2014, Article ID 135690.

[3] B. Chen, W. Zhang, Distributed fusion estimation with missing measurements, random transmission

delays and packet dropouts. IEEE Transactions on Automatic Control, 59(2014), 196-1967.

[4] J. Chen, H. Zhu, Finite-time H∞ filtering for a class of discrete-time Markovian jump systems with partly unknown transition probabilities, International Journal of Adaptive Control and Signal Processing, 28(2014), 1024-1042.

[5] H. Gao, T. Chen, J. Lam, A new delay system approach to network-based control, Automatica, 44(2008),

39-52.

[6] G. C. Walsh, H. Ye, L G. Bushnell, Stability analysis of networked control systems, IEEE Trans on

Control Systems Technology, 10(2002), 438-446.

[7] S. Hu, Q. Zhu, Stochastic optimal control and analysis of stability of networked control systems with

long delay, Automatica, 39(2003),1877–1884.

[8] D. Yue, Q. L. Han, and J. Lam, Network-based robust H∞ control of a system with uncertainty,

Automatica, 4(2005), 999- 1007.

[9] Z. H. Guan, J. Huang, G. R. Chen, Stability Analysis of Networked Impulsive Control Systems, Proc. 25th

Chinese Control Conference, 2006, 2041-2044.

[10] Y. Tian, Z. Yu, Multifractal nature of network induced time delay in networked control systems,

Physics Letter A, 361(2007), 103-107.

[11] G. C. Walsh, O. Beldiman, L. G. Bushnell, Asymptotic behavior of nonlinear networked control

systems, IEEE Transactions on Automatic Control, 46(2001), 1093–1097.

[12] D. Nesic, Observer design for wired linear networked control systems using matrix inequalities,

Automatica, 44(2008), 2840-2848.

[13] S. He, H. Xu, Non-fragile finite-time filter design for time-delayed Markovian jumping systems via T-S

fuzzy model approach, Nonlinear Dynamic, 80(2015), 1159-1171.

[14] D. Huang, S. Kiong, State feedback control of uncertain networked control systems with random time

delays, IEEE Transactions on Automatic Control, 53(2008), 829-834.

[15] F. Amato, M. Ariola, P. Dorate, Finite-time stabilization via dynamic output feedback, Automatica,

42(2006), 337-342.

[16] F. Amato, M. Ariola, C, Cosentino, Finite-time control of discrete- time linear systems: Analysis and

design conditions, Automatica, 46(2010), 919-924.

94 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-94

A Trapezoidal Fuzzy Multiple Attribute Decision Making Based on Rough Sets

Zhi-Ying LVa,b,1, Ping HUANGb, Xian-Yong ZHANGc,d and Li-Wei ZHENGe

a College of Mathematics, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
b College of Management, Chengdu University of Information Technology, Chengdu, Sichuan, China
c College of Mathematics and Software Science, Sichuan Normal University, Chengdu, Sichuan, China
d Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, Sichuan, China
e College of Applied Mathematics, Chengdu University of Information Technology, Chengdu, Sichuan, China

to solve complex systems, and has wide practical application. This paper studies the FMADM of the trapezoidal fuzzy number. In order to achieve desirable decision making, a similarity measure between two trapezoidal fuzzy numbers is defined, based on a new method for ranking fuzzy numbers. A new algorithm, based on rough sets and the technique for order of preference by similarity to ideal solution (TOPSIS), is proposed to remove surplus attributes. Finally, an example is examined to demonstrate the model's use in practical problems.

tion, rough sets

Introduction

The main idea of multiple attribute decision making (MADM) problems is to rank the alternatives or choose the optimal solution. However, the available information is often imprecise or vague. In this case, a better solution is to use fuzzy numbers. Fuzzy theory [1] is able to address many decision problems that experts and decision makers struggle to respond to because of a lack of information. Over the years, many theories and applications have been proposed for solving FMADM problems [2-3]. To deal with these fuzzy situations, experts are usually encouraged to use the trapezoidal fuzzy number, which subsumes the triangular fuzzy number and the interval number as special cases. At the same time, ranking fuzzy numbers [4-5] is very important in real-time decision-making applications.

Therefore there is a need for a procedure which can rank fuzzy numbers in more condi-

1 Corresponding Author: Zhi-Ying Lv, College of Mathematics, University of Electronic Science and Technology of China, Chengdu 611731, China; College of Management, Chengdu University of Information Technology; E-mail: lvZhiying1979@163.com.

Z.-Y. Lv et al. / A Trapezoidal Fuzzy Multiple Attribute Decision Making Based on Rough Sets 95

tions. Ref. [6] gives a way to rank trapezoidal fuzzy numbers based on the circumcenter of centroids. This is a very practical method, which incorporates both the mode and the spreads of the fuzzy numbers.

Studies have found that correlations among the attributes seriously affect the scientific objectivity and fairness of the evaluation, so attribute reduction [7-8] is an essential subject in MADM. Rough set theory, initiated by Pawlak in 1982 [9], is a useful tool for studying attribute reduction problems. However, few studies have been conducted on the problem of attribute reduction in fuzzy decision making.

In this paper, a new FMADM method is presented, in which the distance between two trapezoidal fuzzy numbers is defined and a fuzzy-number attribute reduction method based on the TOPSIS method and rough sets [10] is proposed.

1. Preliminaries

In this section, we give the concepts of rough sets and trapezoidal fuzzy numbers and

their extensions.

Let U be a finite set and let R be an equivalence relation on U; the equivalence class containing x is denoted by [x]_R.

Let S = (U, C, V, f) be an information system, where C is the set of attributes and V is the domain of attribute values, V = ∪_{c∈C} V_c, where V_c is a nonempty set of values of attribute c ∈ C, called the domain of c. Here f : U × C → V is an information function that maps an object in U to exactly one value in V_c, so that for every c ∈ C and x ∈ U, f(x, c) ∈ V_c.

For B ⊆ C, denote [x]_{R_B} = {y ∈ U | (f(x, b), f(y, b)) ∈ R, ∀b ∈ B} and R_B = {[x]_{R_B} : x ∈ U}; that is, R_B is the set of equivalence classes induced by B. The lower and upper approximations of X ⊆ U with respect to B ⊆ C are defined as:

apr_B(X) = {x ∈ U | [x]_{R_B} ⊆ X} and apr̄_B(X) = {x ∈ U | [x]_{R_B} ∩ X ≠ ∅},

and the approximation quality is r_B^U = |apr_B(U)| / |U|.

Because r_C^U = 1, if there exists c_k ∈ C such that r_{C−{c_k}}^U = 1, then c_k is a dispensable (superfluous) attribute and C − {c_k} is a reduction of C; otherwise c_k is indispensable. The set of all indispensable attributes is the core of C, denoted by core(C).

Below, we briefly review the definition of the trapezoidal fuzzy number and the ranking method.

Definition 1. The membership function of a trapezoidal fuzzy number P̃ = (a, b, c, d; ω) is given by:

μ_P̃(t) = ω(t − a) / (b − a),  a ≤ t ≤ b;
μ_P̃(t) = ω,                  b ≤ t ≤ c;
μ_P̃(t) = ω(d − t) / (d − c),  c ≤ t ≤ d;
μ_P̃(t) = 0,                  otherwise;

where −∞ < a ≤ b ≤ c ≤ d < +∞ and 0 < ω ≤ 1. If ω = 1, then P̃ is normalized and can be denoted by P̃ = (a, b, c, d), which is shown in Figure 1.
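The piecewise definition above translates directly into code. A minimal sketch (the function name and sample numbers are illustrative, not from the paper):

```python
def trapezoid_mu(t, a, b, c, d, w=1.0):
    """Membership grade of t under the trapezoidal fuzzy number (a, b, c, d; w)."""
    if a <= t <= b:
        # rising edge; degenerate a == b collapses to height w
        return w * (t - a) / (b - a) if b > a else w
    if b < t <= c:
        return w                      # flat top
    if c < t <= d:
        # falling edge; degenerate c == d collapses to height w
        return w * (d - t) / (d - c) if d > c else w
    return 0.0                        # outside the support
```

For the normalized number (0, 1, 2, 3), for instance, trapezoid_mu(1.5, 0, 1, 2, 3) returns 1.0 and trapezoid_mu(2.5, 0, 1, 2, 3) returns 0.5.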

We may see a trapezoidal fuzzy number as a trapezoid, which can be divided into

three plane figures. These figures are two triangles (APB and CQD) and a rectangle

(BPQC). Suppose G1, G2 , G3 are the centroids of these figures, which can form a new

triangle ( G1, G2 , G3 ).

Now, we give the definition of the circumcenter of the trapezoidal fuzzy number.

Definition 2 [6]. Let p̃ = (a, b, c, d; ω) be a generalized trapezoidal fuzzy number. The circumcenter S_p̃(x̄_0, ȳ_0) of the triangle G1G2G3 is defined as:

S_p̃(x̄_0, ȳ_0) = ( (a + 2b + 2c + d) / 6 , ( (2a + b − 3c)(2d + c − 3b) + 5ω² ) / (12ω) )    (1)

Definition 3 [6]. Based on the circumcenter of centroids S_p̃(x̄_0, ȳ_0), the ranking function of a fuzzy number is defined as:

R(p̃) = x̄_0 · ȳ_0    (2)

This represents the area of the rectangle formed by S_p̃(x̄_0, ȳ_0) and the origin. As the value of R(p̃) increases, so does the fuzzy number p̃. We can define the distance between two normalized trapezoidal fuzzy numbers as the distance between the circumcenter points of their centroids, because these points can be considered better balancing points for the trapezoidal fuzzy numbers.

Definition 4. Let P̃_1 = (a_1, b_1, c_1, d_1) and P̃_2 = (a_2, b_2, c_2, d_2) be two normalized trapezoidal fuzzy numbers, and let S_P̃1(x̄_0^1, ȳ_0^1) and S_P̃2(x̄_0^2, ȳ_0^2) be the circumcenters of the centroids of P̃_1 and P̃_2 respectively. Then the distance between P̃_1 and P̃_2 is defined by:

d(P̃_1, P̃_2) = sqrt( (x̄_0^1 − x̄_0^2)² + (ȳ_0^1 − ȳ_0^2)² )    (3)

Denote the set of all attributes by C = {c_1, c_2, …, c_m}. Assume the weight vector of the attributes is ω = (ω_1, ω_2, …, ω_m)^T, such that Σ_{j=1}^m ω_j = 1 with ω_j ≥ 0, where ω_j denotes the weight of attribute c_j. Suppose P̃ = (p̃_ij)_{n×m} is the trapezoidal fuzzy decision matrix given by the expert, where p̃_ij = (a_ij, b_ij, c_ij, d_ij) is the attribute value of the alternative x_i with respect to the attribute c_j ∈ C.

Given the fuzzy and rough theories described above, the proposed FMADM procedure

is defined as follows:

Step 1. Construct the circumcenter-of-centroids matrix O = ((x_ij, y_ij)) of P̃.

Step 2. Construct the value matrix Q = (q_ij) of P̃.

Step 3. Determine the positive ideal and negative ideal solutions using the following steps:

p̃_j^+ = {p̃_ij : i ∈ N, q_ij = max_{i∈N} q_ij} and p̃_j^- = {p̃_ij : i ∈ N, q_ij = min_{i∈N} q_ij}    (4)

Then,

A^P = {p̃_1^+, p̃_2^+, …, p̃_m^+} and A^N = {p̃_1^-, p̃_2^-, …, p̃_m^-}    (5)

Step 4. The distances between p̃_ij and the positive and negative ideal values are defined as:

d_ij^+ = d(p̃_ij, p̃_j^+) = sqrt( (x_j^+ − x_ij)² + (y_j^+ − y_ij)² ) and d_ij^- = d(p̃_ij, p̃_j^-) = sqrt( (x_j^- − x_ij)² + (y_j^- − y_ij)² )    (6)

where (x_j^+, y_j^+) and (x_j^-, y_j^-) are the circumcenters of the centroids of p̃_j^+ and p̃_j^- respectively. Then calculate the similar degrees t_ij between p̃_ij and the ideal solutions and construct the matrix T = (t_ij)_{n×m}, where

t_ij = d_ij^- / (d_ij^+ + d_ij^-)    (7)

Step 5. Construct a judgment matrix M = (m_ij)_{n×m} from T = (t_ij)_{n×m}, where

m_ij = 0 if 0 ≤ t_ij < 0.3;  m_ij = 1 if 0.3 ≤ t_ij < 0.6;  m_ij = 2 if 0.6 ≤ t_ij ≤ 1.    (8)
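Steps 3-5 chain together mechanically once the circumcenter matrix O (and hence the value matrix Q) is available. A self-contained sketch on a hypothetical 3-alternative, 2-attribute instance; the circumcenter pairs below are made up for illustration:

```python
import math

# Hypothetical circumcenter matrix O[i][j] = (x, y), 3 alternatives x 2 attributes.
O = [[(0.74, 0.41), (0.52, 0.39)],
     [(0.58, 0.42), (0.56, 0.49)],
     [(0.75, 0.41), (0.48, 0.41)]]
# Value matrix Q via Eq. (2): q_ij = x_ij * y_ij.
Q = [[x * y for (x, y) in row] for row in O]

n, m = len(O), len(O[0])
# Step 3: row index of the positive/negative ideal per attribute (Eqs. (4)-(5)).
pos = [max(range(n), key=lambda i: Q[i][j]) for j in range(m)]
neg = [min(range(n), key=lambda i: Q[i][j]) for j in range(m)]

def dist(p, q):
    """Euclidean distance between two circumcenter points (Eq. (6))."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Step 4: similar degree t_ij = d- / (d+ + d-) (Eq. (7)).
T = [[dist(O[i][j], O[neg[j]][j]) /
      (dist(O[i][j], O[pos[j]][j]) + dist(O[i][j], O[neg[j]][j]))
      for j in range(m)] for i in range(n)]

# Step 5: judgment matrix via the thresholds of Eq. (8).
M = [[0 if t < 0.3 else 1 if t < 0.6 else 2 for t in row] for row in T]
```

Note that the alternative attaining the positive ideal in a column gets t_ij = 1 (so m_ij = 2), and the one attaining the negative ideal gets t_ij = 0 (so m_ij = 0).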

Step 6. Construct the equivalence relation R_B for B ⊆ C. For x_i ∈ U, [x_i]_{R_B} = {x_k : m_kj = m_ij, ∀c_j ∈ B}, so R_B = {[x_i]_{R_B} : i ∈ N}. The lower approximation of U with respect to B is defined by apr_B(U) = {x_i : [x_i]_{R_B} = [x_i]_{R_C}, i ∈ N}, and the approximation quality is r_B^U = |apr_B(U)| / |U|. Because r_C^U = 1, if there exists c_k ∈ C such that r_{C−{c_k}}^U = 1, then c_k is dispensable and C − {c_k} is a reduction of C.

Step 7. Give the weight vector ω = (ω_1, ω_2, …, ω_t) of the set of all non-superfluous attributes, then calculate the values of all alternatives:

d_i = Σ_{j=1}^t ω_j t_ij  (i = 1, 2, …, n)    (9)

Then choose the best alternative based on the ranking of the values d_i.

In this section, we present an example to show how the given model works in practice. The fuzzy multiple attribute decision problem with trapezoidal fuzzy numbers involves a company making an investment decision. Let us consider an investment company which wants to make the best investment decision for a given sum of money.

There is a panel with eight possible alternatives U = {x1, x2, …, x8} in which the company can invest. Each alternative is assessed on six attributes C = {c1, c2, …, c6}. The decision makers compare these eight companies with respect to the attributes, then construct the decision matrix P̃ = (p̃_ij)_{8×6}, shown below:

x1: (0.7,0.72,0.75,0.8)  (0.4,0.45,0.6,0.63)  (0.7,0.72,0.82,0.9)   (0.5,0.5,0.64,0.72)  (0.18,0.19,0.2,0.21)   (0.09,0.1,0.14,0.17)
x2: (0.54,0.57,0.59,0.6) (0.5,0.52,0.6,0.63)  (0.5,0.62,0.62,0.7)   (0.5,0.5,0.54,0.6)   (0.18,0.19,0.2,0.21)   (0.09,0.09,0.098,0.1)
x3: (0.7,0.73,0.78,0.79) (0.5,0.52,0.6,0.63)  (0.6,0.72,0.8,0.9)    (0.8,0.85,0.9,0.92)  (0.21,0.23,0.25,0.27)  (0.1,0.1,0.15,0.2)
x4: (0.6,0.63,0.66,0.73) (0.4,0.45,0.6,0.63)  (0.7,0.72,0.86,0.9)   (0.44,0.5,0.66,0.7)  (0.17,0.18,0.18,0.19)  (0.09,0.12,0.15,0.18)
x5: (0.72,0.75,0.77,0.8) (0.7,0.73,0.81,0.83) (0.7,0.72,0.8,0.83)   (0.7,0.7,0.74,0.8)   (0.19,0.21,0.24,0.26)  (0.1,0.16,0.18,0.22)
x6: (0.54,0.57,0.59,0.6) (0.4,0.46,0.5,0.56)  (0.7,0.75,0.8,0.92)   (0.4,0.5,0.54,0.62)  (0.18,0.19,0.2,0.21)   (0.1,0.12,0.13,0.13)
x7: (0.6,0.63,0.69,0.71) (0.5,0.52,0.7,0.74)  (0.41,0.45,0.5,0.51)  (0.44,0.5,0.66,0.7)  (0.18,0.19,0.21,0.23)  (0.12,0.18,0.21,0.22)
x8: (0.72,0.75,0.77,0.8) (0.5,0.52,0.6,0.63)  (0.71,0.72,0.86,0.9)  (0.7,0.7,0.74,0.8)   (0.19,0.21,0.24,0.26)  (0.1,0.16,0.18,0.22)

Step 1. Using Eq. (1), construct the circumcenter-of-centroids matrix O = ((x_ij, y_ij)):

x1: (0.7400, 0.4146) (0.5217, 0.3933) (0.7800, 0.4036) (0.5833, 0.3964) (0.2267, 0.4161) (0.1233, 0.4146)
x2: (0.5767, 0.4159) (0.5617, 0.4907) (0.6133, 0.4135) (0.5300, 0.4143) (0.1950, 0.4165) (0.0943, 0.4166)
x3: (0.7517, 0.4137) (0.5617, 0.4907) (0.7567, 0.3991) (0.8700, 0.4127) (0.2400, 0.4158) (0.1333, 0.4135)
x4: (0.6517, 0.4138) (0.5217, 0.3933) (0.7933, 0.3975) (0.5767, 0.3887) (0.1800, 0.4166) (0.1350, 0.4148)
x5: (0.7600, 0.4155) (0.7683, 0.4097) (0.7617, 0.4097) (0.7300, 0.4143) (0.2250, 0.4153) (0.1667, 0.4146)
x6: (0.5767, 0.4159) (0.4800, 0.4119) (0.7867, 0.4085) (0.5167, 0.4092) (0.1950, 0.4165) (0.1217, 0.4165)
x7: (0.6583, 0.4123) (0.6133, 0.3867) (0.4700, 0.4134) (0.5767, 0.3887) (0.2017, 0.4160) (0.1867, 0.4147)
x8: (0.7600, 0.4155) (0.5617, 0.4097) (0.7950, 0.3983) (0.7300, 0.4143) (0.2250, 0.4153) (0.1667, 0.4146)

Step 2. Based on Eq. (2), construct the value matrix Q = (q_ij)_{8×6} of P̃ as follows:

x1: 0.3068 0.2052 0.3148 0.2312 0.0943 0.0511
x2: 0.2398 0.2756 0.2536 0.2196 0.0812 0.0393
x3: 0.3110 0.2756 0.3020 0.3590 0.0999 0.0551
x4: 0.2697 0.2052 0.3153 0.2242 0.0750 0.0560
x5: 0.3158 0.3148 0.3121 0.3024 0.0934 0.0691
x6: 0.2398 0.1977 0.3214 0.2114 0.0812 0.0507
x7: 0.2714 0.2372 0.1943 0.2242 0.0839 0.0774
x8: 0.3158 0.2301 0.3166 0.3024 0.0934 0.0691

Step 3. Based on Eqs. (4)-(5), determine the positive ideal and negative ideal solutions:

A^P = {(0.72,0.75,0.77,0.8), (0.7,0.73,0.81,0.83), (0.71,0.72,0.86,0.9), (0.8,0.85,0.9,0.92), (0.21,0.23,0.25,0.27), (0.12,0.18,0.21,0.22)},
A^N = {(0.54,0.57,0.59,0.6), (0.4,0.46,0.5,0.56), (0.41,0.45,0.5,0.51), (0.4,0.5,0.54,0.62), (0.17,0.18,0.18,0.19), (0.09,0.09,0.098,0.1)}.

Step 4. Construct the similar degree matrix T = (t_ij)_{8×6} based on Eqs. (6)-(7) as follows:

x1: 0.8908 0.1559 0.9512 0.1910 0.7783 0.3144
x2: 0      0.2835 0.4401 0.0402 0.2500 0
x3: 0.9537 0.2835 0.8823 1      1      0.4228
x4: 0.4092 0.1559 0.9942 0.1773 0      0.4407
x5: 1      1      0.8923 0.6038 0.7500 0.7836
x6: 0      0      0.9601 0      0.2500 0.2965
x7: 0.4453 0.4640 0      0.1773 0.3618 1
x8: 1      0.2835 1      0.6038 0.7500 0.7836

Step 5. Based on Eq. (8), construct the judgment matrix M = (m_ij)_{8×6}:

x1: 2 0 2 0 2 1
x2: 1 0 1 0 0 0
x3: 2 0 2 2 2 1
x4: 1 0 2 0 0 0
x5: 2 2 2 2 2 2
x6: 0 0 2 0 0 0
x7: 1 1 0 0 1 2
x8: 2 0 2 2 2 2

Step 6. Compute the equivalence classes R_B, where B ⊂ C:

R_{C−{c1}} = {{x1}, {x2}, {x3}, {x4, x6}, {x5}, {x7}, {x8}},
R_{C−{c2}} = {{x1}, {x2}, {x3}, {x4}, {x5, x8}, {x6}, {x7}},
R_{C−{c3}} = {{x1}, {x2, x4}, {x3}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c4}} = {{x1, x3}, {x2}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c5}} = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c6}} = {{x1}, {x2}, {x3, x8}, {x4}, {x5}, {x6}, {x7}},
R_{C−{c4,c5}} = {{x1, x3}, {x2}, {x4}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c3,c5}} = {{x1}, {x2, x4}, {x3}, {x5}, {x6}, {x7}, {x8}},
R_{C−{c2,c5}} = {{x1}, {x2}, {x3}, {x4}, {x5, x8}, {x6}, {x7}},
R_{C−{c5,c6}} = {{x1}, {x2}, {x3, x8}, {x4}, {x5}, {x6}, {x7}},
R_C = {{x1}, {x2}, {x3}, {x4}, {x5}, {x6}, {x7}, {x8}}.

Thus r_{C−{c5}}^U = 1; therefore c5 is a dispensable attribute, and core(C) = {c1, c2, c3, c4, c6}. So we can delete the fifth column (attribute c5) of the matrix T.
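The reduction above can be reproduced mechanically: partition the alternatives by their rows of the judgment matrix restricted to an attribute subset, and an attribute is dispensable exactly when dropping it leaves the partition unchanged. A sketch using the judgment matrix M of this example (helper names are illustrative):

```python
# Judgment matrix from Step 5: rows are alternatives x1..x8, columns c1..c6.
M = [
    [2, 0, 2, 0, 2, 1],
    [1, 0, 1, 0, 0, 0],
    [2, 0, 2, 2, 2, 1],
    [1, 0, 2, 0, 0, 0],
    [2, 2, 2, 2, 2, 2],
    [0, 0, 2, 0, 0, 0],
    [1, 1, 0, 0, 1, 2],
    [2, 0, 2, 2, 2, 2],
]

def partition(M, cols):
    """Equivalence classes of alternatives under the attribute subset `cols`."""
    classes = {}
    for i, row in enumerate(M):
        key = tuple(row[j] for j in cols)
        classes.setdefault(key, []).append(i)
    return sorted(sorted(c) for c in classes.values())

full = set(range(6))
R_C = partition(M, sorted(full))              # all singletons here
quality = {}
for k in range(6):
    R_B = partition(M, sorted(full - {k}))
    # quality r_B = 1 iff dropping attribute c_{k+1} keeps every class unchanged
    quality[k] = (R_B == R_C)

dispensable = [k + 1 for k, same in quality.items() if same]
print(dispensable)   # -> [5]: only c5 can be removed, so core(C) = {c1,c2,c3,c4,c6}
```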

Step 7. Let ω = (0.18, 0.24, 0.16, 0.23, 0.19) be the weight vector of {c1, c2, c3, c4, c6}. Then Eq. (9) gives the values of the alternatives as follows:

d_1 = 0.0511, d_2 = 0.0393, d_3 = 0.0551, d_4 = 0.0560, d_5 = 0.0691, d_6 = 0.0507, d_7 = 0.0774, d_8 = 0.0691.

Therefore, we can conclude that the most desirable alternative is x7.

4. Conclusion

In this article, a new fuzzy multiple attribute decision making method is proposed, in which the attribute values are trapezoidal fuzzy numbers. An attribute reduction method is proposed based on the distance defined between two trapezoidal fuzzy numbers and on rough sets, which can improve the accuracy of the evaluation. In future research, the decision model presented in this paper will be extended to interval type-2 fuzzy values based on Ref. [10].

Acknowledgment

This paper is supported by the National Natural Science Foundation of China (No. 61673285, No. 61203285, No. 41601141); the Soft Science Project of Sichuan Province (2016ZR0095); the Soft Science Project of the Technology Bureau in Chengdu (2015-RK00-00241-ZF); the high-level research team of the major projects division of Sichuan Province (Sichuan letter [2015] No. 17-5); and the Project of Chengdu University of Information Technology (No. CRF201508, CRF201615).

References

[2] Z.Y. Lv, X.N. Liang, X.Z. Liang, L.W. Zheng, A fuzzy multiple attribute decision making method based on possibility degree, 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery, 2015, 450-454.
[3] D.F. Li, Multiple attribute decision making method using extended linguistic variables, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 17(2009), 793-806.
[4] G. Facchinetti, R.G. Ricci and S. Muzzioli, Note on ranking fuzzy triangular numbers, International Journal of Intelligent Systems, 13(1998), 613-622.
[5] Z.S. Xu and Q.L. Da, Possibility degree method for ranking interval numbers and its applications, Journal of Systems Engineering, 18(1)(2003), 67-70.
[6] P.B. Rao and N.R. Shanker, Ranking fuzzy numbers with an area method using circumcenter of centroids, Fuzzy Information and Engineering, 1(2013), 3-18.
[7] Z.Y. Lv, T.M. Huang and F.X. Jin, Fuzzy multiple attribute lattice decision making method based on the elimination of redundant similarity indices, Mathematics in Practice and Theory, 43(10)(2013), 173-181.
[8] X.Y. Zhang and D.Q. Miao, Quantitative/qualitative region-change uncertainty/certainty in attribute reduction, Information Sciences, 334-335(2016), 174-204.
[9] Z. Pawlak, Rough sets, International Journal of Computer and Information Sciences, 11(1982), 341-356.
[10] L. Dymova, P. Sevastjanov and A. Tikhonenko, An interval type-2 fuzzy extension of the TOPSIS method using alpha cuts, Knowledge-Based Systems, 83(2015), 116-127.

102 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-102

Price Momentum and Market Capitalization

Ratchata PEACHAVANISH1

Department of Computer Science, Thammasat University, Pathum Thani, Thailand

requiring complex decision making under uncertainty. This paper proposes a

method that applies fuzzy rule-based inference to rank stocks based on price

momentum and market capitalization. Experiments performed on Thai stock

market data showed that high-momentum stocks significantly outperformed the

market index benchmark, and that stocks of companies with small market

capitalization performed better than larger ones. Fuzzy rule-based inference was

applied to combine both the momentum factor and the market capitalization factor,

with different sets of rules for different prevailing market conditions. The result

produced a higher investment return than using either momentum or market

capitalization alone.

Introduction

Stock market investing is a high-risk activity with a potentially high reward, requiring

complex decision making based on imprecise and incomplete information under

uncertainty. Typically, two analytical approaches are utilized in investment decision

making: fundamental analysis and technical analysis. Decisions based on fundamental

analysis primarily consider the business entity represented by a stock. The information

under consideration includes the nature of the business, its profitability, its

competitiveness, and most importantly its financial standing through detailed study of

its financial statements. For technical analysis, a stock is treated separately from the

business entity. Only stock price movements and patterns generated by them are used

in making trading decisions. Technical analysis views price movements as being

governed by supply and demand of market participants and aims to exploit them.

This paper proposes a technical analysis-based method that applies fuzzy rule-

based inference on stock price momentum and market capitalization (company size),

with different sets of rules for different prevailing market conditions. The method was

tested on the Stock Exchange of Thailand.

1 Corresponding Author: Ratchata PEACHAVANISH, Department of Computer Science, Thammasat University, Pathum Thani, Thailand; E-mail: rp@cs.tu.ac.th.

R. Peachavanish / Fuzzy Rule-Based Stock Ranking 103

1. Related Works

There is a large and diverse body of research literature on computerized stock market

investing. Techniques in soft computing, fuzzy logic, machine learning, and traditional

data mining have been applied to address various aspects of stock trading, utilizing

both fundamental analysis and technical analysis. Support vector machine and genetic

algorithm were applied on business financial data to perform stock selection that can

outperform market benchmark [1, 2]. Fuzzy logic was applied on stock price

movements to market time stock trades [3], to create a new technical indicator that

incorporated investor risk tendency [4], and to assist in portfolio management [5, 6].

Machine learning experiments on technical analysis-based trading conducted by [7] did

not outperform the market benchmark when using transaction costs. In addition, using

sentiment data obtained from social networks to assist in stock market investing has

also been attempted [8, 9]. A recent comprehensive review of works using evolutionary

computing methods can be found in [10].

Stock markets in different regions have different rules and characteristics. Highly-

developed and efficient markets, such as the New York Stock Exchange, differ greatly

from emerging markets like the Stock Exchange of Thailand. In smaller markets,

extreme price movements are more common as a few well-funded participants can

dictate the market direction in the short term and affect market volatility. This is

especially true for market participants that are classified as foreign fund flows [11].

Lack of regulation and enforcement against insider trading in emerging markets like

Thailand also makes the market inefficient and unfair [12]. These differences make

comparisons among research studies difficult. A working strategy under one market

environment may not be effective in another. Nevertheless, the industry-standard way

of judging an investment strategy is to compare the investment return against the

market index benchmark. Most mutual funds, in the long term, failed to outperform the

market [13]. The method proposed in this paper provides a superior investment return

to the market index. It is described in the next section.

2. Method

The strategy proposed in this paper is based on a key technical analysis principle: price

moves in trend and has momentum. This momentum effect, which implies that stock

price tends to continue on its current direction due to inertia, has been observed in

stock markets [14, 15]. Price reversal then occurs after the momentum weakens.

According to this principle, buying stocks with strong upward momentum is likely to

give superior result to buying stocks with weaker or downward momentum. The

strategy is then to make trading decisions based on a technical indicator that reflects

stock price momentum, which by definition is computed from past price series. This

reactive approach therefore makes no attempt to explicitly forecast future prices, but rather takes actions based on past price behavior.

Additionally, past evidence suggested that company’s market capitalization, or its

size, also determines the characteristic of its stock return [16]. In general, stocks of

small companies (the so called “small caps” stocks) tend to be far more volatile than

those of large, established companies (“big caps”). This is simply due to the tendency

for small companies to grow faster, albeit with higher risk. During a bull market, small-


cap stocks as a group far outperform big-cap stocks. On the other hand, investors prefer

the relative safety of big-cap stocks during an economic downturn or a bear market.

To see how trading using momentum and market capitalization can provide

additional returns above the market index, experiments were performed on Thai

stocks spanning January 2012 to July 2016. The pool of stocks for the experiments

comprised all constituents of the Stock Exchange of Thailand’s SET100 index. These

stocks are the 100 largest and most liquid stocks in the market (SET100 members are

updated semiannually). These relatively large stocks are considered investment grade

and are least susceptible to manipulations. The daily closing price data of the stocks

were obtained from the SETSMART system [17]. The experiments were conducted

using custom-written software implemented in the C# language and Microsoft SQL

Server.

The momentum indicator used in the experiment was the Relative Strength Index

(RSI) [18], a standard technical indicator widely-used by stock traders for measuring

the strength of stock price movements. The RSI is a bounded oscillating indicator

calculated using past n-period closing price data series (1).

RSI = 100 − 100 / (1 + G_n / L_n)    (1)

where G_n and L_n are the average gain and average loss over the past n trading periods:

G_n = (1/n) Σ_{i=1}^{n} g_i,  L_n = (1/n) Σ_{i=1}^{n} l_i,
g_i = close_i − close_{i−1} if close_i > close_{i−1}, and g_i = 0 if close_i ≤ close_{i−1};
l_i = close_{i−1} − close_i if close_{i−1} > close_i, and l_i = 0 if close_{i−1} ≤ close_i.

The RSI is effectively a ratio of average gain to average loss during a given past n

consecutive trading periods. An RSI value is bounded between 0 and 100 where a value

higher than 50 indicates an upward momentum and a value lower than 50 indicates a

downward momentum. An extreme value on either end indicates an overbought or an

oversold condition, often used by traders to identify point of price reversal. For this

experiment, the 60-day RSI was chosen.

For trading, the portfolio was given 100 million Thai Baht of cash for the initial

stock purchase. The algorithm selected a quartile of 25 stocks from the pool of 100

stocks ranked by 60-day RSI. They were then purchased on an equal weight basis using

all available cash and held on to for 20 trading days (one month). The process was then

repeated – the algorithm chose a new group of stocks and the portfolio was readjusted

to hold on only to them. Trading commission fees at retail rate were incorporated into

the experiments.
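The monthly selection rule described above reduces to ranking by RSI and splitting the available cash equally. A sketch with hypothetical tickers and helper names (commissions and the 20-day holding loop are omitted):

```python
def top_quartile(rsi60, k=25):
    """Tickers of the k stocks with the highest 60-day RSI.
    `rsi60` maps ticker -> 60-day RSI value."""
    return sorted(rsi60, key=rsi60.get, reverse=True)[:k]

def equal_weight_orders(cash, picks, prices):
    """Shares to buy for each pick on an equal cash-weight basis (fractional)."""
    budget = cash / len(picks)                 # equal cash slice per stock
    return {s: budget / prices[s] for s in picks}
```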

Similarly, the same 100 stocks, this time ranked by market capitalization, were

divided into four quartiles for the algorithm to choose from. However, since the weight

distribution of stocks in the market was nonlinear, each of the four quartiles contained

different numbers of stocks: the first quartile comprised the 4 largest stocks in the

market, the second quartile comprised the next 8 largest stocks, the third quartile

comprised the next 16 largest stocks, and the last quartile comprised the remaining 72

stocks. In other words, every quartile weighted approximately the same when the

market capitalizations of its component stocks are summed.
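The nonlinear split into groups of 4, 8, 16 and 72 stocks is a simple slicing rule. A sketch (`cap_quartiles` is an illustrative name; the input is assumed sorted from largest to smallest market cap):

```python
def cap_quartiles(stocks_by_cap_desc):
    """Split 100 stocks, sorted by market cap (largest first), into the four
    groups used in the experiment; each group carries roughly equal total cap."""
    sizes = (4, 8, 16, 72)
    groups, start = [], 0
    for s in sizes:
        groups.append(stocks_by_cap_desc[start:start + s])
        start += s
    return groups
```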


The results of the experiments are shown in Table 1. Monthly trading based on 60-

day RSI momentum indicator significantly outperformed the market index. Small-cap

stocks outperformed big-cap stocks.

Table 1. Portfolio returns based on monthly trading using momentum and market capitalization, compared to

the return of the SET100 market index benchmark.

Group By Momentum By Market Capitalization

First Quartile 126.61 % 9.40 %

Second Quartile 68.82 % 29.82 %

Third Quartile 32.12 % 76.96 %

Fourth Quartile -5.29 % 65.31 %

SET100 40.40 % 40.40 %

Experiments using momentum and market capitalization have provided the basis for

stock selection: buy small-cap stocks with high momentum. However, this strategy

does not work during market downtrend. While small-cap stocks as a group outperform

the market during normal times, they severely underperform during market downtrends

due to their lower liquidity. In addition, stocks with high momentum are indicative of

being overbought and have a much greater chance of sudden and strong price reversal.

Price momentum, company size as measured by market capitalization, and

prevailing market condition are the three dimensions that influence stock price

behavior. Each has inherently vague and subjective degrees of measure and so fuzzy

logic [19] is an appropriate tool to assist in the decision-making process. For the

proposed method, fuzzy rules were constructed based on these three factors with

membership functions shown in Figure 1 and fuzzy rule matrix shown in Figure 2. The

60-day RSI indicator was used to indicate both the momentum of stocks and the

prevailing market condition (bull market is characterized by a high RSI value, and vice

versa). There were three linguistic values expressing the momentum – “Weak”,

“Moderate”, and “Strong”, with a typical non-extreme 60-day RSI value ranging

between 40 and 60. For company size, relative ranking of market capitalization was

used instead of the absolute market capitalization of a company. The largest 50 stocks

out of 100 were considered “Large” and “Mid”, with overlapping fuzzy memberships.

The remaining half was considered “Mid” and “Small”, also with overlapping fuzzy

memberships. For output, there were five levels of stock purchase ratings in linguistic

terms: “Strong Buy” (SB), “Buy” (B), “Neutral” (N), “Sell” (S), and “Strong Sell” (SS),

having overlapping numerical scoring ranges between 0 and 10.

Figure 1. Fuzzy membership functions for momentum as measured by RSI (left), company size as measured by market capitalization (middle), and purchase rating of stock (right).

Mamdani-type [20] fuzzy inference was used to determine stock purchase rating.

For each rule, the intersection between antecedents was evaluated. Consequents of


rules were then combined using Root-Sum-Square method and the Center of Gravity

defuzzification process was performed to obtain the final crisp stock purchase rating.

The Fuzzy Framework [21] C# library was used to implement the fuzzy logic rule-

based algorithm.

Figure 2. Fuzzy rules for different market conditions as measured by momentum (RSI): weak market (left), moderate market (middle), and strong market (right).

During strong market condition, money should be allocated first to small-cap

stocks with strong momentum and second to mid-cap stocks, also with strong

momentum. During weak market condition, small-cap stocks should be avoided and

priority should be given to big-cap stocks with strong momentum. For moderate market

condition, desirability of a stock was decided on its momentum.
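A minimal sketch of this inference pipeline follows. The membership breakpoints and the strong-market rule table below are illustrative assumptions reconstructed from the prose and the tick values of Figure 1; the paper's actual shapes and rules are those of Figures 1 and 2:

```python
import math

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def shoulder_left(x, a, b):   # 1 at or below a, falling to 0 at b
    return 1.0 if x <= a else max(0.0, (b - x) / (b - a))

def shoulder_right(x, a, b):  # 0 at or below a, rising to 1 at b
    return 1.0 if x >= b else max(0.0, (x - a) / (b - a))

def momentum_sets(rsi):       # assumed breakpoints around the 40-60 RSI band
    return {"weak": shoulder_left(rsi, 40, 50),
            "moderate": tri(rsi, 40, 50, 60),
            "strong": shoulder_right(rsi, 50, 60)}

def size_sets(cap_rank):      # rank 1 = largest of the 100 stocks (assumed scale)
    return {"large": shoulder_left(cap_rank, 10, 50),
            "mid": tri(cap_rank, 10, 50, 90),
            "small": shoulder_right(cap_rank, 50, 90)}

# Output label centroids on the 0-10 purchase-rating scale.
CENTROID = {"SS": 1, "S": 3, "N": 5, "B": 7, "SB": 9}

# Illustrative rule matrix for a *strong* market, following the prose: small
# caps with strong momentum first, then mid caps with strong momentum.
RULES_STRONG_MARKET = {
    ("strong", "small"): "SB", ("strong", "mid"): "B", ("strong", "large"): "N",
    ("moderate", "small"): "B", ("moderate", "mid"): "N", ("moderate", "large"): "N",
    ("weak", "small"): "S", ("weak", "mid"): "S", ("weak", "large"): "SS",
}

def rate(rsi, cap_rank, rules=RULES_STRONG_MARKET):
    """Crisp purchase rating via min-AND rules, Root-Sum-Square label
    combination, and a centroid-style defuzzification."""
    mom, size = momentum_sets(rsi), size_sets(cap_rank)
    strengths = {}
    for (m, s), label in rules.items():
        w = min(mom[m], size[s])                       # rule firing strength
        strengths[label] = strengths.get(label, 0.0) + w * w
    rss = {lab: math.sqrt(v) for lab, v in strengths.items()}
    total = sum(rss.values())
    if total == 0:
        return 5.0                                     # no rule fired: neutral
    return sum(rss[lab] * CENTROID[lab] for lab in rss) / total
```

With these assumed shapes, a small-cap stock (rank 90) with a strong 60-day RSI of 60 rates 9.0 ("Strong Buy"), while a large-cap stock (rank 5) in a weak reading of 40 rates 1.0 ("Strong Sell").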

Portfolio readjustments were performed in the same manner as in the previous experiments. The algorithm chose the top quartile of stocks with the best purchase ratings computed from the fuzzy rules. The portfolio returned 161.76%, which was better than the best return from the experiment using momentum alone (126.61%) or market capitalization alone (76.96%). The fuzzy rule-based approach also outperformed both the SET100 index benchmark (40.40%) and one of the best actively-managed mutual funds in the industry ("BTP" by BBL Asset Management Co., Ltd., at 124.43%). The results are shown in Figure 3.

Figure 3. Investment returns by algorithms: best result from momentum-only strategy (126.61%), best result from market capitalization-only strategy (76.96%), and fuzzy rule-based method (161.76%). Returns of the SET100 index benchmark and "BTP" mutual fund are shown for comparison.

This paper proposes a method that uses fuzzy rule-based inference to rank stocks based

on a combination of price momentum, company’s market capitalization, and prevailing

market condition. The method yields superior return to both the market index

benchmark as well as an industry-leading mutual fund. The method can be further


improved in the future by incorporating the ability to hold cash during market

downturns. Additionally, short-term indicators may also be used to detect imminent

weakening or strengthening of momentum – information that is potentially useful in

making trading decisions.

References

[1] H. Yu, R. Chen, and G. Zhang, A SVM stock selection model within PCA, 2nd International Conference on

Information Technology and Quantitative Management, 2014.

[2] C. Huang, A hybrid stock selection model using genetic algorithms and support vector regression,

Applied Soft Computing, 12 (2012), 807-818.

[3] C. Dong, F. Wan, A fuzzy approach to stock market timing, 7th International Conference on Information,

Communications and Signal Processing, 2009.

[4] A. Escobar, J. Moreno, S. Munera, A technical analysis indicator based on fuzzy logic, Electronic Notes

in Theoretical Computer Science 292 (2013), 27-37.

[5] K. Chourmouziadis, P. Chatzoglou, An intelligent short term stock trading fuzzy system for assisting

investors in portfolio management, Expert Systems with Applications, 43 (2016), 298-311.

[6] M. Yunusoglu, H. Selim, A fuzzy rule based expert system for stock evaluation and portfolio

construction: An application to Istanbul Stock Exchange, Expert Systems with Applications, 40 (2013),

908-920.

[7] A. Andersen, S. Mikelsen, A novel algorithmic trading framework applying evolution and machine

learning for portfolio optimization, Master’s Thesis, Norwegian University of Science and Technology,

2012.

[8] J. Bollen, H. Mao, X. Zeng, Twitter mood predicts the stock market, Journal of Computational Science, 2

(2011), 1-8.

[9] L. Wang, Modeling stock price dynamics with fuzzy opinion networks, IEEE Transactions on Fuzzy

Systems, (in press).

[10] Y. Hu, K. Liu, X. Zhang, L. Su, E. W. T. Ngai, M. Liu, Application of evolutionary computation for

rule discovery in stock algorithmic trading: a literature review, Applied Soft Computing, 36 (2015), 534-

551.

[11] C. Chotivetthamrong, Stock market fund flows and return volatility, Ph.D. Dissertation, National

Institute of Development Administration, Thailand, 2014.

[12] W. Laoniramai, Insider trading behavior and news announcement: evidence from the Stock Exchange

of Thailand, CMRI Working Paper, Thai Stock Exchange of Thailand, 2013.

[13] C. Mateepithaktham, Equity mutual fund fees & performance, SEC Working Papers Forum, The

Securities and Exchange Commission, Thailand, 2015.

[14] N. Jegadeesh, S. Titman, Returns to buying winners and selling losers: implications for stock market

efficiency, Journal of Finance, 48 (1993), 65-91.

[15] R. Peachavanish, Stock selection and trading based on cluster analysis of trend and momentum

indicators, International MultiConference of Engineers and Computer Scientists, 2016.

[16] T. Bunsaisup, Selection of investment strategies in Thai stock market, Working Paper, Capital Market

Research Institute, Thailand, 2014.

[17] SETSMART (SET market analysis and reporting tool), http://www.setsmart.com.

[18] J. Welles Wilder, New concepts in technical trading systems, Trend Research, 1978.

[19] L. Zadeh, Fuzzy sets, Information and Control, 8 (1965), 338-353.

[20] E. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller,

International Journal of Man-Machine Studies, 7 (1975), 1-13.

[21] Fuzzy Framework, http://www.codeproject.com/Articles/151161/Fuzzy-Framework.

108 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-108

Adaptive Fuzzy Sliding-Mode Control of Robot and Simulation

Huan NIUa, Jie YANGa,1 and Jie-Ru CHIb

a School of Electrical Engineering, Qingdao University, Qingdao, Shandong, China

b School of Electronic and Information Engineering, Qingdao University, Qingdao, Shandong, China

Abstract. Aiming at the control of a 2-DOF joint robot, a 3D robot model is first established in ADAMS, and the dynamic equation of the robot is then deduced using the obtained parameters. The dynamic model is combined with the control system model in MATLAB/Simulink through the ADAMS/Control module to establish a co-simulation system. In order to eliminate the effect of modeling error and uncertainty signals, a sliding-mode control is proposed. In this method, a linear sliding surface ensures that the system reaches equilibrium on the sliding surface in finite time, and fuzzy control is used to compensate for the modeling error and uncertainty signals. The equivalent control law and switching control law are derived using the Lyapunov stability criterion and an exponential reaching law. The fuzzy control law and membership functions are set up from fuzzy control rules. Through online adaptive fuzzy learning, chattering is weakened. Simulation results show that the control method is effective.

Introduction

In order to achieve accurate control of multi-joint robot systems subject to modeling errors and uncertainty signals, many effective methods have been proposed. The development of robot control theory has gone through three stages: traditional control, modern control, and intelligent control. Traditional control theory mainly includes PID control, feed-forward control, and so on; modern control theory mainly includes robust control, sliding-mode control, and so on; intelligent control theory mainly includes fuzzy control, neural network control, adaptive control, etc. [1-2]. Robot control is divided into point-to-point (PTP) control and trajectory tracking control (also called continuous path control, CP). Point-to-point control only requires that the end effector of the robot move from one point to another, without taking the motion trajectory into account. In robot trajectory tracking control, the driving torque of each joint is chosen so that the position, velocity, and other state variables of the robot track a known ideal trajectory; the entire trajectory must be controlled strictly [3-6].

In recent years, fuzzy control and sliding-mode control have attracted more and more attention for their strong robustness. In sliding-mode control, designing a stable sliding surface ensures that the control system reaches

1

Corresponding Author: Jie YANG, School of Electrical Engineering, Qingdao University, 308 Ningxia Rd, Qingdao, Shandong, 266071, P.R. China; E-mail: jackiey69@sina.com.

H. Niu et al. / Adaptive Fuzzy Sliding-Mode Control of Robot and Simulation 109

the surface from any initial state within a finite time and then moves near the equilibrium point on the surface. However, the problem of chattering still exists in this method, and the upper bound of the modeling error and uncertainty signal of the control system must be known in advance, which is hard to obtain in actual robot control [7]. Fuzzy control overcomes these deficiencies: it is an effective way to eliminate the chattering of a sliding-mode control system, and its strong adaptive learning capability can also be used to weaken the uncertain signal. Therefore, sliding-mode control is combined with fuzzy control to implement trajectory tracking control, which ensures the stability and effectiveness of the control system.

In this paper, the first part introduces the establishment of the 3D model and the derivation of the dynamic equation of the robot; the second part introduces the design of the sliding-mode control system; the third part introduces the design of the adaptive fuzzy control; the fourth part presents the simulation experiment and simulation results of the robot control system; a brief summary is given at the end of the paper. These results have a certain reference value for future robot control.

Firstly, the 3D model of the robot is established in the ADAMS/View module. The robot has two robotic arms and realizes 2-DOF rotary motion in the YOZ plane. The length of each robotic arm is set to 0.225 m and the mass of each to 0.03 kg, as shown in Figure 1.

The centroid coordinates of the two arms are $x_1 = 0.1125$, $y_1 = 0.33$, $z_1 = 0$; $x_2 = 0.3375$, $y_2 = 0.39$, $z_2 = 0$. The inertial parameters of the robot are $I_{xx} = 0.1732$, $I_{yy} = 0.1588$. Using the obtained parameters, the dynamic equation of the robot is deduced:

$$M(q)\,[\ddot q_1\ \ \ddot q_2]^T + C(q,\dot q)\,[\dot q_1\ \ \dot q_2]^T + G(q) + U(t) = \tau \qquad (1)$$

In Eq.(1): $\tau$ is the control torque; $q, \dot q, \ddot q \in R^n$ are the joint position, velocity, and acceleration vectors, respectively; $M(q) = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \in R^{n\times n}$ is the inertia matrix, with $M_{22} = 0.0252$; $C(q,\dot q) \in R^{n\times n}$ is the centrifugal force matrix, whose elements satisfy $C_{111} = C_{212} = C_{221} = C_{222} = 0$, $C_{112} = C_{121} = C_{122} = 0.0026325\cos q_2 - 0.0022725\sin q_2$, and $C_{211} = -0.0026325\cos q_2 + 0.0022725\sin q_2$; $G(q) = [G_1\ \ G_2]^T$ is the gravity matrix, with $G_1 = G_2 = 0$; $U(t)$ is the modeling error and uncertainty signal, generally taken to have the same form as the input signal with an amplitude of 2%–5% of it [8].
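To make Eq.(1) concrete, the sketch below evaluates the joint torque for given joint states. Only $M_{22} = 0.0252$, the $C$ elements, and $G = 0$ are stated in the text, so the remaining inertia entries are illustrative placeholders, and the Coriolis matrix is reduced to a constant-structure 2×2 form for brevity; the function name is an assumption.

```python
import numpy as np

def robot_dynamics_torque(q, dq, ddq, U):
    """Evaluate tau = M(q) qdd + C(q, dq) qd + G(q) + U(t) for the 2-DOF arm.

    Only M22 = 0.0252, the C entries, and G = 0 are given in the text;
    the remaining inertia entries are illustrative placeholders.
    """
    q2 = q[1]
    M = np.array([[0.05, 0.03],       # placeholders for M11, M12
                  [0.03, 0.0252]])    # M21 placeholder; M22 = 0.0252 (from the text)
    h = 0.0026325 * np.cos(q2) - 0.0022725 * np.sin(q2)
    C = np.array([[0.0, h],           # constant-structure simplification of C(q, dq)
                  [-h, 0.0]])
    G = np.zeros(2)                   # G1 = G2 = 0 in the paper's setup
    return M @ ddq + C @ dq + G + U

# Zero velocity, unit acceleration, no disturbance: torque reduces to M @ ddq.
tau = robot_dynamics_torque(np.array([0.1, 0.2]), np.zeros(2),
                            np.array([1.0, 1.0]), np.zeros(2))
```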

The purpose of trajectory tracking control of the robot is to make the joint position vector consistent with the desired joint angular displacement as closely as possible [9-10]. Therefore, the sliding-mode surface is designed as Eq.(2):

$$s = \dot e + \alpha e \qquad (2)$$

In Eq.(2): $\alpha$ is the sliding-mode surface constant; $e = q - q_r$ is the tracking error and $\dot e = \dot q - \dot q_r$ is its derivative. The exponential reaching law of the sliding-mode control is designed as $\dot s = -\varepsilon\, s/\|s\| - Ks$, with $\varepsilon, K > 0$.

Combining Eq.(2) with the reaching law yields Eq.(3):

$$\tau = u_{eq} + u_{vss} \qquad (3)$$

In Eq.(3): $u_{eq} = M(q)\ddot q_r + C(q,\dot q)\dot q + G(q) + U(t) - \alpha M(q)\dot e$ and $u_{vss} = \varepsilon M(q)\, s/\|s\| + K M(q) s$, where $\varepsilon \geq \eta + \|U(t)\|$ for some small positive number $\eta$, and $\varepsilon, K$ are the parameters of the exponential reaching law.
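As a minimal sketch of these two ingredients, the helpers below compute the sliding variable of Eq.(2) and the reaching-law term; the function names are assumptions, and the default parameter values follow those used later in the paper's simulation.

```python
import numpy as np

def sliding_surface(q, dq, qr, dqr, alpha=1.0):
    """s = e_dot + alpha * e, with tracking error e = q - qr (Eq. (2))."""
    return (dq - dqr) + alpha * (q - qr)

def switching_term(s, eps=10.0, K=10000.0, delta=1e-6):
    """Exponential reaching-law term -eps * s/(|s| + delta) - K * s.

    delta regularizes s/|s| near s = 0, mirroring the |s| + 0.000001
    replacement used in the paper's simulation setup.
    """
    return -eps * s / (np.abs(s) + delta) - K * s
```

On the surface (s = 0) both terms vanish, which is what keeps the motion near the equilibrium point.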

In multi-joint robot systems, the effect of modeling error and uncertainty signals always exists, so sliding-mode control is usually combined with fuzzy control to weaken this effect and ensure the stability and effectiveness of the control system. Fuzzy reasoning is used to establish the fuzzy rules. The fuzzy sets are defined as shown in Table 1:


Table 1. Fuzzy control rule table (rows: label of s; columns: label of ṡ).

s \ ṡ   NB   NM   NS   ZO   PS   PM   PB
PB      NB   NB   NM   PM   PB   PB   PB
PM      NB   NM   NS   PS   PM   PB   PB
PS      NM   NS   NS   PS   PS   PM   PB
ZO      NM   NS   NS   ZO   PS   PS   PM
NS      PB   PM   PS   PS   NS   NS   NM
NM      PB   PB   PM   PS   NS   NM   NB
NB      PB   PB   PB   PM   NM   NB   NB

In Table 1: NB represents negative big; NM represents negative medium; NS represents negative small; ZO represents zero; PS represents positive small; PM represents positive medium; PB represents positive big. The fuzzy rules take the IF-THEN form:

$R^l$: if $s_i$ is $A$ and $\dot s_i$ is $B$, then $u_{fi}$ is $C$.

MATLAB/Simulink. The basic form is set to triangular membership function(trimf);

the range of values is set to [-3 3]; defuzzify is the method of membership degree of

average maximum.
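The rule table and triangular membership functions can be exercised as follows. This is a sketch: it uses weighted-average defuzzification rather than the paper's mean-of-maxima method, and the unit membership width and function names are assumptions.

```python
import numpy as np

LABELS = ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"]
CENTERS = dict(zip(LABELS, np.linspace(-3, 3, 7)))  # triangular MF centers on [-3, 3]

# Rule table from Table 1: rows indexed by the label of s, columns by the label of s_dot.
RULES = {
    "PB": ["NB", "NB", "NM", "PM", "PB", "PB", "PB"],
    "PM": ["NB", "NM", "NS", "PS", "PM", "PB", "PB"],
    "PS": ["NM", "NS", "NS", "PS", "PS", "PM", "PB"],
    "ZO": ["NM", "NS", "NS", "ZO", "PS", "PS", "PM"],
    "NS": ["PB", "PM", "PS", "PS", "NS", "NS", "NM"],
    "NM": ["PB", "PB", "PM", "PS", "NS", "NM", "NB"],
    "NB": ["PB", "PB", "PB", "PM", "NM", "NB", "NB"],
}

def trimf(x, center, width=1.0):
    """Triangular membership function centred at `center`."""
    return max(0.0, 1.0 - abs(x - center) / width)

def fuzzy_output(s, ds):
    """Weighted-average defuzzification over all fired rules (a simplification
    of the paper's mean-of-maxima method)."""
    num = den = 0.0
    for ls, row in RULES.items():
        for lds, lout in zip(LABELS, row):
            w = min(trimf(s, CENTERS[ls]), trimf(ds, CENTERS[lds]))
            num += w * CENTERS[lout]
            den += w
    return num / den if den else 0.0
```

For example, at the origin only the ZO/ZO rule fires, so the output is zero, while large positive s and ṡ drive the output to the PB center.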

After fuzzy control is introduced into the sliding-mode control, the control law is changed to the form of Eq.(4):

$$\tau = u_{eq} + u_{vss} + u_f \qquad (4)$$

In Eq.(4): $u_f = u_f(x\,|\,\theta) = [\,u_{f1}(x_1\,|\,\theta_1)\ \ u_{f2}(x_2\,|\,\theta_2)\,]$ is the output of the fuzzy controller; $x_i = [s_i, \dot s_i]$ is the input of the fuzzy controller; $\dot\theta_i = r_i s_i \xi(x)$ is the adaptive law; $r_i$ is the learning coefficient of the control system.
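In discrete time, the adaptive law $\dot\theta_i = r_i s_i \xi(x)$ can be integrated with a forward-Euler step, as in the sketch below; the step size, vector dimensions, and function name are illustrative assumptions.

```python
import numpy as np

def update_theta(theta, r, s, xi, dt):
    """One forward-Euler step of the adaptive law theta_dot = r * s * xi(x).

    theta : fuzzy-controller parameter vector for one joint
    r     : learning coefficient; s : sliding variable of that joint
    xi    : fuzzy basis-function vector evaluated at x = [s, s_dot]
    dt    : integration step
    """
    return theta + dt * r * s * xi
```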

For the 2-DOF robot, it is assumed that the upper bound of the modeling error and uncertainty signal is $|U_i(t)| \leq L_i$; the optimal approximation parameter of the adaptive law is $\theta_i^* = \arg\min_{\theta\in R}\big[\sup\,|u_{fi}(x_i\,|\,\theta_i) - (L_i + \eta)\,\mathrm{sign}(s)|\big]$; the adaptive error is $\tilde\theta_i = \theta_i - \theta_i^*$. The Lyapunov function is chosen as $V = \frac12 s^T M(q)s + \frac12\sum_{i=1}^{2}\frac{1}{r_i}\tilde\theta_i^T\tilde\theta_i$; taking its derivative gives

$$\dot V = \frac12 s^T\dot M(q)s + s^T M(q)\dot s + \sum_{i=1}^{2}\frac{1}{r_i}\tilde\theta_i^T\dot{\tilde\theta}_i \leq \sum_{i=1}^{2}\big[-(L_i+\eta)|s_i| + \varepsilon_i|s_i| + U_i(t)s_i - w_i|s_i|\big] \leq 0 \qquad (5)$$

The result of Eq.(5) shows that the control system is globally stable.

4. Simulation Experiment

Physical parameters of the 2-DOF robot are set as follows: sliding-mode surface parameters $\alpha_1 = \alpha_2 = 1$; exponential reaching law parameters $\varepsilon = 10$, $K = 10000$; $\|s\|$ is replaced with $\|s\| + 0.000001$ to prevent the emergence of a singularity; a memory module is used to prevent the emergence of an algebraic loop, and its parameter is set to 1; learning coefficients $r_i = [0.85\ 0.85\ 0.85\ 0.85\ 0.85\ 0.85]$; desired trajectories $q_{r1} = 1 - \cos(\pi t)$, $q_{r2} = 0.5 - 0.5\cos(\pi t)$; modeling error and uncertainty signals $U_1(t) = 0.5\sin 0.5t$, $U_2(t) = 0.1\sin 0.1t$.

An S-Function is written in MATLAB for the simulation. MATLAB is connected with ADAMS through the Control interface, and the simulation of the control system is carried out. The trace curve of joint 1, the trace curve of joint 2, and the error curve are then obtained, as shown in Figure 3, Figure 4, and Figure 5.

If PD control is used, the control law is Eq.(6):

$$\tau_i = k_{pi} e_i + k_{di} \dot e_i \qquad (6)$$


The error curve of PD control is then obtained, as shown in Figure 6.

Observation of Figure 3, Figure 4, and Figure 5 shows that the adaptive fuzzy sliding-mode control has good trajectory tracking ability. When the interfering signal is introduced, the control system is restored to steady-state operation near the equilibrium point of the sliding-mode surface (s = 0), so the control system is effective and robust. There is no obvious chattering in the simulation experiment, so the control system meets the design requirement. Comparing Figure 5 and Figure 6 shows that adaptive fuzzy sliding-mode control is superior to PD control: under the same interference signal, the anti-jamming capability of PD control is poorer, its control error increases, and its control precision drops sharply. This also verifies the validity of the adaptive fuzzy sliding-mode control.

5. Conclusions

To address the position control of a 2-DOF joint robot in the presence of modeling error and uncertainty signals, adaptive fuzzy sliding-mode control is proposed. Simulation experiments are conducted in MATLAB and ADAMS, and the results of adaptive fuzzy sliding-mode control are compared with PD control. The simulation results show that adaptive fuzzy sliding-mode control is effective and robust, with no obvious chattering in the control system, and that its trajectory tracking is more effective than that of PD control. This control policy is therefore practically operable, and the study provides practical guidance of theoretical value for future robot control.


Acknowledgment

This work is supported by the Science & Technology Project of College and University

in Shandong Province (J15LN41).

References

[1] J. X. Lv, Y. H. Li, X. Z. Wang, X. L. Bao, Mechanical structure optimization and power fuzzy control

design of picking robot end effector, Journal of Agricultural Mechanization Research, 38(2016): 36-40.

[2] S. H. Ju, Y. M. Li, Research on nonholonomic mobile robot based on self-adjusting universe fuzzy

control, Electronic Design Engineering, 24(2016), 103-106.

[3] Z. M. Ju, Fuzzy control applied to wheel-type robot target tracking, Computer Measurement & Control, 22(2014): 614-616.

[4] J. L. Zhang, Comprehensive obstacle avoidance system based on the fuzzy control for cleaning robot,

Machine Tool & Hydraulics, 18(2014): 92-95.

[5] Z. B. Ma, Self-adjusting parameter fuzzy control for self-balancing two-wheel robots, Techniques of Automation and Applications, 33(2014): 9-13.

[6] S. B. Hu, M. X. Lu, Fuzzy integral sliding mode control for three-links spatial robot, Computer

Simulation, 20(2012): 162-166.

[7] L. Lin, H. R. Wang, Y. N. Hu, Fuzzy adaptive sliding mode control for trajectory tracking of uncertain

robot based on saturated function, Machine Tool& Hydraulics, 36(2008): 137-140.

[8] C. Z. Xu, Y. C. Wang, Nonsingular terminal fuzzy sliding mode control for multi-link robots based on

back stepping, Electrical Automation, 34(2012): 8-9.

[9] W. D. Gao, Y. M. Fang, W. L. Zhang, Application of adaptive fuzzy sliding mode control to

servomotor system, Small& Special Electrical Machines, 37(2009): 32-36.

[10] T. W. Wu, Y. S. Yang, Research on simulation of adaptive sliding-mode guidance law, Modern

Electronics Technique, 34(2011): 23-25.

Fuzzy Systems and Data Mining II 115

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-115

Hesitant Bipolar Fuzzy Set and Its Application in Decision Making

Ying HAN 1 , Qi LUO and Sheng CHEN

B-DAT & CICAEET, Nanjing University of Information Science and Technology,

Jiangsu, Nanjing 210044, P. R. China

Abstract. In this paper, by combining hesitant fuzzy set with bipolar-valued fuzzy

set, the concept of hesitant bipolar value fuzzy set is introduced, and the hesitant

bipolar fuzzy group decision making method based on TOPSIS is proposed. Our

study ﬁrstly integrates fuzziness, hesitation and incompatible bipolarity in multiple

criteria decision making method. An illustrative case of chemical project evaluation

also demonstrates the feasibility, validity, and necessity of our proposed method.

Keywords. Fuzzy set, Bipolar-valued fuzzy set, Hesitant fuzzy set, Multiple criteria

decision making, Incompatible bipolarity

Introduction

As an extension of fuzzy set [1], hesitant fuzzy set (HFS) was introduced by Torra and

Narukawa to describe the case that the membership degrees of an element to a given set

have a few different values, which arises from hesitation the decision makers hold [2].

A growing number of studies focus on HFS and some extensions are presented, such as

interval-valued HFS [3], possible-degree generalized HFS [4] and linguistic HFS [5].

On the other hand, in recent years, incompatible bipolarity has attracted re-

searchers’ attentions with some instructive results have devoted to it [6,7]. In fact,

incompatible bipolarity is inevitable in the real world. See an example of the psychol-

ogy disease-bipolar disorder. A patient suffering bipolar disorder has episodes of mania

and depression. Two poles may simultaneously reach extreme cases, i.e., the sum of

positive pole value and negative pole value is bigger than 1. Bipolar-valued fuzzy set

(BVFS) was pointed out is suitable to handle incompatible bipolarity [8,9].

The aforementioned HFS and its extensions can not accommodate incompatible

bipolarity. Considering BVFS is adept at modeling incompatible bipolarity, by combin-

ing BVFS with HFS, hesitant bipolar fuzzy set (HBFS) is introduced in this paper. And a

hesitant bipolar fuzzy multiple criteria group decision making (MCGDM) method based

on TOPSIS [10] is presented. Our study ﬁrstly accommodates fuzziness, hesitation, and

incompatible bipolarity in fuzzy set and multiple criteria decision making.

The rest is structured as follows. In Section 1, some related notions are reviewed.

The concept of HBFS is introduced and some related properties are discussed. In Section

1 Corresponding Author: Ying Han, B-DAT & CICAEET, Nanjing University of Information Science and

116 Y. Han et al. / Hesitant Bipolar Fuzzy Set and Its Application in Decision Making

2, a hesitant bipolar fuzzy group decision making method based on TOPSIS is presented.

In Section 3, an illustrated case about chemical project evaluation is included to show the

feasibility, validity, and necessity of the theoretical results obtained. Finally, the paper is

concluded in Section 4.

Throughout the paper, denote I P = [0, 1], I N = [−1, 0]. The sets X always repre-

sents the ﬁnite discourse.

In this section, ﬁrstly, some related notions are reviewed. Then, the concept of HBFS is

introduced and some related properties are discussed.

In [2], Torra and Narukawa suggested the concept of HFS permitting the member-

ship degree of an element to a set presented as several possible values in I P . In [11],

bipolar-valued fuzzy set B in X is deﬁned as B = {< x, B(x) = (B P (x), B N (x)) >|

x ∈ X}, where the functions B P : X → I P , x → B P (x) ∈ I P and B N : X → I N , x → B N (x) ∈ I N define the satisfaction degree of the element x ∈ X to the corresponding property and to the implicit counter-property of the BVFS B in X, respectively.

Denote L = {α = (α^P, α^N) | α^P ∈ I^P, α^N ∈ I^N}; then α is called a bipolar-valued fuzzy number (BVFN) [9]. For any α = (α^P, α^N) and β = (β^P, β^N), the preference order relation is defined as α ≤ β if and only if α^P ≤ β^P and α^N ≤ β^N. This preference order relation is partial. Denote the mediation value α^M = (α^P + α^N)/2; if α ≤ β, then α^M ≤ β^M, so all the BVFNs can be ranked according to their mediation values [9].
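Ranking BVFNs by mediation value can be done directly, as in the sketch below; the sample BVFNs are illustrative, and Python's stable sort keeps the original order when two mediation values tie.

```python
def mediation(alpha):
    """Mediation value alpha^M = (alpha^P + alpha^N) / 2 of a BVFN."""
    aP, aN = alpha
    return (aP + aN) / 2

# Illustrative BVFNs; sorting by mediation value gives the increasing
# order used later to arrange the BVFNs inside an HBFE.
bvfns = [(0.8, -0.6), (0.7, -0.4), (0.9, -0.7)]
ranked = sorted(bvfns, key=mediation)
```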

Next, the concept of the HBFS is introduced, accommodating fuzziness, hesitation,

and incompatible bipolarity in fuzzy set theory for the ﬁrst time.

Deﬁnition 1 Hesitant bipolar fuzzy set in X is deﬁned as Ã = {< x, h̃Ã (x) >| x ∈ X}.

Where h̃Ã (x) is a set of some different BVFNs in L, representing the possible bipolar

membership degree of the element x ∈ X to the set Ã. For convenience, h̃Ã (x) is called

a hesitant bipolar fuzzy element (HBFE), a basic unit of HBFS.

Inspired by the work on HFS by Xia et al. [12], for an HBFE h̃_Ã(x) it is necessary to arrange the BVFNs in h̃_Ã(x) in increasing order according to the mediation value. Suppose that l(h̃_Ã(x)) stands for the number of BVFNs in the HBFE h̃_Ã(x) and h̃^{σ_j}_Ã(x) is the jth largest BVFN in h̃_Ã(x). Given two different HBFSs Ã, B̃ in X, denote l_x = max{l(h̃_Ã(x)), l(h̃_B̃(x))}. If l(h̃_Ã(x)) ≠ l(h̃_B̃(x)), then the shorter one should be extended by adding its largest value until it has the same length as the longer one.
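The length-equalization rule can be sketched as follows; the HBFEs shown (lists of BVFNs already sorted increasingly by mediation value) are illustrative.

```python
def extend(h, target_len):
    """Pad an HBFE (BVFNs sorted increasingly by mediation value) by
    repeating its largest element until it reaches target_len."""
    return h + [h[-1]] * (target_len - len(h))

h1 = [(0.7, -0.4), (0.8, -0.3)]
h2 = [(0.6, -0.3)]
lx = max(len(h1), len(h2))
h2 = extend(h2, lx)   # now both HBFEs have length lx
```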

In the following paper, all of HBFSs in X are denoted by F̃ (X). HBFE is denoted

by h̃ for simplicity, and the set of all of the h̃ is denoted by L̃. And the preference order

relation in L̃ is deﬁned in the following deﬁnition.

Deﬁnition 2 Let h̃_1, h̃_2 ∈ L̃; define the preference order relation in L̃ as follows: h̃_1 ≤ h̃_2 if and only if (h̃_1^{σ_j})^P ≤ (h̃_2^{σ_j})^P and (h̃_1^{σ_j})^N ≤ (h̃_2^{σ_j})^N for all j, where (h̃_i^{σ_j}) is the jth largest BVFN in h̃_i (i = 1, 2) according to the mediation value.


After introducing some operations on HBFEs, a hesitant bipolar fuzzy aggregation operator is proposed. For h̃_1, h̃_2 ∈ L̃ and λ > 0:

$$\tilde h_1 \otimes \tilde h_2 = \bigcup_{\tilde\gamma_1\in\tilde h_1,\,\tilde\gamma_2\in\tilde h_2} \{(\tilde\gamma_1^P \cdot \tilde\gamma_2^P,\ -\tilde\gamma_1^N \cdot \tilde\gamma_2^N)\}, \qquad (\tilde h_1)^\lambda = \bigcup_{\tilde\gamma_1\in\tilde h_1} \{((\tilde\gamma_1^P)^\lambda,\ -|\tilde\gamma_1^N|^\lambda)\}.$$

Let w = (w_1, …, w_n) be the weight vector of h̃_i (i = 1, …, n) satisfying w_i ∈ I^P and Σ_{i=1}^n w_i = 1. Then the hesitant bipolar fuzzy weighted geometric (HBFWG) operator is the mapping defined as follows:

$$HBFWG(\tilde h_1, \tilde h_2, \dots, \tilde h_n) = \bigotimes_{i=1}^{n} (\tilde h_i)^{w_i}. \qquad (1)$$
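A sketch of the HBFWG operator of Eq.(1), with HBFEs represented as lists of (positive, negative) pairs and helper names assumed. As a check, aggregating the expert values ([0.9, −0.7]) and ([0.8, −0.6]) from the later case study with weights (0.7, 0.3) reproduces the entry ([0.8688, −0.6684]) of the comprehensive matrix in Table 3.

```python
from itertools import product

def power(h, lam):
    """(h)^lambda on an HBFE: ((g^P)^lam, -|g^N|^lam) for each BVFN g."""
    return [(gP ** lam, -abs(gN) ** lam) for gP, gN in h]

def otimes(h1, h2):
    """h1 (x) h2: (g1^P * g2^P, -g1^N * g2^N) over all BVFN pairs."""
    return [(g1P * g2P, -g1N * g2N)
            for (g1P, g1N), (g2P, g2N) in product(h1, h2)]

def hbfwg(hs, ws):
    """HBFWG(h1, ..., hn) = (h1)^w1 (x) ... (x) (hn)^wn  (Eq. (1))."""
    out = power(hs[0], ws[0])
    for h, w in zip(hs[1:], ws[1:]):
        out = otimes(out, power(h, w))
    return out

# Reproduce the x1/c1 entry of the comprehensive HBFM from the case study.
agg = hbfwg([[(0.9, -0.7)], [(0.8, -0.6)]], [0.7, 0.3])
```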

Next, the distance between HBFSs is introduced.

Deﬁnition 5 For any Ã, B̃, C̃ ∈ F̃(X), if the operation d̃ : F̃(X) × F̃(X) → I^P satisfies the following conditions: 1° 0 ≤ d̃(Ã, B̃) ≤ 1, and d̃(Ã, B̃) = 0 if and only if Ã = B̃; 2° d̃(Ã, B̃) = d̃(B̃, Ã); 3° d̃(Ã, C̃) ≤ d̃(Ã, B̃) + d̃(B̃, C̃); then d̃ is called a distance in F̃(X).

Example 1 Let ω_i ∈ I^P (i = 1, 2, …, n) satisfy Σ_{i=1}^n ω_i = 1. For any Ã, B̃ ∈ F̃(X), the weighted Hamming distance d̃_wh between Ã and B̃ is defined as follows:

$$\tilde d_{wh}(\tilde A, \tilde B) = \sum_{i=1}^{n} \omega_i \frac{1}{2 l_{x_i}} \sum_{j=1}^{l_{x_i}} \Big( \big|(\tilde h^{\sigma_j}_{\tilde A})^P(x_i) - (\tilde h^{\sigma_j}_{\tilde B})^P(x_i)\big| + \big|(\tilde h^{\sigma_j}_{\tilde A})^N(x_i) - (\tilde h^{\sigma_j}_{\tilde B})^N(x_i)\big| \Big) \qquad (2)$$
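Eq.(2) can be sketched as below for HBFSs stored as lists of HBFEs (one equal-length list of BVFNs per element x_i); the function name and the maximal-distance example input are illustrative.

```python
def d_wh(A, B, omega):
    """Weighted Hamming distance of Eq.(2) between two HBFSs A and B.

    A, B  : lists of HBFEs, pre-extended so paired HBFEs have equal length
    omega : criteria/element weights summing to 1
    """
    total = 0.0
    for w, hA, hB in zip(omega, A, B):
        l = len(hA)
        inner = sum(abs(aP - bP) + abs(aN - bN)
                    for (aP, aN), (bP, bN) in zip(hA, hB))
        total += w * inner / (2 * l)
    return total
```

For singleton HBFEs at the extremes, (1.0, 0.0) versus (0.0, −1.0), the distance is exactly 1, consistent with condition 1° of Definition 5.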

In this section, based on the theoretical results above, a hesitant bipolar fuzzy MCGDM method based on TOPSIS is presented.

Consider a MCGDM problem with hesitant bipolar fuzzy information. Let {x_1, …, x_m} be the set of alternatives, {c_1, …, c_n} the set of evaluation criteria, and let t experts be invited to make the evaluation. The hesitant bipolar fuzzy evaluation value of alternative x_i on criterion c_j given by the sth expert is denoted by the HBFE h̃^s_{ij}; we thus obtain the hesitant bipolar fuzzy matrix (HBFM) given by the sth expert as H̃^s = (h̃^s_{ij})_{m×n} (i = 1, …, m; j = 1, …, n; s = 1, …, t). Suppose all the BVFNs in h̃^s_{ij} are arranged in increasing order according to the mediation value. The weight vector of the experts is assumed known as w = (w_1, …, w_t) with w_s ∈ I^P and Σ_{s=1}^t w_s = 1, and the weight vector of the criteria is assumed known as ω = (ω_1, …, ω_n) with ω_j ∈ I^P and Σ_{j=1}^n ω_j = 1.

The hesitant bipolar fuzzy multiple criteria decision making method based on TOPSIS is given as follows:

Step 1. Use (1) to aggregate the HBFMs H̃^s to get the comprehensive HBFM H̃ = (h̃_{ij})_{m×n} (i = 1, …, m; j = 1, …, n), where h̃_{ij} = HBFWG(h̃^1_{ij}, h̃^2_{ij}, …, h̃^t_{ij}).

Step 2. Denote l_j = max_{i=1,…,m}{l(h̃_{ij})}. For j = 1, …, n, if l(h̃_{ij}) < l_j, add the largest value in it until its length equals l_j. Then compute

$$(\tilde h_j)^* = \Big( \max_{i=1,\dots,m} (\tilde h^{\sigma(1)}_{ij})^P,\ \max_{i=1,\dots,m} (\tilde h^{\sigma(1)}_{ij})^N,\ \dots,\ \max_{i=1,\dots,m} (\tilde h^{\sigma(l_j)}_{ij})^P,\ \max_{i=1,\dots,m} (\tilde h^{\sigma(l_j)}_{ij})^N \Big) \qquad (3)$$

and

$$(\tilde h_j)_* = \Big( \min_{i=1,\dots,m} (\tilde h^{\sigma(1)}_{ij})^P,\ \min_{i=1,\dots,m} (\tilde h^{\sigma(1)}_{ij})^N,\ \dots,\ \min_{i=1,\dots,m} (\tilde h^{\sigma(l_j)}_{ij})^P,\ \min_{i=1,\dots,m} (\tilde h^{\sigma(l_j)}_{ij})^N \Big). \qquad (4)$$

Then h̃^* = {(h̃_1)^*, …, (h̃_n)^*} is the positive ideal point and h̃_* = {(h̃_1)_*, …, (h̃_n)_*} is the negative ideal point.

Step 3. Denote h̃_i = {h̃_{i1}, …, h̃_{in}}. For each i = 1, …, m, compute the distances (d̃_i)^* and (d̃_i)_* between h̃_i and h̃^*, h̃_*, respectively, by (2).

Step 4. Compute $\xi_i = \dfrac{(\tilde d_i)_*}{(\tilde d_i)_* + (\tilde d_i)^*}$, i = 1, …, m.

Step 5. Rank the alternatives according to the principle that the bigger ξ_i is, the better the project x_i is.
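Step 2's componentwise ideal points (Eqs. (3)-(4)) can be sketched per criterion column, as below; the input HBFEs are assumed pre-extended to a common length, and the function name is an assumption.

```python
def ideal_points(col):
    """Positive/negative ideal HBFEs for one criterion column (Eqs. (3)-(4)):
    componentwise max/min over alternatives of the j-th ordered BVFNs.
    All HBFEs in `col` are assumed already extended to a common length."""
    l = len(col[0])
    pos = [(max(h[j][0] for h in col), max(h[j][1] for h in col)) for j in range(l)]
    neg = [(min(h[j][0] for h in col), min(h[j][1] for h in col)) for j in range(l)]
    return pos, neg

# Two alternatives, singleton HBFEs on one criterion.
pos, neg = ideal_points([[(0.8, -0.1)], [(0.6, -0.3)]])
```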

3. Case Study

In this section, an illustrative case shows how to use the proposed method to make an evaluation with incompatible fuzzy bipolarity and hesitation information.

Example 2 Consider a chemical project evaluation problem. Suppose there are four chemical projects {x_1, x_2, x_3, x_4} to be evaluated, and two experts are invited to make the evaluation. The evaluation criteria are c_1: economy, c_2: environment, and c_3: society. Consider the economy criterion: in the short term, a project may bring huge benefits to the company, so its positive evaluation value is 0.8; on the other hand, in the long run, the pollution needs a huge amount of money to fix, so its negative evaluation value is 0.7. The sum of the two poles is 1.5, bigger than 1, i.e., there exists incompatible bipolarity. When making evaluations, experts may hesitate among several memberships; thus, the evaluation value of alternative x_i on criterion c_j given by the sth expert is denoted by the HBFE h̃^s_{ij}. Suppose all the BVFNs in h̃^s_{ij} are arranged in increasing order according to the mediation value. The HBFMs given by Expert 1 and Expert 2 are presented in Table 1 and Table 2, respectively. The weight vectors of the experts and criteria are given as w = (0.7, 0.3) and ω = (0.3, 0.5, 0.2), respectively.

Table 1. The HBFM given by Expert 1.

X    c1              c2                           c3
x1   ([0.9, −0.7])   ([0.8, −0.6], [0.7, −0.4])   ([0.7, −0.4], [0.8, −0.3])
x2   ([0.8, −0.1])   ([0.8, −0.3])                ([0.9, −0.2], [0.8, −0.1])
x3   ([0.7, −0.2])   ([0.6, −0.3])                ([0.7, −0.1])
x4   ([0.6, −0.4])   ([0.5, −0.3])                ([0.7, −0.4])

Table 2. The HBFM given by Expert 2.

X    c1              c2                           c3
x1   ([0.8, −0.6])   ([0.8, −0.5])                ([0.8, −0.4])
x2   ([0.8, −0.2])   ([0.7, −0.1])                ([0.8, −0.2], [0.7, −0.1])
x3   ([0.6, −0.3])   ([0.7, −0.2], [0.8, −0.4])   ([0.8, −0.3])
x4   ([0.6, −0.3])   ([0.6, −0.2], [0.7, −0.1])   ([0.6, −0.4])

Next, we will see how to use the proposed method to make evaluation.

Step 1. Use (1) to aggregate the HBFM H̃ s given by the sth expert to get the com-

prehensive HBFM H̃ = (h̃ij )4×3 .

The comprehensive HBFM is given in Table 3.

Table 3. The comprehensive HBFM.

X    c1                    c2
x1   ([0.8688, −0.6684])   ([0.8000, −0.5681], [0.7286, −0.4277])
x2   ([0.8000, −0.1231])   ([0.7686, −0.2158])
x3   ([0.6684, −0.2259])   ([0.6284, −0.2656], [0.6541, −0.3270])
x4   ([0.6000, −0.3669])   ([0.5281, −0.2656], [0.5531, −0.2158])

X    c3
x1   ([0.7286, −0.4000], [0.8000, −0.3270])
x2   ([0.8688, −0.2000], [0.7686, −0.1000], [0.8000, −0.1231], [0.8346, −0.1625])
x3   ([0.7286, −0.1390])
x4   ([0.6684, −0.4000])

Step 2. Compute the positive ideal point h̃^* by (3) and the negative ideal point h̃_* by (4).

By (3), we have h̃^* = {([0.8688, −0.1231]), ([0.8000, −0.2158], [0.7686, −0.2158]), ([0.8688, −0.1390], [0.8000, −0.1000], [0.8688, −0.1231], [0.8000, −0.1390])}.

By (4), we have h̃_* = {([0.6000, −0.6684]), ([0.5281, −0.5681], [0.5531, −0.4277]), ([0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000], [0.6684, −0.4000])}.

Step 3. Compute the distances (d̃_i)^* and (d̃_i)_* between h̃_i and h̃^*, h̃_* by (2), i = 1, 2, 3, 4.

By (2), we have (d̃_1)^* = 0.1850, (d̃_2)^* = 0.0199, (d̃_3)^* = 0.1116, (d̃_4)^* = 0.1934; (d̃_1)_* = 0.1178, (d̃_2)_* = 0.2878, (d̃_3)_* = 0.1977, (d̃_4)_* = 0.1093.

Step 4. Compute $\xi_i = \dfrac{(\tilde d_i)_*}{(\tilde d_i)_* + (\tilde d_i)^*}$, i = 1, 2, 3, 4.

We have ξ_1 = 0.3890, ξ_2 = 0.9352, ξ_3 = 0.6392, ξ_4 = 0.3610.
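The closeness values of Step 4 can be reproduced directly from the distances reported in Step 3, confirming that ξ_i divides the distance to the negative ideal by the sum of both distances:

```python
# Distances from Step 3 of the case study.
d_pos = [0.1850, 0.0199, 0.1116, 0.1934]   # (d_i)^*: distance to the positive ideal
d_neg = [0.1178, 0.2878, 0.1977, 0.1093]   # (d_i)_*: distance to the negative ideal

xi = [dn / (dn + dp) for dn, dp in zip(d_neg, d_pos)]
print([round(x, 4) for x in xi])  # close to the reported 0.3890, 0.9352, 0.6392, 0.3610
```

The largest value is ξ_2, i.e., x_2 sits closest to the positive ideal relative to the negative one.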

Step 5. Rank the alternatives according to the principle. Since ξ_2 > ξ_3 > ξ_1 > ξ_4, the ranking is x_2 ≻ x_3 ≻ x_1 ≻ x_4, and x_2 is the optimal project.


Comparison

If we only consider the positive evaluations of the criteria in Example 2, we get (ξ_1)^P = 0.9230, (ξ_2)^P = 0.8377, (ξ_3)^P = 0.3796, (ξ_4)^P = 0, and x_1 is then the optimal project. This result differs from the one obtained when incompatible bipolarity is considered. Compared with existing methods, by accommodating incompatible bipolarity, fuzziness, and hesitation information in decision making for the first time, our method is better suited to the urgent demands of environmental and resource protection.

4. Conclusions

In this paper, by combining hesitant fuzzy sets with bipolar-valued fuzzy sets, the concept of the hesitant bipolar fuzzy set is introduced, and a hesitant bipolar fuzzy group decision making method is then presented. Our study is the first to accommodate fuzziness, hesitation, and incompatible bipolarity in information processing. In future work, we will try to combine rough set theory with hesitant bipolar fuzzy sets.

Acknowledgements

This work was supported in part by the Joint Key Grant of National Natural Science

Foundation of China and Zhejiang Province (U1509217), the National Natural Sci-

ence Foundation of China (61503191) and the Natural Science Foundation of Jiangsu

Province, China (BK20150933).

References

[1] L.A. Zadeh, Fuzzy sets, Inform. and Control, 8 (1965) 338–353.

[2] V. Torra and Y. Narukawa, On hesitant fuzzy sets and decision, in: the 18th IEEE International Confer-

ence on Fuzzy Systems, Korea, 2009, 1378–1382.

[3] N. Chen, Z.S. Xu and M.M. Xia, Correlation coefﬁcients of hesitant fuzzy sets and their applications to

clustering analysis, Applied Mathematical Modeling, 37 (2013) 2197–2211.

[4] Y. Han, Z.Z. Zhao, S. Chen and Q.T. Li, Possible-degree generalized hesitant fuzzy set and its Applica-

tion in MADM, Advances in Intelligent Systems and Computing, 27 (2014) 1–12.

[5] F.Y. Meng and X.H. Chen, A hesitant fuzzy linguistic multi-granularity decision making model based

on distance measures, Journal of Intelligent and Fuzzy Systems, 28 (2015) 1519–1531.

[6] J. Montero, H. Bustince, C. Franco, J.T. Rodríguez, D. Gómez, M. Pagola, J. Fernández and E. Bar-

renechea, Paired structures in knowledge representation, Knowledge-Based Systems, 100 (2016) 50–58.

[7] C.G. Zhou, X.Q. Zeng, H.B. Jiang, L.X. Han, A generalized bipolar auto-associative memory model

based on discrete recurrent neural networks, Neurocomputing, 162 (2015) 201–208.

[8] H. Bustince, E. Barrenechea, M. Pagola, J. Fernandez, Z.S. Xu, B. Bedregal, J. Montero,H. Hagras,

F. Herrera and B.D. Baets, A historical account of types of fuzzy sets and their relationships, IEEE

Transactions on Fuzzy Systems, 24 (2016) 179–194.

[9] Y. Han, P. Shi and S. Chen, Bipolar-valued rough fuzzy set and its applications to decision information

system, IEEE Transactions on Fuzzy Systems, 23 (2015) 2358–2370.

[10] Y.J. Lai, T.Y. Liu and C.L. Hwang, TOPSIS for MODM, European Journal of Operational Research, 76

(1994) 486–500.

[11] W.R. Zhang, Bipolar fuzzy sets and relations: a computational framework for cognitive modeling and

multiagent decision analysis, Proceedings of IEEE Conf., 1994: 305–309.

[12] M.M. Xia and Z.S. Xu, Hesitant fuzzy information aggregation in decision making, International Journal of Approximate Reasoning, 52 (2011) 395–407.

Fuzzy Systems and Data Mining II 121

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-121

Chance Constrained Twin Support Vector Machine for Uncertain Pattern Classiﬁcation

Ben-Zhang YANG a ,Yi-Bin XIAO b , Nan-Jing HUANG a,1 , and Qi-Lin CAO c,2

a Department of Mathematics, Sichuan University, Chengdu, Sichuan, P.R. China

b Department of Mathematics, University of Electronic Science and Technology of China, Chengdu, Sichuan, P.R. China

c Business School, Sichuan University, Chengdu, Sichuan, P.R. China

Abstract. In this paper, a chance constrained twin support vector machine (CC-TWSVM) is proposed. The paper studies twin support vector machine classiﬁcation when data points are uncertain due to statistical measurement noise. With some properties of the distribution known, the CC-TWSVM model aims to ensure a small probability of misclassiﬁcation for the uncertain data. We also provide an equivalent second-order cone programming (SOCP) formulation of the CC-TWSVM model using the moment information of the uncertain data. The dual problem of the SOCP model is introduced, and the optimal value of the CC-TWSVM model can be solved directly. In addition, we show the performance of the CC-TWSVM model on artiﬁcial and real data through numerical experiments.

Keywords. support vector machine, robust optimization, chance constraints,

uncertain classiﬁcation.

Introduction

Nowadays, support vector machines (SVMs) are considered one of the most effective learning methods for classification. The main idea of this classification technique is to map the data into a higher dimensional space with kernel methods and then determine a hyperplane separating the binary classes with maximal margin [1,2].

Binary data classification methods have made breakthrough progress in recent years. Mangasarian et al. [3] proposed the generalized eigenvalue proximal support vector machine (GEPSVM). Different from the canonical SVM, GEPSVM aims to find two optimal nonparallel planes such that each hyperplane is closer to its class and is as far as possible from the other class. Motivated by GEPSVM, Jayadeva et al. [4] proposed a twin support vector machine (TWSVM) to solve the classification of binary data. The main

2 Corresponding Author: Qi-Lin Cao, Business School, Sichuan University, Chengdu, Sichuan, P.R. China,


idea of TWSVM is to generate two nonparallel planes with properties similar to those in GEPSVM. But different from GEPSVM, the two planes in TWSVM are obtained by solving two related programming problems. At the same time, the ν-TWSVM [5] was proposed as an extension of TWSVM for handling outliers. Some extensions of the TWSVM can be found in [6].

For the above-mentioned methods, the parameters in the training data sets are implicitly assumed to be known exactly. However, in real-world applications, parameters are perturbed because they are estimated from measurement data subject to statistical error [7]. For instance, real data points often incorporate uncertain information in automatic acoustic identification and other imbalanced data problems [8]. When the data points are uncertain, several SVM models for handling uncertainty have been proposed as developments of the previous models. Trafalis et al. [9] proposed a robust optimization model for the case where the noise of the uncertain data is norm-bounded. Robust optimization [10] was also introduced in the case of chance constraints. Robust optimization with chance constraints ensures a small probability of misclassification under uncertainty. More precisely, it requires that a maximum-margin linear classifier separate the random data points correctly with high probability; equivalently, the probability that a point of one class is assigned to the other is bounded by an extremely small value. Ben-Tal et al. [11,12] employed the moment information of uncertain training data to develop a different chance-constrained SVM (CC-SVM) model. However, to the best of our knowledge, no work has considered chance-constrained optimization in the TWSVM setting. Therefore, it is interesting and important to study the TWSVM with chance constraints for the uncertain data classification problem. The main purpose of this paper is to make an attempt in this direction.

Combining the capability of chance constraints to handle uncertainty with the benefits of TWSVM, in this paper we propose a chance constrained twin support vector machine (CC-TWSVM). The main technique of this paper is, by using the moment information of the uncertain data, to transform the chance constrained program into a second-order cone program. Section 1 recalls SVM and TWSVM briefly. In Section 2, we introduce the CC-TWSVM model. Experimental results on the uncertain data sets are presented in Section 3. Conclusions are provided in Section 4.

1. Preliminaries

In this section, we briefly recall some concepts of SVM and TWSVM for the binary classification problem.

1.1. SVM

Let us consider the linearly separable classiﬁcation problem. Given training set

SVM aims to ﬁnd an optimal hyperplane wT x + b = 0 which separates the data into

2

two classes based on maximizing the distance w between two support hyperplanes,

2

which can be formulated as follows

B.-Z. Yang et al. / Chance Constrained Twin SVM for Uncertain Pattern Classiﬁcation 123

min 12 w22 +C ∑li=1 ξi

w,b

s.t. yi (wT xi + b) ≥ 1 − ξi , (1)

ξi ≥ 0, i = 1, · · · , l.

After solving (1), a new point is classiﬁed as class +1 or class −1 according to the

ﬁnally decision function f (x) = sgn(wT x + b).
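As a concrete illustration, the soft-margin problem (1) can be solved approximately by subgradient descent on the equivalent hinge-loss objective. The following sketch (the toy data, learning rate, and iteration count are illustrative assumptions, not from the paper) trains a linear classifier and applies the decision function $f(x) = \mathrm{sgn}(w^T x + b)$:

```python
import numpy as np

def svm_train(X, y, C=1.0, lr=0.01, epochs=500):
    """Subgradient descent on (1/2)||w||^2 + C * sum of hinge losses."""
    l, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                # points with nonzero hinge loss
        grad_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# toy linearly separable data (hypothetical)
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -2.0], [-1.0, -2.5]])
y = np.array([1, 1, -1, -1])
w, b = svm_train(X, y)
pred = np.sign(X @ w + b)                  # decision function f(x)
```

This primal subgradient scheme is a simple stand-in for the quadratic-programming solvers normally used for (1); it suffices to recover a separating hyperplane on separable data.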

1.2. TWSVM

Consider a binary classification problem with $l_1$ positive points and $l_2$ negative points ($l_1 + l_2 = l$). Suppose the data points belonging to the positive class are denoted by $A \in \mathbb{R}^{l_1 \times n}$, where each row $A_i \in \mathbb{R}^n$ ($i = 1, \cdots, l_1$) represents a data point with label +1. Similarly, $B \in \mathbb{R}^{l_2 \times n}$ represents all the data points with label −1. The TWSVM determines two nonparallel hyperplanes

$$x^T w_+ + b_+ = 0 \quad \text{and} \quad x^T w_- + b_- = 0 \tag{2}$$

such that each hyperplane is closer to the points of its own class and is at a distance of at least one from the points of the other class. The formulations of TWSVM are as follows:

$$\min_{w_+, b_+}\ \frac{1}{2}\|Aw_+ + e_+ b_+\|_2^2 + C_1 e_-^T \xi \quad \text{s.t.}\ -(Bw_+ + e_- b_+) + \xi \ge e_-,\ \xi \ge 0 \tag{3}$$

and

$$\min_{w_-, b_-}\ \frac{1}{2}\|Bw_- + e_- b_-\|_2^2 + C_2 e_+^T \eta \quad \text{s.t.}\ (Aw_- + e_+ b_-) + \eta \ge e_+,\ \eta \ge 0, \tag{4}$$

where $C_1, C_2$ are pre-specified penalty factors and $e_+, e_-$ are vectors of ones of the corresponding dimensions: $e_+$ is $l_1$-dimensional and $e_-$ is $l_2$-dimensional. The nonparallel hyperplanes (2) are obtained by solving (3) and (4). A new point $x$ is then assigned to class $r$ ($r = +, -$) by the decision rule

$$x^T w_r + b_r = \min_{r = +, -} |x^T w_r + b_r|. \tag{5}$$
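Once the two planes in (2) are known, the decision rule (5) reduces to comparing the two plane evaluations at the point. A minimal sketch (the plane parameters below are hypothetical, purely for illustration):

```python
import numpy as np

def twsvm_predict(x, w_pos, b_pos, w_neg, b_neg):
    # Rule (5): assign x to the class whose hyperplane gives the
    # smaller |x^T w_r + b_r| (some TWSVM variants also divide by ||w_r||).
    d_pos = abs(x @ w_pos + b_pos)
    d_neg = abs(x @ w_neg + b_neg)
    return 1 if d_pos <= d_neg else -1

# hypothetical planes: x1 = 1 for the +1 class, x1 = -1 for the -1 class
w_p, b_p = np.array([1.0, 0.0]), -1.0
w_n, b_n = np.array([1.0, 0.0]), 1.0
print(twsvm_predict(np.array([0.8, 3.0]), w_p, b_p, w_n, b_n))   # 1
print(twsvm_predict(np.array([-0.9, 0.5]), w_p, b_p, w_n, b_n))  # -1
```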

2. The CC-TWSVM Model

In this section, we introduce chance constrained programming (CCP) briefly and propose a chance constrained twin support vector machine (CC-TWSVM) to process uncertain data points.

When uncertain noise exists in the dataset, the TWSVM model needs to be modified to contain the uncertainty information. Suppose there are $l_1$ and $l_2$ training data points in


$\mathbb{R}^n$. Use $\tilde{A}_i = [\tilde{A}_{i1}, \cdots, \tilde{A}_{in}]$, $i = 1, \cdots, l_1$, to denote the uncertain data points with positive label +1, and let $\tilde{B}_i = [\tilde{B}_{i1}, \cdots, \tilde{B}_{in}]$, $i = 1, \cdots, l_2$, denote the uncertain data points with negative label −1. Then $\tilde{A} = [\tilde{A}_1, \cdots, \tilde{A}_{l_1}]^T$ and $\tilde{B} = [\tilde{B}_1, \cdots, \tilde{B}_{l_2}]^T$ represent the two data sets. The chance-constrained program is to determine two nonparallel planes such that each hyperplane is closer to its class in the sense of expectation and is as far as possible from the other class in probability. The chance-constrained TWSVM formulations are
formulations are

$$\min_{w_+, b_+}\ \frac{1}{2}E\{\|\tilde{A}w_+ + e_+ b_+\|_2^2\} + C_1\sum_{i=1}^{l_2}\xi_i \quad \text{s.t.}\ P\{-(\tilde{B}_i w_+ + b_+) \le 1 - \xi_i\} \le \varepsilon,\ \xi_i \ge 0,\ i = 1, \cdots, l_2 \tag{6}$$

and

$$\min_{w_-, b_-}\ \frac{1}{2}E\{\|\tilde{B}w_- + e_- b_-\|_2^2\} + C_2\sum_{i=1}^{l_1}\eta_i \quad \text{s.t.}\ P\{(\tilde{A}_i w_- + b_-) \le 1 - \eta_i\} \le \varepsilon,\ \eta_i \ge 0,\ i = 1, \cdots, l_1, \tag{7}$$

where $E\{\cdot\}$ denotes the expectation under the corresponding distribution, $C_1, C_2$ are user-given regularization parameters, $0 < \varepsilon < 1$ is a parameter close to 0, and $P\{\cdot\}$ is the probability distribution of the uncertain data points of the binary classes. The objective functions of the model ensure that each hyperplane is at minimum distance from its own class on average. The chance constraints impose an upper bound on the probability that a point is misclassified into the other class, and so guarantee correct classification with high probability; the resulting maximum-margin separating planes are robust to uncertainties in the data. But the two quadratic optimization problems (6) and (7) with chance constraints are obviously non-convex, so the model is difficult to solve. Using various bounding inequalities is an effective technique for dealing with CCP. When the mean and covariance matrix of the uncertain data points are known, the multivariate bound [13,14,15] can be adopted to express the chance constraints via robust optimization.

Let $X \sim (\mu, \Sigma)$ denote a random vector $X$ with mean $\mu$ and covariance matrix $\Sigma$. The multivariate Chebyshev inequality states that, for any closed convex set $S$, the supremum of the probability that $X$ takes a value in $S$ is

$$\sup_{X \sim (\mu, \Sigma)} P\{X \in S\} = \frac{1}{1 + d^2}, \qquad d^2 = \inf_{X \in S} (X - \mu)^T \Sigma^{-1} (X - \mu). \tag{8}$$
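Bound (8) is the worst case over all distributions with the given moments, so any particular distribution can only do better. A quick Monte Carlo sanity check for a halfspace $S$ (the standard-normal example is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, t = np.array([0.0, 0.0]), 3.0
X = rng.multivariate_normal(mu, np.eye(2), size=200_000)

# S = {x : x[0] >= t}; with Sigma = I and mu = 0, the closest point of S
# to mu is (t, 0), so d^2 = t^2 in (8).
p_emp = np.mean(X[:, 0] >= t)
bound = 1.0 / (1.0 + t**2)
print(p_emp <= bound)  # True: the empirical probability respects the bound
```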

Assume the first and second moment information of the random variables $\tilde{A}_i$ and $\tilde{B}_i$ is known. Let $\mu_i^+ = E[\tilde{A}_i]$ and $\mu_i^- = E[\tilde{B}_i]$ be the mean vectors, and let $\Sigma_i^+ = E[(\tilde{A}_i - \mu_i^+)^T(\tilde{A}_i - \mu_i^+)]$ and $\Sigma_i^- = E[(\tilde{B}_i - \mu_i^-)^T(\tilde{B}_i - \mu_i^-)]$ be the covariance matrices of the uncertain points of the two data sets, respectively. Then problems (6) and (7) can be reformulated respectively as:


$$\min_{w_+, b_+}\ \frac{1}{2}w_+^T G^+ w_+ + \mu^+ w_+ b_+ + \frac{1}{2}l_1 b_+^2 + C_1\sum_{i=1}^{l_2}\xi_i \quad \text{s.t.}\ -(\mu_i^- w_+ + b_+) \ge 1 - \xi_i + k\,\|{\Sigma_i^-}^{\frac{1}{2}} w_+\|,\ \xi_i \ge 0 \tag{9}$$

and

$$\min_{w_-, b_-}\ \frac{1}{2}w_-^T G^- w_- + \mu^- w_- b_- + \frac{1}{2}l_2 b_-^2 + C_2\sum_{i=1}^{l_1}\eta_i \quad \text{s.t.}\ \mu_i^+ w_- + b_- \ge 1 - \eta_i + k\,\|{\Sigma_i^+}^{\frac{1}{2}} w_-\|,\ \eta_i \ge 0, \tag{10}$$

where $k = \sqrt{\frac{1-\varepsilon}{\varepsilon}}$ and

$$G^+ = \sum_{i=1}^{l_1}\left({\mu_i^+}^T \mu_i^+ + \Sigma_i^+\right), \qquad \mu^+ = \sum_{i=1}^{l_1}\mu_i^+$$

with

$$G^- = \sum_{i=1}^{l_2}\left({\mu_i^-}^T \mu_i^- + \Sigma_i^-\right), \qquad \mu^- = \sum_{i=1}^{l_2}\mu_i^-.$$

Let

$$H^+ = \frac{1}{2}\begin{pmatrix} G^+ & {\mu^+}^T \\ \mu^+ & l_1 \end{pmatrix}. \tag{11}$$

Then the matrix $H^+$ is positive semi-definite. To ensure the strict convexity of problem (9), we can always add a perturbation $\varepsilon I$ ($\varepsilon > 0$, $I$ the identity matrix) such that the matrix $H^+ + \varepsilon I$ is positive definite. Without loss of generality, suppose that $H^+$ is positive definite.
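For intuition, the chance constraint in (9) for a single negative-class point reduces (taking the slack $\xi_i = 0$) to a deterministic second-order cone condition. A sketch of checking it numerically (the sample values of $\mu$, $\Sigma$, $w$, $b$ are assumptions for illustration):

```python
import numpy as np

def robust_constraint_holds(w, b, mu_neg, Sigma_neg, eps):
    # -(mu_i^- w + b) >= 1 + k * ||(Sigma_i^-)^{1/2} w||, with k = sqrt((1-eps)/eps)
    k = np.sqrt((1.0 - eps) / eps)
    L = np.linalg.cholesky(Sigma_neg)       # factor with L @ L.T = Sigma
    return -(mu_neg @ w + b) >= 1.0 + k * np.linalg.norm(L.T @ w)

w, b = np.array([1.0, 0.0]), 0.0
print(robust_constraint_holds(w, b, np.array([-5.0, 0.0]), 0.01 * np.eye(2), 0.1))  # True
print(robust_constraint_holds(w, b, np.array([-1.0, 0.0]), 0.01 * np.eye(2), 0.1))  # False
```

The Cholesky factor plays the role of $\Sigma^{1/2}$ here, since $\|L^T w\| = \sqrt{w^T \Sigma w}$.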

The dual problems of the chance-constrained TWSVM models (9) and (10) can be formulated as the following models

$$\max_{\lambda_i, \nu}\ \sum_{i=1}^{l_2}\lambda_i - \frac{1}{2}{s^+}^T {H_1^+}^T G^+ H_1^+ s^+ - \frac{1}{2}l_1 {s^+}^T {H_2^+}^T H_2^+ s^+ - \mu^+ H_1^+ s^+ \cdot {H_2^+}^T s^+$$
$$\text{s.t.}\ -\sum_{i=1}^{l_2}\lambda_i\left({\mu_i^-}^T + k\,{\Sigma_i^-}^{\frac{1}{2}}\nu\right) = s^+,\quad 0 \le \lambda_i \le C_1,\ \|\nu\| \le 1 \tag{12}$$

and

$$\max_{\gamma_i, \upsilon}\ \sum_{i=1}^{l_1}\gamma_i - \frac{1}{2}{s^-}^T {H_1^-}^T G^- H_1^- s^- - \frac{1}{2}l_2 {s^-}^T {H_2^-}^T H_2^- s^- - \mu^- H_1^- s^- \cdot {H_2^-}^T s^-$$
$$\text{s.t.}\ \sum_{i=1}^{l_1}\gamma_i\left({\mu_i^+}^T - k\,{\Sigma_i^+}^{\frac{1}{2}}\upsilon\right) = s^-,\quad 0 \le \gamma_i \le C_2,\ \|\upsilon\| \le 1, \tag{13}$$

where ${H^+}^{-1} = [H_1^+, H_2^+]$ and ${H^-}^{-1} = [H_1^-, H_2^-]$.

3. Numerical Experiments

In this section, the CC-TWSVM model is illustrated by numerical tests on two types of data sets. The first test is implemented to verify the performance of CC-TWSVM on artificial data. In the second test, we evaluate the performance of the CC-TWSVM model on real-world classification data sets from the UCI Machine Learning Repository. All results were averaged over 10 train-test experiments and carried out in Matlab R2012a on a 2.5 GHz CPU with 2.5 GB usable RAM. The SeDuMi software 3 is employed to solve the SOCP problems of CC-TWSVM.

In the first experiment, we randomly generate a set of 2-dimensional data. The normal distributions of the binary classes of the data are

$$\mu^+ = \begin{pmatrix} 0 \\ 2 \end{pmatrix},\quad \Sigma^+ = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix},\quad \mu^- = \begin{pmatrix} -1 \\ 0 \end{pmatrix},\quad \Sigma^- = \begin{pmatrix} 7 & 0 \\ 0 & 3 \end{pmatrix}.$$

[Figure 1. Classification results of CC-TWSVM on the artificial data set: (a) ε = 0.1; (b) ε = 0.01.]

Figure 1 shows the performance of CC-TWSVM on the two uncertain data sets. In the numerical experiments, the data points are generated from their respective distributions: one class was generated from the normal distribution $(\mu^+, \Sigma^+)$ and the other from $(\mu^-, \Sigma^-)$. Each class has 50 points; 20 points are randomly picked as training points and the rest are test points. In Figure 1, the blue stars are the points of the +1 class and the red circles those of the −1 class. For simplicity, we set ε to 0.1 and 0.01 respectively. The penalty parameters $C_1$ and $C_2$ are selected from the set

3 http://sedumi.ie.lehigh.edu/


$\{10^i \mid i = -5, \cdots, 5\}$. After 10 experiments, we obtain the parameters of the two hyperplanes and take the average parameters as the final result. The blue and red lines are the separating hyperplanes (in 2-D) that we look for. In fact, the value of the parameter ε also affects the determination of the two hyperplanes. When the parameter ε decreases from 0.1 to 0.01, the average accuracy of the classifier increases and the planes lie closer to their corresponding classes. Figure 1 (a) and (b) show the effect of the different parameter values.

This section presents numerical results on two real data sets. The following data sets were used in the experiments:

• WBCD The Wisconsin Breast Cancer Diagnostic data set was obtained from the UCI repository [16]. WBCD data is 10-dimensional. The data set has 699 samples, of which 444 benign samples are labeled as class +1 and the remaining malignant samples as class −1.

• IONOSPHERE The Ionosphere data set was also collected from the UCI repository. Ionosphere data is 34-dimensional. The data set has 351 samples, of which 225 good samples are labeled as class +1 and the remaining as class −1.

The distribution properties are often unknown and need to be estimated from the data points. If an uncertain point $\tilde{x}_i = [\tilde{x}_{i1}, \cdots, \tilde{x}_{in}]^T$ has $N$ samples $x_{ik}$, $k = 1, \cdots, N$, then the sample mean $\bar{x}_i = \frac{1}{N}\sum_{k=1}^{N} x_{ik}$ is used to estimate the mean vector $\mu_i = E[\tilde{x}_i]$, and the sample covariance

$$S_i = \frac{1}{N-1}\sum_{k=1}^{N}(x_{ik} - \bar{x}_i)(x_{ik} - \bar{x}_i)^T$$

is used to estimate the covariance matrix $\Sigma_i = E[(\tilde{x}_i - \mu_i)(\tilde{x}_i - \mu_i)^T]$.
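The estimators above are direct to compute; a small sketch (the synthetic sample is an assumption for illustration) verifying that numpy's `np.cov` with `ddof=1` matches the $\frac{1}{N-1}$ formula:

```python
import numpy as np

rng = np.random.default_rng(0)
# N = 1000 samples of one uncertain 2-D point with mean (0, 2)
samples = rng.normal(loc=[0.0, 2.0], scale=[1.0, 2.0], size=(1000, 2))

mu_hat = samples.mean(axis=0)                  # sample mean, estimates mu_i
S_hat = np.cov(samples, rowvar=False, ddof=1)  # sample covariance, estimates Sigma_i
S_manual = (samples - mu_hat).T @ (samples - mu_hat) / (len(samples) - 1)
```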

However, these estimates can introduce errors under conditions where the mean vector $\mu_i$ and covariance matrix $\Sigma_i$ cannot be obtained reliably. Pardalos et al. [17] discussed ways of handling these special cases. In our experiments, we employ similar methods to modify the estimation.

Since the data sets are uncertain, the measures of performance are worth studying. Ben-Tal et al. [11] proposed using the nominal error and the optimal error to evaluate performance. In our experiments, we use these indices to calculate the accuracy of our model. The formula for NomErr is

$$\mathrm{NomErr} = \frac{\sum_i 1_{y_i^{pre} \ne y_i}}{\text{the number of training data}} \times 100\%.$$

The optimal error (OptErr) is defined on the basis of the misclassification probability. The chance constraints in models (6) and (7) can be reformulated as in (9) and (10); we can then derive the least feasible value of ε, called $\varepsilon_{opt}$. The OptErr of a data point $x_i$ is defined as

$$\mathrm{OptErr}_i = \begin{cases} 1 & \text{if } y_i^{pre} \ne y_i \\ \varepsilon_{opt} & \text{if } y_i^{pre} = y_i \end{cases}, \qquad \mathrm{OptErr} = \frac{\sum_i \mathrm{OptErr}_i}{\text{the number of training data}} \times 100\%.$$
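Both error measures are straightforward to compute from predictions; a minimal sketch with hypothetical labels:

```python
import numpy as np

def nom_err(y_pred, y_true):
    # NomErr: fraction of misclassified points, in percent
    return 100.0 * np.mean(y_pred != y_true)

def opt_err(y_pred, y_true, eps_opt):
    # OptErr: misclassified points count 1, correct ones count eps_opt
    return 100.0 * np.mean(np.where(y_pred != y_true, 1.0, eps_opt))

y_true = np.array([1, 1, -1, -1])
y_pred = np.array([1, -1, -1, -1])
print(nom_err(y_pred, y_true))        # 25.0
print(opt_err(y_pred, y_true, 0.05))  # 28.75
```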

We tested on the WBCD first. Because each data point in WBCD has 10 attributes, solving the SOCP directly would take too much time. We used principal component analysis (PCA) to extract the two most important features. Then 80% of the data points were used for training and the remainder as test data. For the parameter ε, the three values {0.1, 0.05, 0.01} were adopted separately. As in the experiments on artificial data, the penalty parameters $C_1$ and $C_2$ were selected from the set $\{10^i \mid i = -5, \cdots, 5\}$.
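The PCA step can be reproduced with a plain SVD of the centered data; a sketch (random data stands in for WBCD, whose attribute values are not reproduced here):

```python
import numpy as np

def pca_reduce(X, n_components=2):
    # Project onto the top principal components via SVD of the centered data
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))        # stand-in for the 10-attribute WBCD points
Z = pca_reduce(X, n_components=2)
print(Z.shape)  # (100, 2)
```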


Figure 2. The performance of CC-TWSVM in the Wisconsin breast cancer data set.

The average results over 10 runs are shown in Figure 2. In Figure 2 (a), it is obvious that NomErr decreases slightly as ε descends from 0.1 to 0.01, because ε represents the upper bound on misclassification. The same holds for OptErr in Figure 2 (b): when ε decreases from 0.1 to 0.01, the average OptErr rate decreases from approximately 5.4% to 5.3%. So we can conclude that classification accuracy improves as the parameter ε decreases. From the definitions of OptErr and NomErr, it is not difficult to see that OptErr is larger than NomErr in the two plots. In addition, the model takes more time as ε increases, because the solution process of the second-order cone programming problem depends heavily on the parameters.

[Figure 3. The performance of CC-TWSVM on the Ionosphere data set for ε = 0.1, 0.05, 0.01.]


Table 1. Comparison of the average misclassification rates of TWSVM, CC-SVM and CC-TWSVM.

Data Sets    Classes  Instances  Features  TWSVM   CC-SVM  CC-TWSVM
Bliver       2        345        6         0.3521  0.3514  0.3504
Heart-c      2        303        14        0.1867  0.1875  0.1802
Hepatitis    2        155        19        0.2082  0.2074  0.1991
Ionosphere   2        351        34        0.0633  0.0625  0.0604
Votes        2        435        16        0.0824  0.0816  0.0736
WBCD         2        699        10        0.1643  0.1606  0.1578

The average results for the Ionosphere set over 10 runs are shown in Figure 3. Similar to the processing of WBCD, we obtained 3 principal attributes of Ionosphere by PCA. Based on these principal components, 80% of the data points were used for training and the remainder as test data. For the parameter ε, the three values {0.1, 0.05, 0.01} were adopted and the penalty parameters $C_1$ and $C_2$ were selected from the set $\{10^i \mid i = -5, \cdots, 5\}$ respectively. We again conclude that classification accuracy improves as the parameter ε decreases, and in this experiment it is also easy to see that OptErr is larger than NomErr. Moreover, because SeDuMi is used to solve the SOCP, the model takes more time as ε increases.

We also compared our model with previous models, namely TWSVM and CC-SVM. The experimental data sets were "Bliver", "Heart-c", "Hepatitis", "Ionosphere", "Votes", and "WBCD", selected from the UCI repository. In the experiments, the penalty parameters in the three models were all the same; they were selected from the set $\{10^i \mid i = -5, \cdots, 5\}$ respectively. The parameter ε in the CC-SVM and CC-TWSVM models was selected from the set {0.1, 0.05, 0.01} respectively, and 80% of the data points were used for training and the remainder as test data. The comparison of the previous models and our model is given in Table 1. It is easy to see that the average misclassification rate of CC-TWSVM is better than that of the original TWSVM. Furthermore, the performance of CC-TWSVM is better than that of CC-SVM. This is consistent with the observation that two nonparallel planes have advantages over a single hyperplane.

4. Conclusions

A new chance constrained twin support vector machine (CC-TWSVM) based on a chance constrained programming formulation was proposed, which can handle data sets with measurement noise efficiently. This paper studied twin support vector machine classification when the data are statistically uncertain. With chance constrained programming (CCP) in the model, CC-TWSVM ensures a low probability of classification error under uncertainty. The CC-TWSVM model can be transformed into a second-order cone program (SOCP) using the moment information of the uncertain points, and the dual problem of the SOCP model was also introduced; we then obtained the twin hyperplanes by solving the dual problem. In addition, we demonstrated the performance of the CC-TWSVM model on artificial and real data through numerical experiments. In future work, we will consider how to make the model more robust. Dealing with nonlinear classification under chance constraints is also of interest.


Acknowledgement

This work was supported by the Joint Foundation of the Ministry of Education of China and China Mobile Communication Corporation (MCM20150505) and the Fundamental Research Funds for the Central Universities of Sichuan University (skqy201646).

References

[1] B. Scholkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, Cambridge, 2002.

[2] B. Z. Yang, M. H. Wang, H. Yang, T. Chen, Ramp loss quadratic support vector machine for classification, Nonlinear Analysis Forum, 21 (2016), 101-115.

[3] O. Mangasarian, E. Wild, Multisurface proximal support vector classiﬁcation via generalized eigenval-

ues, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28 (2006), 69-74.

[4] Jayadeva, R. Khemchandani, S. Chandra, Twin support vector machines for pattern classiﬁcation, IEEE

Transactions on Pattern Analysis and Machine Intelligence, 29 (2007), 905-910.

[5] X. J. Peng, A v-twin support vector machine (v-TWSVM) classiﬁer and its geometric algorithms, Infor-

mation Sciences, 180 (2010), 3863-3875.

[6] Y. J. Lee, O. L. Mangasarian, SSVM: a smooth support vector machine for classiﬁcation, Computational

Optimization and Applications, 20 (2001), 5-22.

[7] D. Goldfarb, G. Iyengar, Robust convex quadratically constrained programs, Mathematical Programming, 97 (2003), 495-515.

[8] Paul Bosch, Julio López, Héctor Ramı́rez, Hugo Robotham, Support vector machine under uncertainty:

An application for hydroacoustic classiﬁcation of ﬁsh-schools in Chile, Expert Systems with Applica-

tions, 40 (2013), 4029-4034.

[9] T. B. Trafalis, R. C. Gilbert, Robust classiﬁcation and regression using support vector machines, Euro-

pean Journal of Operational Research, 173, (2006), 893-909.

[10] C. Bhattacharyya, L. R. Grate, M. I. Jordan, G. L. El, I. S. Mian, Robust sparse hyperplane classiﬁer:

application to uncertain molecular proﬁling data, Journal of Computational Biology, 11 (2004), 1073-

1089.

[11] A. Ben-Tal, S. Bhadra, C. Bhattacharyya, J.S. Nath, Chance constrained uncertain classification via robust optimization, Mathematical Programming, 127 (2011), 145-173.

[12] A. Ben-Tal, A. Nemirovski, Selected topics in robust convex optimization, Mathematical Programming,

112 (2008), 125-158.

[13] D. Bertsimas, I. Popescu, Optimal inequalities in probability theory: a convex optimization approach, SIAM Journal on Optimization, 15 (2005), 780-804.

[14] A. W. Marshall, I. Olkin, Multivariate Chebyshev inequalities, The Annals of Mathematical Statistics, 31 (1960), 1001-1014.

[15] A. Nemirovski, A. Shapiro, Convex approximations of chance constrained programs, SIAM Journal on Optimization, 17 (2006), 969-996.

[16] A. Frank and A. Asuncion, UCI Machine Learning Repository, 2010. Available at

http://archive.ics.uci.edu/ml.

[17] X. Wang, N. Fan, P. M. Pardalos, Robust chance-constrained support vector machines with second-order

moment information. Annals of Operations Research, (2015), 10.1007/s10479-015-2039-6

Fuzzy Systems and Data Mining II 131

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-131

Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics

Eunsuk YANG 1

Department of Philosophy, Chonbuk National University, Jeonju, KOREA

Abstract. This paper deals with non-algebraic binary relational semantics, called

here set-theoretic Kripke-style semantics, for monoidal t-norm (based) logics. For

this, we ﬁrst introduce the system MTL (Monoidal t-norm logic) and some of its

prominent axiomatic extensions, and then their corresponding Kripke-style seman-

tics. Next, we provide set-theoretic completeness results for them.

Keywords. relational semantics, (set-theoretic) Kripke-style semantics, substructural

logic, fuzzy logic, t-norm (based) logics

After the introduction of algebraic semantics for t-norm (based) logics, corresponding Kripke-style semantics were introduced. For instance, after Esteva and Godo introduced algebraic semantics for monoidal t-norm (based) logics in [4], corresponding Kripke-style semantics were introduced by Montagna and Ono [6], Montagna and Sacchetti [7], and Diaconescu and Georgescu [3]. These semantics have one important common feature:

• While such semantics are called Kripke-style semantics in the sense that those

semantics are provided using forcing relations, they are still algebraic in the sense

that their completeness results are provided using the fact that such semantics are

equivalent to algebraic semantics.

Because of this fact, Yang [8,9,10] called these semantics algebraic Kripke-style

semantics. Although non-algebraic Kripke-style semantics, where the “non-algebraic”

means that their completeness results are provided without using the above fact, were

provided for some particular systems (see e.g. [9]), such semantics have not yet been

established for basic fuzzy logics in general.

The aim of this paper is to provide set-theoretic Kripke-style semantics for basic core

fuzzy logics2 . As its starting point, we investigate set-theoretic Kripke-style semantics

for the logic system MTL (Monoidal t-norm logic) and its most prominent axiomatic

1 Corresponding Author: Eunsuk Yang, Department of Philosophy & Institute of Critical Thinking and

Writing, Chonbuk National University, Rm 307, College of Humanities Blvd. (14-1), Jeonju, 54896, KOREA

Email: eunsyang@jbnu.ac.kr.

2 Here, fuzzy logics are logics complete with respect to (w.r.t.) linearly ordered algebras and core fuzzy logics

are logics complete w.r.t. the real unit interval [0, 1] (see [1,2]).


extensions. For this, ﬁrst, in Section 2, we discuss monoidal t-norm (based) logics and

their corresponding Kripke-style semantics. Next, in Section 3, we provide set-theoretic

completeness results for them.

For convenience, we adopt notations and terminology similar to those in [1,7,8,9,10]

and assume reader familiarity with them (together with results found therein).

Monoidal t-norm (based) logics are based on a countable propositional language with the set of formulas FOR built inductively from a set of propositional variables VAR, propositional constants ⊤, ⊥, and binary connectives →, &, ∧, and ∨. Further connectives are defined as follows:

df2. ¬ϕ := ϕ → ⊥.

Standard notation and terminology are followed, and the axiom systems are used to provide a consequence relation. We first list the axioms and rules of MTL, the most basic monoidal t-norm logic.

A1. (ϕ → ψ) → ((ψ → χ) → (ϕ → χ)) (sufﬁxing, SF)

A2. ϕ → ϕ (reﬂexivity, R)

A3. (ϕ ∧ ψ) → ϕ, (ϕ ∧ ψ) → ψ (∧-elimination, ∧-E)

A4. ((ϕ → ψ) ∧ (ϕ → χ)) → (ϕ → (ψ ∧ χ)) (∧-introduction, ∧-I)

A5. ϕ → (ϕ ∨ ψ), ψ → (ϕ ∨ ψ) (∨-introduction, ∨-I)

A6. ((ϕ → χ) ∧ (ψ → χ)) → ((ϕ ∨ ψ) → χ) (∨-elimination, ∨-E)

A7. ⊥ → ϕ (ex falsum quodlibet, EF)

A8. ϕ → ⊤ (verum ex quodlibet, VE)

A9. (ϕ → (ψ → χ)) ↔ (ψ → (ϕ → χ)) (permutation, PM)

A10. (ϕ → (ψ → χ)) ↔ ((ϕ&ψ) → χ) (residuation, RES)

A11. (ϕ&ψ) → ϕ (weakening, W)

A12. (ϕ → ψ) ∨ (ψ → ϕ) (prelinearity, PL)

ϕ → ψ, ϕ ⊢ ψ (modus ponens, mp)

ϕ, ψ ⊢ ϕ ∧ ψ (adjunction, adj)

Well-known monoidal t-norm logics are axiomatic extensions (extensions for short)

of MTL. We introduce some prominent examples.

Definition 2. The following are famous monoidal t-norm logics extending MTL:

• Basic logic BL is MTL plus (DIV) (ϕ ∧ ψ) → (ϕ&(ϕ → ψ)).
• Łukasiewicz logic Ł is BL plus (DNE) ¬¬ϕ → ϕ.
• Gödel logic G is BL plus (CTR) ϕ → (ϕ&ϕ).
• Product logic Π is BL plus (CAN) (ϕ → ⊥) ∨ ((ϕ → (ϕ&ψ)) → ψ).

For easy reference, we let Ls be the set of the monoidal t-norm logics defined previously.
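These logics are complete with respect to [0, 1] equipped with a left-continuous t-norm and its residuum (cf. footnote 2): min for G, the Łukasiewicz t-norm for Ł, and the product t-norm for Π. A small sketch computing the residuum a → b = sup{c : c ∗ a ≤ b} by grid search (the grid resolution is an implementation assumption):

```python
def lukasiewicz(a, b): return max(0.0, a + b - 1.0)  # t-norm for L
def godel(a, b):       return min(a, b)              # t-norm for G
def product(a, b):     return a * b                  # t-norm for Pi

def residuum(tnorm, a, b, steps=10_000):
    # a -> b = sup{c : tnorm(c, a) <= b}, approximated on a uniform grid
    if tnorm(1.0, a) <= b:
        return 1.0
    return max(c / steps for c in range(steps + 1) if tnorm(c / steps, a) <= b)

print(residuum(godel, 0.7, 0.4))        # 0.4   (Godel: a -> b = b when a > b)
print(residuum(lukasiewicz, 0.7, 0.4))  # 0.7   (min(1, 1 - a + b))
```

The closed forms printed in the comments (b for Gödel, min(1, 1 − a + b) for Łukasiewicz, b/a for product) agree with the grid approximation up to the grid resolution.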

E. Yang / Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics 133

For convenience, "⊤," "⊥," "¬," "→," "∧," and "∨" are used ambiguously as propositional constants and connectives and as top and bottom elements and frame operators, but context should clarify their meanings.

Now we provide Kripke-style semantics for Ls. First, Kripke frames are deﬁned as

follows.

Definition 4. (Cf. [7,8,10])

(i) (Kripke frame) A Kripke frame is a structure X = (X, ⊤, ⊥, ≤, ∗), where (X, ⊤, ≤, ∗) is an integral linearly ordered commutative monoid such that ∗ is residuated, i.e., for every a, b ∈ X, the set {c : c ∗ a ≤ b} has a supremum, denoted here a → b. The elements of X are called nodes.

(ii) (L frame) An MTL frame is a Kripke frame where ∗ is conjunctive (i.e., ⊥ ∗ ⊤ = ⊥) and left-continuous (i.e., if sup{ci : i ∈ I} exists, then sup{ci ∗ a : i ∈ I} = sup{ci : i ∈ I} ∗ a). Consider the following conditions: for all a, b ∈ X,

• (DIVF) min{a, b} ≤ a ∗ (a → b).
• (DNEF) ¬¬a ≤ a.
• (CTRF) a ≤ a ∗ a.
• (CANF) ⊤ = a → ⊥ or ⊤ = (a → (a ∗ b)) → b.

BL frames are MTL frames satisfying (DIVF); Ł frames are BL frames satisfying (DNEF); G frames are BL frames satisfying (CTRF); Π frames are BL frames satisfying (CANF). We call all these frames (including MTL frames) L frames.

An evaluation on a Kripke frame is a forcing relation ⊩ between nodes and propositional variables, constants, and arbitrary formulas satisfying the following conditions. For every propositional variable p,

(Atomic hereditary condition, AHC) if a ⊩ p and b ≤ a, then b ⊩ p;
(min) ⊥ ⊩ p;

for the propositional constant ⊥,

(⊥) a ⊩ ⊥ iff a = ⊥;

and for arbitrary formulas,

(∧) a ⊩ ϕ ∧ ψ iff a ⊩ ϕ and a ⊩ ψ;
(∨) a ⊩ ϕ ∨ ψ iff either a ⊩ ϕ or a ⊩ ψ;
(&) a ⊩ ϕ&ψ iff there exist b, c ∈ X such that a ≤ b ∗ c, b ⊩ ϕ, and c ⊩ ψ;
(→) a ⊩ ϕ → ψ iff for each b ∈ X, if b ⊩ ϕ, then a ∗ b ⊩ ψ.
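The forcing clauses can be checked mechanically on a small finite frame. The sketch below (the five-element chain with ∗ = min, a finite Gödel-style frame, is an illustrative choice, not from the text) implements the clauses for ∧, ∨, & and →, and can be used to confirm, e.g., prelinearity (A12) at the top node:

```python
# A tiny linearly ordered frame on the chain X with top 1.0, bottom 0.0,
# and the (conjunctive, left-continuous) monoid operation * = min.
X = [0.0, 0.25, 0.5, 0.75, 1.0]
def star(a, b): return min(a, b)

def forces(a, phi, v):
    """phi is ('var', p) or (op, left, right) with op in {'and','or','fuse','imp'};
    v maps variable names to frame elements (a forces p iff a <= v[p])."""
    op = phi[0]
    if op == 'var':
        return a <= v[phi[1]]
    if op == 'and':
        return forces(a, phi[1], v) and forces(a, phi[2], v)
    if op == 'or':
        return forces(a, phi[1], v) or forces(a, phi[2], v)
    if op == 'fuse':   # clause (&): exists b, c with a <= b * c forcing each part
        return any(a <= star(b, c) and forces(b, phi[1], v) and forces(c, phi[2], v)
                   for b in X for c in X)
    if op == 'imp':    # clause (->): for all b forcing the antecedent, a * b forces the consequent
        return all(not forces(b, phi[1], v) or forces(star(a, b), phi[2], v)
                   for b in X)

v = {'p': 0.5, 'q': 0.25}
pl = ('or', ('imp', ('var', 'p'), ('var', 'q')), ('imp', ('var', 'q'), ('var', 'p')))
print(forces(1.0, pl, v))  # True: prelinearity holds at the top node
```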

Definition 5. (i) (Kripke model) A Kripke model is a pair (X, ⊩), where X is a Kripke frame and ⊩ is a forcing relation on X.

(ii) (L model) An L model is a pair (X, ⊩), where X is an L frame and ⊩ is a forcing relation on X.

Definition 6. Given a Kripke model (X, ⊩), a node a of X and a formula ϕ, we say that a forces ϕ to express a ⊩ ϕ. We say that ϕ is true in (X, ⊩) if ⊤ ⊩ ϕ, and that ϕ is valid in the frame X (expressed by X |= ϕ) if ϕ is true in (X, ⊩) for each forcing relation ⊩ on X.

134 E. Yang / Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics

Lemma 1. (Hereditary condition, HC) Let X be a Kripke frame. For every formula ϕ and for any two nodes a, b ∈ X, if a ⊩ ϕ and b ≤ a, then b ⊩ ϕ.

Proposition 1. Let v be an evaluation on an L frame X. For each atomic formula p and for each a ∈ X, let a ⊩ p iff a ≤ v(p). Then (X, ⊩) is an L model, and for each formula ϕ and for each a ∈ X, we have that a ⊩ ϕ iff a ≤ v(ϕ).

Proof. Here we consider the formulas (PL), (DIV), (DNE), (CTR) and (CAN) as examples.

(PL): By the condition (∨), it is sufficient to prove that ⊤ ⊩ ϕ → ψ or ⊤ ⊩ ψ → ϕ. By Proposition 1, we can instead show that ⊤ ≤ v(ϕ → ψ) or ⊤ ≤ v(ψ → ϕ). Proposition 1 also ensures v(ϕ → ψ) = v(ϕ) → v(ψ) for all formulas ϕ and ψ. If v(ϕ) ≤ v(ψ), then ⊤ ∗ v(ϕ) ≤ v(ψ) and thus ⊤ ≤ v(ϕ → ψ). If v(ψ) ≤ v(ϕ), then ⊤ ∗ v(ψ) ≤ v(ϕ) and thus ⊤ ≤ v(ψ → ϕ).

(DIV): Lemma 2 ensures that, in order to prove ⊤ ⊩ (ϕ ∧ ψ) → (ϕ&(ϕ → ψ)), it is sufficient to show that for each node a ∈ X, if a ⊩ ϕ ∧ ψ, then a ⊩ ϕ&(ϕ → ψ). By Proposition 1, we can instead assume a ≤ v(ϕ ∧ ψ) and show a ≤ v(ϕ&(ϕ → ψ)). Note that Proposition 1 also ensures v(ϕ ∧ ψ) = min{v(ϕ), v(ψ)} and v(ϕ&ψ) = v(ϕ) ∗ v(ψ) for all formulas ϕ and ψ. Then, since min{v(ϕ), v(ψ)} ≤ v(ϕ) ∗ (v(ϕ) → v(ψ)) by (DIVF), we have a ≤ v(ϕ&(ϕ → ψ)).

(DNE): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ¬¬ϕ, then a ⊩ ϕ. By Proposition 1, we instead assume a ≤ v(¬¬ϕ) and show a ≤ v(ϕ). Note that v(¬ϕ) = v(ϕ → ⊥) = ¬v(ϕ). Then, since a ≤ v(¬¬ϕ) = ¬¬v(ϕ) and ¬¬v(ϕ) ≤ v(ϕ) by (DNEF), we have a ≤ v(ϕ).

(CTR): As above, it is sufficient to prove that for each a ∈ X, if a ⊩ ϕ, then a ⊩ ϕ&ϕ. Let a ⊩ ϕ. By Proposition 1, we have a ≤ v(ϕ). Then, using monotonicity and (CTRF), we also have a ≤ a ∗ a ≤ v(ϕ) ∗ v(ϕ). Hence, by the condition (&) and Proposition 1, we obtain a ⊩ ϕ&ϕ.

(CAN): We need to show that either ⊤ ⊩ ϕ → ⊥ or ⊤ ⊩ (ϕ → (ϕ&ψ)) → ψ. Obviously, v(ϕ) = ⊥ ensures ⊤ ≤ v(ϕ → ⊥), since v(⊥ → ⊥) = v(⊥) → v(⊥) = v(⊤). Thus, by Proposition 1, we have ⊤ ⊩ ϕ → ⊥ in case v(ϕ) = ⊥. Let v(ϕ) ≠ ⊥. In order to prove ⊤ ⊩ (ϕ → (ϕ&ψ)) → ψ, we assume a ⊩ ϕ → (ϕ&ψ) and show a ⊩ ψ. By Proposition 1, we instead assume a ≤ v(ϕ → (ϕ&ψ)) and show a ≤ v(ψ). Then, since a ≤ v(ϕ → (ϕ&ψ)) = v(ϕ) → v(ϕ&ψ) = v(ϕ) → (v(ϕ) ∗ v(ψ)) and ⊤ = (v(ϕ) → (v(ϕ) ∗ v(ψ))) → v(ψ) by (CANF), we have a ≤ v(ψ).

We leave the proofs for the other cases to the interested reader.

E. Yang / Set-Theoretic Kripke-Style Semantics for Monoidal T-Norm (Based) Logics 135

Now, we provide completeness results for Ls. A theory T is said to be linear if, for each pair ϕ, ψ of formulas, we have T ⊢ ϕ → ψ or T ⊢ ψ → ϕ. By an L-theory, we mean a theory T closed under the rules of L. By a regular L-theory, we mean an L-theory containing all of the theorems of L. Since we have no use for irregular theories, by an L-theory we henceforth mean an L-theory containing all of the theorems of L.

Let T be a linear L-theory. We define the canonical L frame determined by T as a structure X = (Xcan, ⊤can, ⊥can, ≤can, ∗can), where ⊤can = T, ⊥can = {ϕ : T ⊢L ⊥ → ϕ}, Xcan is the set of linear L-theories extending ⊤can, ≤can is ⊇ restricted to Xcan, i.e., a ≤can b iff {ϕ : a ⊢L ϕ} ⊇ {ϕ : b ⊢L ϕ}, and ∗can is defined as a ∗can b := {ϕ&ψ : for some ϕ ∈ a, ψ ∈ b}, satisfying the integral commutative monoid properties corresponding to L frames on (Xcan, ⊤can, ≤can). Notice that we construct the base ⊤can as the linear L-theory that excludes nontheorems of L, i.e., excludes any formula ϕ such that ⊬L ϕ.
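Because the forcing and turnstile glyphs were lost in extraction, the canonical construction just given can be restated compactly; this is a reconstruction from the surrounding text, not a verbatim formula from the paper:

```latex
\[
\begin{aligned}
\top_{\mathrm{can}} &= T, \qquad
\bot_{\mathrm{can}} = \{\varphi : T \vdash_{L} \bot \to \varphi\},\\
X_{\mathrm{can}} &= \{\, a : a \text{ is a linear } L\text{-theory extending } \top_{\mathrm{can}} \,\},\\
a \le_{\mathrm{can}} b &\iff \{\varphi : a \vdash_{L} \varphi\} \supseteq \{\varphi : b \vdash_{L} \varphi\},\\
a *_{\mathrm{can}} b &= \{\, \varphi \,\&\, \psi : \varphi \in a,\ \psi \in b \,\}.
\end{aligned}
\]
```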

The linear ordering of the canonical L frame depends on ≤can restricted to Xcan. First, we can easily show the following.

Proof. It is easy to show that a canonical L frame is partially ordered. We show that this frame is connected and so linearly ordered. Suppose toward a contradiction that neither a ≤can b nor b ≤can a. Then, there are ϕ, ψ such that ϕ ∈ b, ϕ ∉ a, ψ ∈ a, and ψ ∉ b. Note that, since ⊤can is a linear theory, ϕ → ψ ∈ ⊤can or ψ → ϕ ∈ ⊤can. Let ϕ → ψ ∈ ⊤can and thus ϕ → ψ ∈ b. Then, by (mp), we have ψ ∈ b, a contradiction. The case where ψ → ϕ ∈ ⊤can is analogous.

Next, let vcan be a canonical evaluation function from formulas to sets of formulas, i.e., vcan(ϕ) = {ϕ}. We define a canonical evaluation as follows:

Lemma 3. ⊤can ⊩can ϕ → ψ iff for each a ∈ Xcan, if a ⊩can ϕ, then a ⊩can ψ.

Proof. By (a), we need to show that ϕ → ψ ∈ ⊤can iff for all a ∈ Xcan, if ϕ ∈ a, then ψ ∈ a. For the left-to-right direction, we assume ϕ → ψ ∈ ⊤can and ϕ ∈ a, and show ψ ∈ a. The definition of ∗can ensures (ϕ → ψ)&ϕ ∈ ⊤can ∗can a = a. Since L proves ((ϕ → ψ)&ϕ) → ψ, we have ((ϕ → ψ)&ϕ) → ψ ∈ ⊤can and thus ((ϕ → ψ)&ϕ) → ψ ∈ a. Therefore, we obtain ψ ∈ a by (mp). We prove the other direction contrapositively. Suppose ϕ → ψ ∉ ⊤can. We set a0 = {Z : there exists X ∈ ⊤can and ⊢can (X&ϕ) → Z}. Clearly, a0 ⊇ ⊤can and ϕ ∈ a0, but also ψ ∉ a0. (Otherwise, ⊢can (X&ϕ) → ψ and thus ⊢can X → (ϕ → ψ); therefore, since ⊢can X, by (mp), we have ⊢can ϕ → ψ, a contradiction.) Then, by the Linear Extension Property of Theorem 12.9 in [2], we have a linear theory a ⊇ a0 with ψ ∉ a; therefore ϕ ∈ a but ψ ∉ a.

evaluation.

For (AHC), we must show that: for each propositional variable p, if a ⊩can p and b ≤can a, then b ⊩can p.

Let a ⊩can p and b ≤can a. By (a), we have p ∈ a and a ⊆ b, and thus p ∈ b. Hence, by (a), we have b ⊩can p.

For (min), we must show that: for each propositional variable p,

⊥can ⊩can p.

By (a), we need to show that p ∈ ⊥can. Since ⊥can = {ϕ : T ⊢L ⊥ → ϕ}, we have p ∈ ⊥can.

We next consider the condition for the propositional constant ⊥.

For (⊥), we must show that:

a ⊩can ⊥ iff a = ⊥can.

By (a), we need to show that ⊥ ∈ a iff a = ⊥can. This is obvious since ⊥can = {ϕ : T ⊢L ⊥ → ϕ}.

Now we consider the conditions for arbitrary formulas.

For (∧), we must show

a ⊩can ϕ ∧ ψ iff a ⊩can ϕ and a ⊩can ψ.

By (a), we need to show that ϕ ∧ ψ ∈ a iff ϕ ∈ a and ψ ∈ a. We can prove the left-to-right direction using the axiom (∧-E) and the rule (mp). We can also prove the right-to-left direction using the rule (adj).

For (∨), we must show

a ⊩can ϕ ∨ ψ iff either a ⊩can ϕ or a ⊩can ψ.

By (a), we need to show that ϕ ∨ ψ ∈ a iff either ϕ ∈ a or ψ ∈ a. We can prove the left-to-right direction using the fact that a is linear and linear theories are also prime theories. We can also prove the right-to-left direction using the axiom (∨-I) and the rule (mp).

For (&), we must show

a ⊩can ϕ&ψ iff there exist b, c ∈ Xcan such that b ⊩can ϕ, c ⊩can ψ, and a ≤can b ∗can c.

By (a), we need to show that ϕ&ψ ∈ a iff there exist b, c ∈ Xcan such that ϕ ∈ b, ψ ∈ c, and a ≤can b ∗can c. For the right-to-left direction, we assume that there exist b, c ∈ Xcan such that ϕ ∈ b, ψ ∈ c, and a ≤can b ∗can c. Then, using the definition of ∗can, we obtain ϕ&ψ ∈ a. For the left-to-right direction, we assume that, for all b, c ∈ Xcan, if ϕ ∈ b and ψ ∈ c, then a ≤can b ∗can c does not hold, and we show ϕ&ψ ∉ a. Let ϕ ∈ b and ψ ∈ c. Since a ≤can b ∗can c does not hold, we obtain ϕ&ψ ∉ a.

For (→), we must show

a ⊩can ϕ → ψ iff for each b ∈ Xcan, if b ⊩can ϕ, then a ∗can b ⊩can ψ.

By (a), we need to show that ϕ → ψ ∈ a iff for each b ∈ Xcan, if ϕ ∈ b, then ψ ∈ a ∗can b. For the left-to-right direction, we assume ϕ → ψ ∈ a and ϕ ∈ b, and show ψ ∈ a ∗can b. The definition of ∗can ensures (ϕ → ψ)&ϕ ∈ a ∗can b. Then, since L proves ((ϕ → ψ)&ϕ) → ψ, using Lemma 3, we obtain ψ ∈ a ∗can b. We prove the right-to-left direction contrapositively. Suppose ϕ → ψ ∉ a. We need to construct a linear theory b such that ϕ ∈ b and ψ ∉ a ∗can b. Let b0 be the smallest regular L-theory extending ⊤can with {ϕ} and satisfying a ∗can b0 = {Z : there is X ∈ a and ⊢can (X&ϕ) → Z}. Clearly, ϕ ∈ b0, but ψ ∉ a ∗can b0. (Otherwise, ⊢can (X&ϕ) → ψ and thus ⊢can X → (ϕ → ψ) for some X ∈ a; therefore, ϕ → ψ ∈ a, a contradiction.) Then, by the Linear Extension Property, we can obtain a linear theory b such that b0 ⊆ b and a ∗can b = {Z : there is X ∈ a and ⊢can (X&ϕ) → Z}; therefore, ϕ ∈ b but ψ ∉ a ∗can b.

Recall that a model M for L is called an L model. Using Lemma 4, we can show that the canonically defined (X, ⊩can) is an L model. Then, by construction, ⊤can excludes our chosen nontheorem ϕ, and the canonical definition of ⊩ agrees with membership. Therefore, for each nontheorem ϕ of L, there exists an L model in which ϕ is not valid, namely one where ⊤can ⊮ ϕ. This gives us the following weak completeness of L.

Theorem 1. (Weak completeness) For any formula ϕ, if ϕ is valid in every L frame, then ⊢L ϕ.

Furthermore, using Lemma 4 and the Linear Extension Property, we can show the

strong completeness of L as follows.

Theorem 2. (Strong completeness) L is strongly complete w.r.t. the class of all L frames.

4. Concluding Remarks

In this paper, we have introduced set-theoretic Kripke-style semantics for monoidal t-norm (based) logics. But we have not yet considered such semantics for fuzzy logics based on more general structures. We will investigate this in a subsequent paper.

Acknowledgments: This work was supported by the Ministry of Education of the Repub-

lic of Korea and the National Research Foundation of Korea (NRF-2016S1A5A8018255).

References

[1] P. Cintula, R. Horčík and C. Noguera, Non-associative substructural logics and their semilinear extensions: axiomatization and completeness properties, Review of Symbolic Logic 6 (2013), 394-423.

[2] P. Cintula, R. Horčík and C. Noguera, The quest for the basic fuzzy logic, in: Petr Hájek on Mathematical Fuzzy Logic, F. Montagna, ed., Springer, Dordrecht, 2015, pp. 245-290.

[3] D. Diaconescu and G. Georgescu, On the forcing semantics for monoidal t-norm-based logic, Journal

of Universal Computer Science 13 (2007), 1550-1572.

[4] F. Esteva and L. Godo, Monoidal t-norm based logic: towards a logic for left-continuous t-norms, Fuzzy

Sets and Systems 124 (2001), 271-288.

[5] P. Hájek, Metamathematics of Fuzzy Logic, Kluwer, Amsterdam, 1998.

[6] F. Montagna and H. Ono, Kripke semantics, undecidability and standard completeness for Esteva and

Godo’s Logic MTL∀, Studia Logica 71 (2002), 227-245.

[7] F. Montagna and L. Sacchetti, Kripke-style semantics for many-valued logics, Mathematical Logic

Quarterly 49 (2003), 629-641.

[8] E. Yang, Algebraic Kripke-style semantics for relevance logics, Journal of Philosophical Logic 43

(2014), 803-826.

[9] E. Yang, Two kinds of (binary) Kripke-style semantics for three-valued logic, Logique et Analyse 231

(2015), 377-394.

[10] E. Yang, Algebraic Kripke-style semantics for substructural fuzzy logics, Korean Journal of Logic 19

(2016), 295-322.


Data Mining


Fuzzy Systems and Data Mining II 141

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-141

Dynamic Itemset Mining Under Multiple Support Thresholds

Nourhan ABUZAYED and Belgin ERGENÇ1

Computer Engineering Department, Izmir Institute of Technology, Urla, Izmir, Turkey

Abstract. The dynamic nature of databases and the different support requirements of items are two important challenges for frequent itemset mining algorithms. Existing dynamic itemset mining algorithms are devised for a single support threshold, whereas multiple support threshold algorithms assume that the databases are static. This paper focuses on the dynamic update problem of frequent itemsets under MIS (Multiple Item Support) thresholds and introduces the Dynamic MIS algorithm. It is i) tree based and scans the database once, ii) considers multiple support thresholds, and iii) handles increments of additions, additions with new items, and deletions. The proposed algorithm is compared to CFP-Growth++ and the findings are that, on dynamic databases, 1) Dynamic MIS performs better than CFP-Growth++ since it runs only on the increments, and 2) Dynamic MIS can achieve a speed-up of up to 56 times against CFP-Growth++.

multiple support thresholds

Introduction

Recently, intensive research has focused on association rule mining, which is one of the main functions of data mining [1]. The association rule was first introduced by Agrawal et al. [2] and is defined as "X% of the customers who buy item A also buy item B", denoted as A→B. Association rules are meant to find the impact of a set of items on another set of items. An itemset's (items that co-occur in a transaction) frequency is referred to as its support count, which is the number of transactions that contain the itemset. An itemset is frequent if its support count satisfies the minimum support (minsup) threshold [3]. The confidence of an association rule X→Y is the ratio of the number of transactions that contain X∪Y to the number of transactions that contain X [2, 4].
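The support and confidence definitions above can be sketched in a few lines of Python. This is an illustrative toy, not code from the paper; the function names and the sample database are ours:

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent:
    support(antecedent U consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Toy transaction database (hypothetical, not from the paper).
D = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B", "C"}]
print(support(D, {"A", "B"}))       # 0.5 (2 of 4 transactions)
print(confidence(D, {"A"}, {"B"}))  # 2/3: {A,B} in 2 of the 3 A-transactions
```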

Association rule mining has two main steps: 1) finding frequent itemsets/patterns, and 2) generating association rules [5]. The first step is the more expensive one, and several algorithms have been proposed to find the frequent itemsets in huge databases. The most classical one is the Apriori algorithm, which uses a candidate generation-and-test approach. Other subsequent algorithms using an Apriori-like technique were introduced in [6-12]. FP-Growth [4] and Matrix Apriori [13, 14] are more recent algorithms that try to overcome the drawbacks of candidate generation and multiple scans of the database.

1 Corresponding Author: Belgin ERGENÇ, Computer Engineering Department, Izmir Institute of Technology, Urla, Izmir, Turkey; Email: belginergenc@iyte.edu.tr.

142 N. Abuzayed and B. Ergenç / Dynamic Itemset Mining Under Multiple Support Thresholds

Two main drawbacks of these algorithms are 1) their dependence on a single user-given minsup and 2) their inapplicability to dynamic databases. A single support is not enough to represent the characteristics of the items and causes the rare item problem [15]. Some algorithms like MSapriori [16], CFP-Growth [17], CFP-Growth++ [18] and MISFP-Growth [19] were introduced to find frequent patterns under multiple support thresholds. For the second disadvantage, several algorithms have been introduced in [20-27]. These algorithms perform faster and use fewer system resources since they update frequent association rules by considering only the updates instead of repeating the whole mining process from the beginning.

All mentioned works handle either the dynamic itemset mining with single support

threshold or static itemset mining with multiple support thresholds. In this paper,

Dynamic MIS (Multiple Item Support) algorithm that provides a solution to the

dynamic itemset mining under multiple support thresholds problem is introduced. This

algorithm is tree based, scans the database only once, avoids the candidate generation

problem, and handles increments of additions, additions with new items and deletions

by using a dynamic MIS-tree. The closest work to ours is the incremental support tuning and tree maintenance mechanism introduced in [28]. The proposed algorithm, Dynamic MIS, is compared to CFP-Growth++ using four datasets. Findings reveal that, on dynamic databases, Dynamic MIS performs better than CFP-Growth++ since it runs only on the increments, and the speed-up gained by Dynamic MIS can reach up to 56 times with large sparse datasets.

The organization of this paper is as follows: Section 1 introduces the Dynamic MIS algorithm with its builder and increment handling parts. Section 2 presents the performance evaluation. Section 3 is dedicated to the concluding remarks.

1. Dynamic MIS Algorithm

The Dynamic MIS algorithm provides a solution to the dynamic itemset mining under multiple support thresholds problem by maintaining a dynamic MIS-tree and two header tables that keep the support counts of all items of the database. Frequent pattern generation from the tree is done by the related module of the CFP-Growth++ algorithm [18].

Throughout the section, we use the following example. Table 1 presents a sample database D and Table 2 illustrates the user-given multiple item support (MIS) for each item in decreasing order, together with each item's actual support in the database D. In the rightmost column of Table 1, the transactions' items are ordered by the MIS values given in Table 2.

Table 1. Transaction database D [17].

TID | Items bought    | Items bought (ordered)
100 | D, C, A, F      | A, C, D, F
200 | G, C, A, F, E   | A, C, E, F, G
300 | B, A, C, F, H   | A, B, C, F, H
400 | G, B, F         | B, F, G
500 | B, C            | B, C

Table 2. MIS and actual support of each item in D [17].

Item        | A  | B  | C  | D  | E  | F  | G  | H
MIS (%)     | 80 | 80 | 80 | 60 | 60 | 40 | 40 | 40
Support (%) | 60 | 60 | 80 | 20 | 20 | 80 | 40 | 20

To build the MIS-tree, the MIS-tree builder algorithm illustrated in Figure 1 is used.

First, the MIS sorted list is created from the MIS values in Table 2 and ordered in decreasing order (Line 1), then the root node of the tree is created (Line 2). Primary and secondary header tables are created (Line 3) as shown in Figure 2.

INPUT: Database D, Minimum item supports MIS

OUTPUT: MISsorted, MIS-tree

BEGIN

1 Build MISsorted list (in decreasing order)

2 Create the root of MIS-tree as null

3 Create primary and secondary header tables

4 Insert items into primary table (count=0)

5 Scan D

6 FOR each transaction T in D do:

7 Sort all items in T (as MISsorted)

8 Add T to the tree

9 END FOR

10 Calculate the support of items in D

11 Update the supports in the tables

12 Relocate items between header tables

END
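The builder pseudocode above can be sketched in Python. This is a hedged toy (class and function names are ours, not the authors' code): transactions are inserted into a prefix tree in descending MIS order and per-item support counts are tallied for the header tables; node links and the primary/secondary split are omitted.

```python
class Node:
    """One MIS-tree node: an item, a count, and child links."""
    def __init__(self, item):
        self.item, self.count, self.children = item, 0, {}

def build_mis_tree(database, mis):
    """Sort each transaction by descending MIS value, then insert it into
    a prefix tree; shared prefixes share nodes and accumulate counts."""
    order = sorted(mis, key=lambda i: -mis[i])      # the MISsorted list
    rank = {item: r for r, item in enumerate(order)}
    root = Node(None)
    header = {item: 0 for item in mis}              # header-table counts
    for t in database:
        node = root
        for item in sorted(t, key=rank.__getitem__):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
        for item in t:                              # per-item support count
            header[item] += 1
    return root, header

# The example database and MIS values of Tables 1 and 2.
D = [{"D", "C", "A", "F"}, {"G", "C", "A", "F", "E"},
     {"B", "A", "C", "F", "H"}, {"G", "B", "F"}, {"B", "C"}]
MIS = {"A": 80, "B": 80, "C": 80, "D": 60, "E": 60, "F": 40, "G": 40, "H": 40}
root, header = build_mis_tree(D, MIS)
print(root.children["A"].count, header["C"])  # 3 4
```

Three transactions start with A after sorting, so the A child of the root has count 3, matching the shared-prefix behaviour the text describes.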

After that, items are ordered as in MISsorted and then inserted into the primary header table with item count 0 (Line 4). Database D is scanned, and the transactions are added to the tree (Lines 5-9). First, the items in a new transaction are sorted in decreasing order according to the MISsorted list, as in the rightmost column of Table 1. Then the transaction is added to the tree; if the transaction shares a prefix with previous transactions, the counts of these prefix nodes are incremented by one; otherwise, new nodes are created starting from the root node with item count equal to one. The item's count in the node and in the header table is incremented. Nodes of the same item are linked all through the tree and to the header table. Supports of all items in D are calculated and then updated in the header tables. Eventually, the items are located in the two header tables; items with support not less than the MIN MIS value (40%) are inserted into the primary header table, and the rest are inserted into the secondary header table. Likewise, the node links are arranged.

Table 3. The incremental database d.

TID | Items   | Items (ordered)
1   | C, B, H | B, C, H
2   | G, B, F | B, F, G
3   | C, D, H | C, D, H

INPUT: MIS-tree, MISsorted, increment d

OUTPUT:Dynamic MIS-tree

BEGIN

1 Scan d

2 FOR each transaction T in d do:

3 Sort items in T (like MISsorted )

4 Add T to the tree

5 END FOR

6 Calculate the support of items

7 Update the supports in the tables

8 Relocate items between header tables

END

Figure 3. Update process in Dynamic MIS for additions. Figure 4. Dynamic MIS-tree after adding d.

The pseudo code of the update process for additions is given in Figure 3. When new transactions (Table 3) arrive, they are scanned to be added to the tree (Lines 2-5). First, the items in a new transaction are sorted in descending order of the MISsorted list, then the transactions are added to the tree one by one, as seen in Figure 4. Each item's count in the transaction is incremented in the primary table. Then, the nodes of the same item are linked all through the tree and to the header tables of the same figure. Supports of items are calculated and then updated in the header tables (Lines 6-7). Lastly, items are relocated between the header tables by comparing each item's support with the MIN MIS value (Line 8). The items A and G are transferred from the primary to the secondary header table, because their supports become less than the MIN MIS value (40%).
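The relocation step (Line 8) amounts to comparing each item's support percentage against the MIN MIS value. A minimal sketch, assuming the support counts are already tallied; the names `relocate` and `counts` are ours:

```python
def relocate(counts, n_transactions, min_mis):
    """Split items into primary/secondary header tables: items whose
    support (%) reaches the smallest MIS value stay primary."""
    primary, secondary = [], []
    for item, count in counts.items():
        pct = 100.0 * count / n_transactions
        (primary if pct >= min_mis else secondary).append(item)
    return primary, secondary

# Item counts after appending the Table 3 increment (8 transactions total).
counts = {"A": 3, "B": 5, "C": 6, "D": 2, "E": 1, "F": 5, "G": 3, "H": 3}
primary, secondary = relocate(counts, 8, 40)
print(sorted(primary))  # ['B', 'C', 'F']
```

With these counts, A and G sit at 37.5%, below the 40% MIN MIS, so they land in the secondary table, matching the transfer described in the text.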

The pseudo code of the update process for additions with new items is given in Figure 5. Let us explain this process by using the MIS-tree shown in Figure 2, the incremental database (with new items J, K, L) given in Table 5, and the MIS values of the new items given in Table 4. The first step is combining the new MIS values in Table 4 with the MIS values of the old items in Table 2 to get Table 6.

Table 4. MIS values for new items in d.

Item      | J   | K   | L
MIS value | 70% | 35% | 30%

Table 5. The incremental database d with new items J, K, L.

TID | Items bought     | Items bought (ordered)
1   | C, B, K, J, H, L | B, C, J, H, K, L
2   | K, H             | H, K
3   | K, B, C          | B, C, K

Table 6. MIS values of all items.

Item    | A  | B  | C  | J  | D  | E  | F  | G  | H  | K  | L
MIS (%) | 80 | 80 | 80 | 70 | 60 | 60 | 40 | 40 | 40 | 35 | 30

When new items appear, the MISsorted list is updated by adding the new MIS values in descending order (Line 1). After that, the new items in MISnew are appended to the primary header table with item count 0 (Line 2). These two lines are the main difference between additions and additions with new items.

INPUT: MIS-tree, MISsorted, MISnew, increment d
OUTPUT: MISsorted, Dynamic MIS-tree

BEGIN

1 Build MISsorted (MISsorted + MISnew)

2 Insert new items into primary header

table (count=0)

3 Scan d

4 FOR each transaction T in d do:

5 Sort items in T (like MISsorted )

6 Add T to the tree

7 END FOR

8 Calculate the support of all items

9 Update the supports in the tables

10 Relocate items between header tables

END

Figure 5. Dynamic MIS for additions with new items. Figure 6. Dynamic MIS-tree after adding d.

At the end, some items are transferred between the two header tables. Here, item G is transferred from primary to secondary because its new support (25%) is less than the new MIN MIS value (30%), and item H is transferred from secondary to primary because its new support (37%) is not less than the new MIN MIS value. Figure 6 presents the MIS-tree after adding the three new transactions.

Let us explain the pseudo code of the update process for deletions, which is shown in Figure 7, by using the increment of deletions shown in Table 7. This example is applied on the tree of Figure 2. The transactions in d are scanned and then deleted from the tree, as seen in Figure 8. Some items' counts are decremented. The supports of these items are recalculated and updated in the header tables of the tree. According to the new supports, some items are relocated between the header tables. In this example, the support of item G becomes 33.3%, which is less than the MIN MIS value (40%), so it is moved into the secondary header table.

Table 7. The increment of deletions d.

TID | Items bought | Items bought (ordered)
100 | D, C, A, F   | A, C, D, F
400 | B, F, G      | B, F, G

INPUT: MIS-tree, MISsorted, decrement d
OUTPUT: Dynamic MIS-tree

BEGIN

1 Scan d

2 FOR each transaction T in d do:

3 Sort items in T (like MISsorted)

4 Delete T from the tree

5 END FOR

6 Calculate the support of all items

7 Update the supports

8 Relocate items between header tables

END

Figure 7. Update process in Dynamic MIS for deletions. Figure 8. Dynamic MIS-tree after deletions.

Nodes with count 1 are decremented to 0 and thus deleted from the tree, but their records are kept in the corresponding header table. The resulting Dynamic MIS-tree is illustrated in Figure 8.

2. Performance Evaluation

Dynamic MIS is compared with the popular tree-based algorithm CFP-Growth++ [18].

Several experiments are executed on 4 datasets with different properties (T: average size of the transactions, D: number of transactions, N: number of items), as shown in Table 8. D1 and D4 are real datasets; D2 and D3 are synthetic. The density of a dataset indicates the similarity of its transactions. D3 is generated to be used only in the experiment of additions with new items.

Table 8. Properties of datasets.

Dataset         | Type      | T    | D      | N     | Density (%)
D1 (Retail)     | Real      | 10.3 | 88162  | 16470 | 0.06
D2 (T40I1D100K) | Synthetic | 40   | 100K   | 942   | 4.25
D3              | Synthetic | 1.1  | 100K   | 5356  | 0.02
D4 (Kosarak)    | Real      | 8.1  | 990002 | 41270 | 0.02


All experiments are executed on an Intel(R) Core i7-5500U CPU @ 2.40 GHz with 8 GB main memory and the Microsoft Windows 10 operating system. All programs are implemented in the C# environment.

For our experiments, we use two formulas [16] to assign MIS values to the items in the datasets:

M(i) = β f(i)

MIS(i) = M(i), if M(i) > LS; LS, otherwise

Here f(i) is the actual frequency of item i in the data, LS is the user-specified lowest minimum item support allowed, and β (0 ≤ β ≤ 1) is a parameter that controls how the MIS values for items should be related to their frequencies. If β = 0, we have only one minimum support, LS, which is the same as in traditional association rule mining. If β = 1 and f(i) ≥ LS, f(i) is the MIS value for i [16]. This formula is used to generate MIS values for algorithms which use multiple support thresholds, as in [16, 17, 18, 28].
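Since MIS(i) equals M(i) = β·f(i) whenever that exceeds LS and LS otherwise, the two formulas collapse into a single max. A small illustrative sketch (the item frequencies are made up, not from the paper):

```python
def assign_mis(freq, beta, ls):
    """MIS(i) = M(i) if M(i) > LS else LS, with M(i) = beta * f(i)."""
    return {item: max(beta * f, ls) for item, f in freq.items()}

# Hypothetical item frequencies f(i).
freq = {"A": 0.60, "F": 0.80, "D": 0.20}
print(assign_mis(freq, beta=0.5, ls=0.25))  # D is floored at LS = 0.25
```

With β = 0 every item is floored at LS, reproducing the single-threshold case the text mentions.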

The computational complexity of building the initial tree is the same for both algorithms: O(T * V), where T is the number of transactions and V the average transaction length. The complexity of the pruning procedure in CFP-Growth++ is O(N * C), where N is the number of nodes holding the items to be pruned and C is the number of their children. The merging procedure in CFP-Growth++ is O(N^2 * K), where N is the number of nodes in the tree and K is the number of node links. In Dynamic MIS, however, the pruning and merging procedures are replaced by the procedure of relocating items between header tables, which has a linear complexity of O(N), where N is the number of items to be transferred. The complexity of adding increments to the tree is O(T * V), where T is the number of incremental transactions and V the average transaction length.

The performance with additions is measured by dividing the dataset into two parts. The part with D = (100 - x)% of the transactions forms the initial dataset and the remaining part with d = x% of the transactions forms the increments. MIS values are kept constant. D1 has thirteen splits of 1%-13%, D2 has ten splits of 5%-50% and D4 has eighteen splits of 5%-90%.

Figure 9. Speed-up on Retail with additions. Figure 10. Speed-up with additions

The speed-up obtained by running Dynamic MIS instead of re-running CFP-Growth++ when the database is updated is shown in Figure 9 and Figure 10. The speed-up of Dynamic MIS ranges from 22.21 to 55.94 on D1 (Figure 9), from 1.56 to 1.35 on D2, and from 37.67 to 5.69 on D4, as seen in Figure 10. The reasons behind this speed-up are: 1) Dynamic MIS runs only on the increment whereas CFP-Growth++ runs from the beginning; 2) Dynamic MIS generates frequent patterns from the items of the primary header table only, whereas CFP-Growth++ requires pruning and merging of the MIS-tree.

This experiment uses D3, which is generated by the IBM_Quest_data_generator [29] so as to control the existence of new items that do not exist in the original dataset. Eighteen split sizes in the range of 5%-90% are used. LS and β are kept constant at 0.01 and 0.5, respectively. The number of new items in each split is constant and equal to 100. As shown in Figure 11, the speed-up decreases from 5.76 to 1.72 as the split size increases.

Figure 11. Speed-up with additions with new items. Figure 12. Speed-up with deletions.

The last comparison determines how the size of deletions affects the performance of the algorithm. Each split contains 20% of the transactions of the original dataset. MIS values are kept constant. The speed-up obtained by running Dynamic MIS instead of re-running CFP-Growth++ when the database is updated with deletions can be seen in Figure 12. The speed-up increases from 2.26 to 44.88 on D1, from 2.06 to 40.16 on D4 and from 1.12 to 1.25 on D2 as the split size decreases.

3. Conclusion

The single support threshold and the dynamic aspect of databases bring additional challenges to frequent itemset mining algorithms. The Dynamic MIS algorithm is proposed as a solution to the dynamic update problem of frequent itemset mining under multiple support thresholds. It is tree based, handles increments of additions, additions with new items and deletions, and is faster especially on large sparse databases.

Acknowledgements

This work is partially supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under ARDEB 3501 Project No. 114E779.

References

[1] M. Chen, J. Han, P.S. Yu, Data mining: An overview from a database perspective, IEEE Transactions on Knowledge and Data Engineering, 8(1996), 866–883.

[2] R. Agrawal, T. Imielinski, A. Swami, Mining association rules between sets of items in large databases,

In: ACM SIGMOD International conference on Management of data, USA (1993), 207–216.

[3] J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers (2006), 157–218.

[4] J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation, In: ACM SIGMOD

International Conference on Management of Data, ACM New York, USA (2000), 1–12.

[5] R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases, In: The 20th

International Conference on Very Large Data Bases, San Francisco, CA, USA (1994), 487–499.

[6] H. Mannila, H. Toivonen, A.I. Verkamo, Efficient algorithms for discovering association rules, In:

AAAI Workshop on KDD, Seattle, WA, USA (1994), 181–192.

[7] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, A.I. Verkamo, Fast discovery of association rules, In

Advances in KDD. MIT Press, 12(1996), 307–328.

[8] A. Savasere, E. Omiecinski, S.B. Navathe, An efficient algorithm for mining association rules in large

databases, In: The 21st VLDB Conference, Zurich, Switzerland (1995), 432–443.

[9] J.S. Park, M. Chen, P.S. Yu, An effective hash-based algorithm for mining association rules, In: ACM

SIGMOD International Conference on Management of Data, San Jose, CA, USA (1995), 175–186.

[10] R. Srikant, Q. Vu, R. Agrawal, Mining association rules with item constraints, In: ACM KDD

International Conference, Newport Beach, CA, USA (1997), 67–73.

[11] R.T. Ng, L.V.S. Lakshmanan, J. Han, A. Pang, Exploratory mining and pruning optimizations of

constrained associations rules, In: ACM-SIGMOD Management of Data, USA (1998), 13–24.

[12] G. Grahne, L. Lakshmanan, X. Wang, Efficient mining of constrained correlated sets, In: The 16th

International Conference on Data Engineering, San Diego, CA, USA (2000), 512–521.

[13] J. Pavon, S. Viana, S. Gomez, Matrix Apriori: Speeding up the search for frequent patterns, In: The

24th IASTED International Conference on Database and Applications, Austria (2006), 75–82.

[14] B. Yıldız, B. Ergenç, Comparison of two association rule mining algorithms without candidate

generation, In: The 10th IASTED International Conference on Artificial Intelligence and Applications,

Innsbruck, Austria (2010), 450–457.

[15] H. Mannila, Database methods for data mining, Tutorial for the 4th ACM SIGKDD International

Conference on KDD, New York, USA (1998).

[16] B. Liu, W. Hsu, Y. Ma, Mining association rules with multiple minimum supports, In: The 5th ACM

SIGKDD International Conference on KDD, San Diego, CA, USA (1999), 337–341.

[17] Y. Hu, Y. Chen, Mining association rules with multiple minimum supports: a new mining algorithm

and a support tuning mechanism, Decision Support Systems, 42(2006), 1–24.

[18] R.U. Kiran, P.K. Reddy, Novel techniques to reduce search space in multiple minimum supports-based

frequent pattern mining algorithms, In: The 14th International Conference on Extending Database

Technology, ACM, New York, USA, (2011), 11–20.

[19] S. Darrab, B. Ergenç, Frequent pattern mining under multiple support thresholds, In: The 16th Applied

Computer Science Conference, WSEAS Transactions on Computer Research, Turkey, 4(2016), 1–10.

[20] D.W. Cheung, J. Han, V.T. Ng, C.Y. Wong, Maintenance of discovered association rules in large

databases, An incremental updating technique, In: The 12th IEEE International Conference on Data

Engineering, New Orleans, Louisiana, USA, (1996), 106–114.

[21] D.W. Cheung, S.D. Lee, B. Kao, A general incremental technique for maintaining discovered

association rules, In: The 5th International Conference on Database Systems for Advanced

Applications, Melbourne, Australia, (1997), 185–194.

[22] D. Oğuz, B. Ergenç, Incremental itemset mining based on Matrix Apriori, DEXA-DaWaK, Vienna,

Austria, (2012), 192–204.

[23] D. Oğuz, B. Yıldız, B. Ergenç, Matrix based dynamic itemset mining algorithm, International Journal

of Data Warehousing and Mining, 9(2013), 62–75.

[24] Y. Aumann, R. Feldman, O. Lipshtat, H. Mannila, Borders: An efficient algorithm for association

generation in dynamic databases, Journal of Intelligent Information System, 12(1999), 61–73.

[25] S. Shan, X. Wang, M. Sui, Mining Association Rules: A continuous incremental updating technique,

In: International Conference on WISM, IEEE Computer Society, Sanya, China (2010), 62–66.

[26] B. Dai, P. Lin, iTM: An efficient algorithm for frequent pattern mining in the incremental database

without rescanning, In: The 22nd International Conference on Industrial, Engineering and Other

Applications of Applied Intelligent Systems, Tainan, Taiwan (2009), 757–766.

[27] W. Cheung, O.R. Zaiane, Incremental mining of frequent patterns without candidate generation or

support constraint, In: IDEAS, Hong Kong, China (2003), 111–116.

[28] F.A. Hoque, M. Debnath, N. Easmin, K. Rashad, Frequent pattern mining for multiple minimum

supports with support tuning and tree maintenance on incremental database, Research Journal of

Information Technology, 3(2011), 79–90.

[29] Frequent Itemset Mining Implementations Repository, http://fimi.ua.ac.be/data/

Fuzzy Systems and Data Mining II 149

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-149

for Credit Card Data Analysis

Ayahiko NIIMI 1

Faculty of Systems Information Science, Future University Hakodate,

2-116 Kamedanakano, Hakodate,

Hokkaido 041-8655, Japan

Abstract. In this study, two major applications are introduced to develop advanced deep learning methods for credit card data analysis. Credit card information is contained in two datasets: a credit approval dataset and a card transaction dataset. The credit card data have two problems. One problem is that, when using the credit approval dataset, it is necessary to combine multiple models, each referring to a different clustered group of users. The other problem is that, when using the card transaction dataset, since actual unauthorized credit card use is very rare, imprecise solutions do not allow the appropriate detection of fraud. To solve these problems, we propose a deep learning algorithm applied to the credit card datasets. The proposed methods are validated using benchmark experiments against other machine learning methods. To evaluate our proposed method, we use two credit card datasets: the credit approval dataset from the UCI machine learning repository and a randomly constructed credit transaction dataset. The experiments confirm that deep learning exhibits accuracy comparable to the Gaussian kernel support vector machine (SVM). The proposed methods are also validated using a large scale transaction dataset. Moreover, we apply our proposed method to a time-series benchmark dataset. Deep learning parameter adjustment is difficult; by optimizing the parameters, it is possible to increase the learning accuracy.

Keywords. Data Mining, Deep Learning, Credit Approval Dataset, Card Transaction

Dataset

Introduction

Deep learning is a state-of-the-art research topic in the machine learning field with applications for solving various problems [1, 2]. This paper investigates the application of deep learning in credit card data analysis.

Credit card data are mainly used in user and transaction judgments. User judgment determines whether a credit card should be issued to a user satisfying particular criteria. On the other hand, transaction judgment refers to whether a transaction is valid [3]. We determined the deep learning processes required for solving each of these problems, and we proposed appropriate methods for deep learning [4, 5].

To verify our proposed methods, we use benchmark experiments with other machine learning algorithms, which confirm that the accuracy of the deep learning methods is similar to that of the Gaussian kernel SVM. In the final section of this paper, we provide suggestions for future deep learning experiments.

1 Corresponding Author: Ayahiko Niimi, Faculty of Systems Information Science, Future University Hakodate, 2-116 Kamedanakano, Hakodate, Hokkaido 041-8655, Japan.

In our previous work, we only used a small-scale transaction dataset for the evaluation experiments, not a large-scale dataset [6]. In this paper, the proposed methods are also validated using a large-scale transaction dataset. Moreover, we apply our proposed method to a time-series benchmark dataset.

First, in Section 1, we introduce the characteristics of the credit card datasets. In Section 2, we introduce deep learning. In Section 3, we discuss the data processing infrastructure that is suitable for the analysis of credit card data. In Section 4, we describe the experiments, and the results are shown in Section 5. We discuss the results in Section 6. Finally, in Section 7, we describe conclusions and future work.

1. Credit Card Datasets

Credit card information is contained in two datasets:

1. the credit approval dataset;

2. the card transaction dataset.

For each user submitting a credit card creation application, there is a record of the decision to issue the card or to reject the application. This decision is based on the user's attributes, in accordance with general usage-trend models.

However, to reach this decision, it is necessary to combine multiple models, each referring to a different clustered group of users.

In actual credit card transactions, the data is complex, constantly changing, and continuously arrives online as follows:

(ii) Each transaction takes less than one second to complete.

(iii) Approximately one hundred transactions arrive per second during peak time.

(iv) Transaction data arrive continuously.

Therefore, credit card transaction data can be precisely called a data stream. However, even if we use data mining for such data, an operator can monitor only around 2,000 transactions per day. Therefore, we have to detect suspicious transaction data effectively by analyzing less than 0.02% of the total number of transactions. In addition, the fraud detection rate obtained by analyzing massive amounts of transaction data is extremely low, because real fraud occurs at an extremely low rate, i.e., within 0.02% to 0.05% of all transaction data.

In a previous paper, transaction data in CSV format were described as attributes in time order [3]. Credit card transaction data have 124 attributes; 84 of them are called transactional data, including an attribute used to discriminate whether the data refer to fraud; the others are called behavioral data, and they refer to credit card usage. The inflow file size is approximately 700 MB per month.

Mining the credit card transaction data stream presents inherent difficulties, since it requires performing efficient calculations on an unlimited data stream with limited computing resources. Therefore, many stream mining methods seek an approximate or probabilistic solution instead of an exact one. However, since actual unauthorized credit card use is very rare, these imprecise solutions do not allow the appropriate detection of fraud.

2. Deep Learning

Deep learning is a new technology that has recently attracted much attention in the field of machine learning. It significantly improves the accuracy of abstract representations by reconstructing deep structures such as the neural circuitry of the human brain. Deep learning algorithms have won various competitions, such as those associated with the International Conference on Learning Representations.

Deep learning is a generic term for multilayer neural networks, which have been researched for a long time [1, 2, 7]. Multilayer neural networks decrease the overall calculation time by performing calculations on hidden layers. Thus, they were prone to excessive overtraining, as an intermediate layer was often used for approximately every single layer.

However, technological advances suppressed overtraining, whereas GPU utilization and parallel processing increased the number of hidden layers.

A sigmoid or a tanh function was commonly used as an activation function (see Equations 1 and 2), although recently, a maxout function has also been used (Section 2.1). The dropout technique was implemented to prevent overtraining (Section 2.2).

h_i(x) = sigm(x^T W·i + b_i)  (1)

h_i(x) = tanh(x^T W·i + b_i)  (2)

2.1. Maxout

A maxout model is a feed-forward architecture, such as a multilayer perceptron or a deep convolutional neural network, that uses a new type of activation function: the maxout unit [2].

In particular, given an input x ∈ R^d (x may be v, or it may be a hidden layer's state), a maxout hidden layer implements the function

h_i(x) = max_{j ∈ [1, k]} z_ij

where z_ij = x^T W·ij + b_ij, and W ∈ R^{d×m×k} and b ∈ R^{m×k} are learned parameters. In a convolutional network, a maxout feature map can be constructed by taking the maximum across k affine feature maps (i.e., pooling across channels, in addition to spatial locations). When training with dropout, we perform the element-wise multiplication with the dropout mask immediately prior to the multiplication by the weights in all cases; inputs to the max operator are not dropped. A single maxout unit can be interpreted as a piecewise linear approximation of an arbitrary convex function. Maxout networks learn not just the relationship between hidden units, but also the activation function of each hidden unit.

Maxout abandons many of the mainstays of traditional activation function design. The representation it produces is not sparse at all, though the gradient is highly sparse, and dropout will artificially sparsify the effective representation during training. Although maxout may learn to saturate on one side or the other, this is a measure-zero event (so it is almost never bounded from above). Since a significant proportion of parameter space corresponds to functions bounded from below, maxout learning is not constrained at all. Maxout is locally linear almost everywhere, whereas many popular activation functions have significant curvature. Given all of these deviations from standard practice, it may seem surprising that maxout activation functions work at all, but we find that they are very robust, easy to train with dropout, and achieve excellent performance.

z_ij = x^T W·ij + b_ij  (5)
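The maxout unit described above can be sketched in a few lines of NumPy; the tensor shapes and the toy input below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def maxout_layer(x, W, b):
    """Maxout hidden layer: h_i(x) = max over j in [1, k] of z_ij,
    where z_ij = x^T W[:, i, j] + b[i, j].

    x: (d,) input; W: (d, m, k) weights; b: (m, k) biases.
    Returns an (m,) vector: the max over the k affine pieces per unit."""
    z = np.einsum('d,dmk->mk', x, W) + b  # all k affine maps for each of the m units
    return z.max(axis=1)                  # pool across the k pieces

rng = np.random.default_rng(0)
d, m, k = 4, 3, 2                         # toy sizes (assumed for illustration)
x = rng.normal(size=d)
W = rng.normal(size=(d, m, k))
b = rng.normal(size=(m, k))
h = maxout_layer(x, W, b)                 # one activation per hidden unit
```

With k = 2, a maxout unit can represent ReLU exactly: fixing one affine piece at zero and the other at the identity gives max(0, x), consistent with the piecewise-linear-approximation view above.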

2.2. Dropout

Dropout is applied to models that predict an output y given an input vector v [2]. In particular, these architectures contain a series of hidden layers h = {h(1), . . . , h(L)}. Dropout trains an ensemble of models consisting of subsets of the variables in both v and h. The same set of parameters θ is used to parameterize a family of distributions p(y|v; θ, μ), where μ ∈ M is a binary mask determining which variables to include in the model. On each example, we train a different submodel by following the gradient of log p(y|v; θ, μ) for a different randomly sampled μ. For many parameterizations of p (usually for multilayer perceptrons), the instantiation of the different submodels p(y|v; θ, μ) can be obtained by elementwise multiplication of v and h with the mask μ.

The functional form becomes important when the ensemble makes a prediction by averaging together all the submodels' predictions. Previous studies on bagging averaged using the arithmetic mean. However, this is not possible with the exponentially many models trained by dropout. Fortunately, some models easily yield a geometric mean. When p(y|v; θ) = softmax(v^T W + b), the predictive distribution defined by renormalizing the geometric mean of p(y|v; θ, μ) over M is simply given by softmax(v^T W/2 + b). In other words, the averaged prediction of exponentially many submodels can be computed simply by running the full model with the weights divided by two. This result holds exactly in the case of a single-layer softmax model. Previous work on dropout applies the same scheme in deeper architectures, such as multilayer perceptrons, where the W/2 method is only an approximation of the geometric mean. This approximation has not been characterized mathematically, but performs well in practice.
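The single-layer weight-halving result can be checked numerically by enumerating every dropout mask; the sizes and random values below are illustrative assumptions.

```python
import itertools
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
d, c = 5, 3                       # input size and class count (assumed)
v = rng.normal(size=d)
W = rng.normal(size=(d, c))
b = rng.normal(size=c)

# Renormalized geometric mean over all 2^d submodels p(y|v; theta, mu):
# each mask mu zeroes out a subset of the input variables.
log_probs = [np.log(softmax((v * np.array(mu)) @ W + b))
             for mu in itertools.product([0, 1], repeat=d)]
geo = np.exp(np.mean(log_probs, axis=0))
geo /= geo.sum()                  # renormalize the geometric mean

# The full model with the weights divided by two gives the same distribution.
halved = softmax(v @ (W / 2) + b)
assert np.allclose(geo, halved)
```

The agreement is exact here because renormalization cancels the per-mask partition functions, leaving softmax of the masks' average logits, and each input survives in exactly half of the masks.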


3. Data Processing Infrastructure

In this section, we consider the data processing infrastructure that is suitable for the analysis of credit card data, as well as the applications of deep learning to credit card data analysis.

3.1. R

R is a language and environment for statistical computing and graphics [8]. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R is available as free software and is widely used. It includes many useful libraries, for example for multivariate analysis and machine learning, and it is suitable for data mining.

However, R performs processing in memory; therefore, it is not suitable for processing large amounts of data.

Google BigQuery [9] and Amazon Redshift [10] are systems for handling queries over large amounts of data. These cloud systems can easily store a large amount of data and process it at high speed. Therefore, we can use them to analyze data trends interactively. However, data processing such as machine learning still needs to be further developed on these systems.

Apache Hadoop is also a platform for handling large amounts of data [11]. Apache Hadoop divides processing into Map and Reduce phases, which operate in parallel; the Map processes data, whereas the Reduce summarizes the results. In combination, these processes realize high-speed processing of large amounts of data. However, since processing is performed in batches, a Map/Reduce cycle cannot be completed before all data are stored. It is difficult to apply separate Map/Reduce algorithms to different batches. In particular, it is difficult to apply an algorithm repeatedly to the same data, as is required in machine learning.

Apache Storm is designed to process data streams [12]. For incessantly flowing data, data conversion is executed. The data source is called a Spout, and the component that performs the conversion process is called a Bolt. Apache Storm performs processing by a combination of Bolts fed from Spouts.

Apache Spark is also a platform that processes large amounts of data [13]. Apache Spark generalizes Map/Reduce processing. It processes data by caching working memory, and it is designed to execute efficient iterative algorithms by maintaining shared data, used for repeated processing, in memory. In addition, a library of machine learning and graph algorithms is provided, making it easy to build an environment for stream data mining.

H2O is a library of deep learning for Spark [14, 15].

SparkR is an R package that provides a lightweight frontend for Apache Spark from R [16]. In Spark 1.5.0, SparkR provides a distributed data frame implementation that supports operations such as selection, filtering, and aggregation (similar to R data frames and dplyr), but on large datasets. SparkR also supports distributed machine learning using MLlib.

In the present paper, we perform credit card data analysis using R and Spark. This makes it possible to use R's extensive libraries while gaining high performance from Spark's parallel and distributed processing.

4. Experiments

We used the credit approval dataset from the UCI Machine Learning Repository to evaluate the experimental results [4].

All attribute names and values were reassigned to meaningless symbols to protect data confidentiality.

In addition, the original dataset contains missing values. In the experiment, we use a pre-processed dataset [17], as presented in Table 1.

Table 1. Credit approval dataset.

Number of instances: 690
Number of attributes: 15 + class attribute
Class distribution: +: 307 (44.5%), −: 383 (55.5%)
Number of instances for training: 590
Number of instances for test: 100

Deep learning uses the R library of H2O [14, 15]. H2O is a library for Hadoop and Spark, but it also has an R package.

For comparison, we also use five typical machine learning algorithms. In addition, the deep learning parameters (activation function and dropout parameter) are changed five times. In this experiment, the hidden layer neurons are set to (100, 100, 200) for deep learning. The parameters used are shown in Table 2.

XGBoost is an optimized general-purpose gradient boosting library [18]. The library is parallelized and provides an optimized distributed version. It implements machine learning algorithms under the gradient boosting framework, including a generalized linear model and gradient-boosted decision trees. XGBoost can also be distributed and scaled to terascale data.

The activation functions used here are summarized in Table 3 [15].

Moreover, to ascertain whether there is a bias in the results of the training data

and the test data, we perform 10-fold cross-validation using the entire dataset. In this

experiment, the hidden layer neurons are set to (200, 200, 200).

In the experiment, we use the following environment.


Table 2. Algorithms.

Deep learning: Rectifier with Dropout; Rectifier; Tanh with Dropout; Tanh; Maxout with Dropout; Maxout
Logistic Regression
Support Vector Machine: Gaussian Radial Basis Function; Linear SVM
Random Forest
XGBoost

Table 3. Activation functions.

Function | Formula
Tanh | f(α) = (e^α − e^{−α}) / (e^α + e^{−α})
Rectified Linear | f(α) = max(0, α)
Maxout | f(·) = max_i(w_i x_i + b), rescaled if max f(·) ≥ 1
Tanh with Dropout | Tanh with dropout
Rectifier with Dropout | Rectified Linear with dropout
Maxout with Dropout | Maxout with dropout
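As a quick sanity check, the Tanh and Rectified Linear formulas in Table 3 can be written directly in NumPy; maxout and its rescaling rule are omitted here since the exact form of the rescaling is not spelled out in the text.

```python
import numpy as np

def tanh_act(a):
    # Table 3: f(alpha) = (e^alpha - e^(-alpha)) / (e^alpha + e^(-alpha))
    return (np.exp(a) - np.exp(-a)) / (np.exp(a) + np.exp(-a))

def rectified_linear(a):
    # Table 3: f(alpha) = max(0, alpha)
    return np.maximum(0.0, a)

a = np.linspace(-3.0, 3.0, 7)
assert np.allclose(tanh_act(a), np.tanh(a))                      # matches the library tanh
assert np.allclose(rectified_linear(a), np.where(a > 0, a, 0.0))
```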

• CPU Intel Xeon 3.3 GHz

• Memory 1GB

• Disk 8GB SSD

• R version 3.2.2

• H2O version 3.0.0.30

In this paper, the proposed methods are also validated using a large-scale transaction dataset. We constructed a dataset from the actual card transaction dataset; it contains the same number of attributes (130 attributes), with the value of each attribute generated at random within the same range. The dataset has about 300,000 transactions, which include about 3,000 illegal usages. We constructed a dataset covering six months of data for the experiment. Because this dataset has random values, it cannot be used to evaluate accuracy; we used it to estimate machine specifications and calculation times.

The percentage of fraud in the dataset is very low. We therefore used all illegal usages (approximately 3,000) and sampled normal usages (approximately 3,000) in the experiment.

We used an Amazon EC2 r3.8xlarge instance (32 cores, 244 GB memory) for the experiment. As a preliminary experiment, deep learning parameters of hidden layer neurons (100, 100, 200) and epochs (200) were used, but the learning did not converge. Therefore, in the experiment, deep learning parameters of hidden layer neurons (2048, 2048, 4096), epochs (2000), and hidden dropout ratios (0.75, 0.75, 0.7) were used. "Maxout with Dropout" is used as the activation function.

The experimental results are currently being analyzed.

For comparison with the proposed method, we evaluate it using public time-series benchmark data. We used the gas sensor dataset from the UCI Machine Learning Repository [4, 19, 20].

We are going to run the experiment, tune the parameters, and analyze the obtained results.

5. Experimental Results

Table 4 shows the experimental results. We ran each algorithm five times, and Table 4 presents the average. Because the non-deep-learning algorithms that we used have no initial-value dependence, their results are the same in all five runs.

Table 4. Experimental results.

Algorithm | Error Rate
Rectifier with Dropout (Deep Learning) | 18.4
Rectifier (Deep Learning) | 18.8
Tanh with Dropout (Deep Learning) | 17.6
Tanh (Deep Learning) | 22.8
Maxout with Dropout (Deep Learning) | 12.4
Maxout (Deep Learning) | 16.2
Logistic Regression | 18.0
Gaussian Kernel SVM | 11.0
Linear Kernel SVM | 14.0
Random Forest | 14.0
XGBoost | 14.0

The deep learning results depend on the initial parameters. Deep learning with Maxout with Dropout produces accuracy close to that of the Gaussian kernel SVM.

Table 5 shows the results of the 10-fold cross-validation. N and Y are the class attributes. Stable results are obtained regardless of the data split.


Table 5. 10-fold cross-validation results.

N | Y | Error Rate
332 | 287 | 0.138934 (= 86/619)
337 | 280 | 0.132901 (= 82/617)
333 | 277 | 0.136066 (= 83/610)
318 | 302 | 0.135484 (= 84/620)
325 | 295 | 0.156452 (= 97/620)
306 | 316 | 0.180064 (= 112/622)
338 | 296 | 0.140379 (= 89/634)
336 | 281 | 0.136143 (= 84/617)
327 | 301 | 0.141720 (= 89/628)
318 | 305 | 0.146067 (= 91/623)
Average error rate: 0.144421

6. Considerations

The presently conducted experiments confirm that deep learning has accuracy comparable to the Gaussian kernel SVM. In addition, the 10-fold cross-validation experiment indicates that deep learning offers high precision regardless of the data split.

In this experiment, we used the H2O library for deep learning, and the deep learning modules written in Java were activated each time. Therefore, we could not assess the execution time.

Deep learning parameter adjustment is difficult. By optimizing the parameters, it is possible to increase the learning accuracy.

There are some different approaches for time-series datasets [21, 22]. These approaches differ from the proposed method, but they are useful for improving it.

7. Conclusion

In this paper, we consider the application of deep learning to credit card data analysis. We introduce two major applications and propose methods for deep learning. To verify our proposed methods, we use benchmark experiments with other machine learning algorithms. Through these experiments, it is confirmed that deep learning has accuracy comparable to the Gaussian kernel SVM. The proposed methods are also validated using a large-scale transaction dataset.

In the future, we will consider an evaluation experiment using the transaction data and real datasets.

Acknowledgment

The authors would like to thank Intelligent Wave Inc. for many comments on the credit card transaction datasets.


References

[1] Y. Bengio. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(2009), 1–127.

[2] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio. Maxout Networks. ArXiv

e-prints, Feb., 2013.

[3] T. Minegishi and A. Niimi. Detection of Fraud Use of Credit Card by Extended VFDT, in World

Congress on Internet Security (WorldCIS-2011), London, UK, Feb., (2011), 166–173.

[4] M. Lichman. UCI Machine Learning Repository. (2013), (Access Date: 15 September, 2015). [Online].

Available: http://archive.ics.uci.edu/ml

[5] T. J. OZAKI. Data scientist in ginza, tokyo. (2015), (Access Date: 15 September, 2015). [Online]. Avail-

able: http://tjo-en.hatenablog.com/

[6] A. Niimi. Deep Learning for Credit Card Data Analysis, in World Congress on Internet Security

(WorldCIS-2015), Dublin, Ireland, Oct., (2015), 73–77.

[7] Q. Le. Building High-Level Features using Large Scale Unsupervised Learning. in Acoustics, Speech

and Signal Processing (ICASSP), 2013 IEEE International Conference on, May, (2013), 8595–8598.

[8] R: The R project for statistical computing. (Access Date: 15 September, 2015). [Online]. Available:

https://www.r-project.org/

[9] Google cloud platform. what is BigQuery? - Google BigQuery. (Access Date: 15 September, 2015).

[Online]. Available: https://cloud.google.com/bigquery/what-is-bigquery

[10] AWS Amazon Redshift. Cloud Data Warehouse Solutions. (Access Date: 15 September, 2015). [Online].

Available: https://aws.amazon.com/redshift/

[11] Apache Hadoop. Welcome to Apache Hadoop! (Access Date: 15 September, 2015). [Online]. Available:

https://hadoop.apache.org/

[12] Apache Storm. Storm, distributed and fault-tolerant realtime computation. (Access Date: 15 September,

2015). [Online]. Available: https://storm.apache.org/

[13] Apache Spark. Lightning-Fast Cluster Computing. (Access Date: 15 September, 2015). [Online]. Avail-

able: https://spark.apache.org/

[14] 0xdata — H2O.ai — Fast Scalable Machine Learning. (Access Date: 15 September, 2015). [Online].

Available: http://h2o.ai/

[15] A. Candel and V. Parmar. Deep Learning with H2O. H2O, (2015), (Access Date: 15 September, 2015).

[Online]. Available: http://learnpub.com/deeplearning

[16] SparkR (R on Spark) — Spark 1.5.0 Documentation. (Access Date: 15 September, 2015). [Online].

Available: https://spark.apache.org/docs/latest/sparkr.html

[17] T. J. OZAKI. Credit Approval Data Set, modiﬁed. (2015), (Access Date: 15 September, 2015).

[Online]. Available: https://github.com/ozt-ca/tjo.hatenablog.samples/tree/

master/r_samples/public_lib/jp/exp_uci_datasets/card_approval

[18] dmlc XGBoost extreme Gradient Boosting. (Access Date: 15 September, 2015). [Online]. Available:

https://github.com/dmlc/xgboost

[19] A. Vergara, S. Vembu, T. Ayhan, M. Ryan, M. Homer, and R. Huerta. Chemical Gas Sensor Drift Com-

pensation using Classiﬁer Ensembles. Sensors and Actuators B: Chemical, 166(1), (2012), 320–329.

[20] I. Rodriguez-Lujan, J. Fonollosa, A. Vergara, M. Homer, and R. Huerta. On the Calibration of Sensor Ar-

rays for Pattern Recognition using the Minimal Number of Experiments. Chemometrics and Intelligent

Laboratory Systems, 130, (2014), 123–134.

[21] S. Yin, X. Xie, J. Lam, K. C. Cheung, and H. Gao. An Improved Incremental Learning Approach for

KPI Prognosis of Dynamic Fuel Cell System. IEEE Transactions on Cybernetics, PP(99), (2015), 1–10.

[22] S. Yin, H. Gao, J. Qiu, and O. Kaynak. Fault Detection for Nonlinear Process with Deterministic Dis-

turbances: A Just-In-Time Learning Based Data Driven Method. IEEE Transactions on Cybernetics,

PP(99), (2016), 1–9.

Fuzzy Systems and Data Mining II 159

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-159

Probabilistic Frequent Itemset Mining Algorithm over Uncertain Databases with Sampling

Hai-Feng LI 1, Ning ZHANG, Yue-Jin ZHANG and Yue WANG

School of Information, Central University of Finance and Economics, Beijing 100081, China

Abstract. Uncertain data is data accompanied by probabilities, which makes frequent itemset mining more challenging. Given the data size n, computing the probabilistic support needs O(n(log n)²) time complexity and O(n) space complexity. This paper focuses on the problem of mining probabilistic frequent itemsets over uncertain databases and proposes the PFIMSample algorithm. We employ the Chebyshev inequality to estimate the frequency of items, which decreases part of the computation from O(n(log n)²) to O(n). In addition, we propose a sampling technique to improve the performance. Our extensive experimental results show that our algorithm can achieve significantly improved runtime cost and memory cost with high accuracy.

sampling

Introduction

The restraint of physical factors, data preprocessing, and data privacy protection methods all bring uncertainty to data, which is significant for continuously arriving data [1]. By introducing the probability of data occurrence, we can improve the robustness of data mining methods and guarantee that the data analysis achieves exact and precise knowledge, which is much more valuable for user decisions. Frequent itemset mining algorithms over certain databases have achieved many good results [2-4]. Nevertheless, the uncertainty of data [5, 6] brings new challenges.

According to the different definitions of a frequent itemset over uncertain data, the mining methods can be categorized into two types: one is based on the expected support and the other is based on the probabilistic support [7-22]. The methods based on the expected support mainly use the expectation of the itemset support to evaluate whether an itemset is frequent; the methods based on the probabilistic support consider an itemset frequent when its support is larger than the minimum support with a specified high probability. If the database size is n, then the former has O(n) time complexity and O(1) space complexity, and the latter has O(n(log n)²) time complexity and O(n) space complexity [11]. Clearly, the former has much higher performance. The latter, however, can represent the probabilistic characteristics of frequent itemsets.

1 Corresponding Author: Hai-Feng LI, School of Information, Central University of Finance and Economics, Beijing 100081, China; E-Mail: mydlhf@cufe.edu.cn.
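Under the usual item-independence assumption, the expected support named above is the sum over transactions of the product of the itemset's item probabilities, which is the O(n) computation; the toy uncertain database below is an illustrative assumption, not data from the paper.

```python
def expected_support(uncertain_db, itemset):
    """Expected support of `itemset` over an uncertain database.

    Each transaction is a dict mapping item -> existence probability.
    E[support] = sum over transactions of the product of the item
    probabilities (an absent item contributes probability 0), assuming
    items occur independently."""
    total = 0.0
    for tx in uncertain_db:
        p = 1.0
        for item in itemset:
            p *= tx.get(item, 0.0)
        total += p
    return total

# Toy uncertain database (illustrative values).
UD = [{"A": 0.6, "B": 0.7},
      {"A": 0.2, "C": 0.3}]
assert abs(expected_support(UD, {"A"}) - 0.8) < 1e-12
assert abs(expected_support(UD, {"A", "B"}) - 0.42) < 1e-12
```

One pass over the n transactions suffices, which is why expected-support mining is so much cheaper than probabilistic-support mining.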

Since the computation of probabilistic support is complicated, it is more challenging. In this paper, we focus on this problem and propose a frequency estimation method based on the Chebyshev inequality and a sampling method to implement the approximate computation of probabilistic support, and we guarantee the accuracy by theoretical analysis. We also use experiments to verify this method.

This paper is organized as follows. Section 1 introduces the preliminaries of frequent itemset mining. Section 2 proposes our PFIMSample algorithm in detail. Section 3 presents the experimental results. Section 4 concludes this paper.

1. Preliminaries

Let Γ = {x1, x2, …, xn} be the set of items; n is the size of Γ. We call an itemset X with size k a k-itemset. Assuming X has items xt (0 < t ≤ k), each with a probability pt, then X is an uncertain itemset, denoted as X = {x1, p1; x2, p2; …; xk, pk}. For an uncertain dataset UD = {UT1, UT2, …, UTv}, each UTi (i = 1 … v) denotes a transaction based on Γ, which has an id and the corresponding itemset X, denoted as (tid, X). Figure 1 is a simple uncertain dataset which, using the possible world model, can be converted into multiple certain datasets, each with a probability; each certain dataset is called a possible world.

Definition 1 (Count Support): Given the uncertain database UD and an itemset X, the occurrence count of X is called the count support of X, denoted as Λ_UD(X), or Λ(X) for short.

Definition 2 (Possible World) [9]: Given the uncertain database UD, a generated possible world PW has |UD| transactions, each transaction Ti being a subset of UTi; it is denoted as PW = {T1, T2, …, T|UD|}, in which Ti ⊆ UTi.

Provided the uncertain transactions are independent, the probability of a possible world, p(PW), can be computed by the following method. If an item x exists in both Ti and UTi, then we take the probability of x, p(x); if x exists in UTi but not in Ti, then we take the probability of its absence, p(x̄) = 1 − p(x). We then multiply all these probabilities. The computing equation is

p(PW) = Π_i ( Π_{x ∈ Ti} p(x) ) ( Π_{x ∈ UTi, x ∉ Ti} p(x̄) ).

Using Ψ to denote the possible worlds generated from UD, the size of Ψ increases exponentially w.r.t. the size of UD. That is, if UD has m transactions, and transaction i has ni items, then Ψ has 2^{Σ_{i=1}^{m} ni} possible worlds.


The left image of Figure 1 shows an uncertain dataset with 2 transactions, each with two items. As can be seen in the right image of Figure 1, there are 2^{2+2} = 16 possible worlds, each of which has an occurrence probability. As an example, possible world PW6 has two transactions, T1 and T2, which are both {A}. Then the probability of PW6 is

p(PW6) = p(A ∈ T1) · p(B ∉ T1) · p(A ∈ T2) · p(C ∉ T2) = 0.6 × 0.3 × 0.2 × 0.7 = 0.0252.

As can be seen, the probabilities of all the possible worlds sum to 1.
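The possible-world semantics can be checked by brute-force enumeration on a small two-transaction database; the item probabilities below are chosen to be consistent with the worked example, but since Figure 1 itself is not reproduced in the text, treat them as an assumed reconstruction.

```python
import itertools

# Assumed reconstruction of the Figure 1 uncertain database.
UD = [{"A": 0.6, "B": 0.7},
      {"A": 0.2, "C": 0.3}]

def possible_worlds(ud):
    """Yield (world, probability) pairs: a world picks a subset T_i of each
    uncertain transaction UT_i, with independent item occurrences."""
    per_tx = []
    for tx in ud:
        items = sorted(tx)
        subsets = []
        for r in range(len(items) + 1):
            for sub in itertools.combinations(items, r):
                p = 1.0
                for x in items:
                    p *= tx[x] if x in sub else (1.0 - tx[x])
                subsets.append((frozenset(sub), p))
        per_tx.append(subsets)
    for combo in itertools.product(*per_tx):
        world = tuple(s for s, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob

worlds = list(possible_worlds(UD))
assert len(worlds) == 16                             # 2^(2+2) possible worlds
assert abs(sum(p for _, p in worlds) - 1.0) < 1e-12  # probabilities sum to 1
# The world where both transactions are exactly {A}, as in the PW6 example:
pw = dict(worlds)[(frozenset({"A"}), frozenset({"A"}))]
assert abs(pw - 0.6 * 0.3 * 0.2 * 0.7) < 1e-12       # = 0.0252
```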

In the uncertain database UD, a frequent itemset is defined through the possible world model. If the itemset X has support Λ_PW(X) in a possible world PW, then its probability p_PW(X) is the probability of PW, p(PW). We can use a 2-tuple <Λ_PW(X), p_PW(X)> to denote it. In UD, X has 2^{Σ_{i=1}^{m} ni} such tuples, which can be summarized in the probability vector Λ^P(X).

Definition 3 (Probabilistic Frequent Itemset) [10, 23]: Given the uncertain database UD, the minimum support λ and the minimum probabilistic confidence τ, an itemset X is a (λ, τ)-probabilistic frequent itemset if the probabilistic support Λ^P_τ(X) ≥ λ, in which Λ^P_τ(X) = max{ i | P(Λ(X) ≥ i) > τ }.

For an uncertain database with size n, we can use a divide-and-conquer method [11], which has time complexity O(n(log n)²) and space complexity O(n). As can be seen, n is the key factor that determines the computing efficiency. If we can decrease n, then the runtime cost will decrease linearly.
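Definition 3 can be made concrete with a straightforward O(n²) dynamic program over the per-transaction occurrence probabilities (the support follows a Poisson binomial distribution); the divide-and-conquer method of [11] reduces this cost to O(n(log n)²). The probabilities below are illustrative.

```python
def support_distribution(probs):
    """P(support = i) for i = 0..n, where probs[t] is the probability that
    the itemset occurs in transaction t (Poisson binomial; O(n^2) DP)."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for i, q in enumerate(dist):
            new[i] += q * (1.0 - p)   # itemset absent from this transaction
            new[i + 1] += q * p       # itemset present in this transaction
        dist = new
    return dist

def probabilistic_support(probs, tau):
    """Largest i with P(support >= i) > tau, as in Definition 3."""
    dist = support_distribution(probs)
    tail = 0.0
    for i in range(len(dist) - 1, 0, -1):
        tail += dist[i]               # tail = P(support >= i)
        if tail > tau:
            return i
    return 0

# Toy per-transaction occurrence probabilities (illustrative values):
probs = [0.9, 0.8, 0.5]
assert abs(sum(support_distribution(probs)) - 1.0) < 1e-12
assert probabilistic_support(probs, 0.5) == 2   # P(support >= 2) = 0.85 > 0.5
```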


According to the law of large number, when n is large enough, the data tend to fit

the normal distribution, with which we proposed our mining algorithm PFIMSample

based on the sampling method. We described the detail as follows.

1) To scan the database and get the data statistics characteristics, that is, the average

and the variance of the itemset probability.

2) To scan the database and compute the count support and the expected support, in

which the expected support, is the sum of the probabilities.

3) For a given sampling parameter, apply random sampling over the database so that the acquired data follows the normal distribution. Since we assume the uncertain database itself approximately follows a normal distribution when the data is massive enough, we use simple systematic sampling, which preserves a similar distribution while guaranteeing mining efficiency. On the other hand, sampling shrinks the database, which may reduce the mining accuracy; we therefore evaluate this in our experiments and find that the accuracy is not correlated with the sampling rate. For each item, we scan the sampled database and compute its probabilistic support; if it exceeds the minimum support, the item is a frequent 1-itemset.

4) Join the probabilistic frequent n-itemsets to generate candidate (n+1)-itemsets, and compute their probabilistic support to determine whether they are frequent.

5) Repeat phase 4) until no new probabilistic frequent itemsets are generated, then output the results.
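The five steps above can be sketched as an Apriori-style loop over a random sample. For brevity the sketch scores candidates by expected support rather than the full probabilistic support, and assumes item independence; the database contents and parameter values are invented:

```python
import random

def pfim_sample_sketch(udb, min_sup, rate, seed=0):
    """Sketch of steps 3)-5): mine itemsets on a random sample of the
    uncertain database, growing candidates level by level."""
    rng = random.Random(seed)
    sample = [t for t in udb if rng.random() < rate] or udb
    scale = len(udb) / len(sample)

    def expected_support(itemset):
        total = 0.0
        for trans in sample:
            p = 1.0
            for item in itemset:
                p *= trans.get(item, 0.0)     # independence assumption
            total += p
        return total * scale                  # extrapolate to full database size

    items = sorted({i for t in udb for i in t})
    frequent = {frozenset([i]) for i in items
                if expected_support([i]) >= min_sup}
    result = set(frequent)
    while frequent:                           # steps 4)-5): grow (n+1)-candidates
        size = len(next(iter(frequent))) + 1
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == size}
        frequent = {c for c in candidates if expected_support(c) >= min_sup}
        result |= frequent
    return result

demo = pfim_sample_sketch(
    [{"A": 0.9, "B": 0.8}, {"A": 0.9, "B": 0.7}, {"A": 0.8}],
    min_sup=1.3, rate=1.0)
print(frozenset({"A", "B"}) in demo)          # True
```

With a sampling rate of 1 the sketch degenerates to mining the full database, matching the paper's remark that TODIS corresponds to sampling rate 1.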

In the PFIMSample algorithm, when a candidate itemset is generated, we scan the original database rather than the sampled one to compute its probabilistic support, so as to guarantee accuracy; consequently, the computing cost is O(n(log n)²). We use a heuristic-rule-based pruning strategy in phase 3).

According to the Chebyshev inequality, a random variable X with expected support E(X) and standard deviation D(X) satisfies, for any constant ε > 0, P( |X − E(X)| ≥ ε ) ≤ D²(X) / ε². That is, for an arbitrary dataset, the fraction of the probability mass that lies within m·D(X) of the expected support is at least 1 − 1/m², where m is a positive number larger than 1. For example, if m = 5, then with probability at least 1 − 1/25 = 96% the support is larger than E(X) − 5·D(X). Thus, before determining the frequency of a given X, we first compute its expected support and standard deviation; if E(X) − m·D(X) is larger than the minimum support, then X is a probabilistic frequent itemset with probability at least 1 − 1/m². Since computing the expected support costs O(n), far less than computing the probabilistic support, we can prune itemsets efficiently, and this benefit grows with larger n.
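The pruning rule reduces to a single check; the numeric example below is invented:

```python
def chebyshev_prune(expected, stddev, min_sup, m=5):
    """Chebyshev-based pruning as described in the text: if
    E(X) - m*D(X) >= min_sup, accept X as probabilistic frequent with
    probability at least 1 - 1/m**2 and skip the expensive O(n log^2 n)
    probabilistic-support computation; otherwise fall through to it."""
    if m <= 1:
        raise ValueError("m must be > 1")
    if expected - m * stddev >= min_sup:
        return True, 1.0 - 1.0 / (m * m)   # accept with guarantee
    return False, 0.0                       # needs the exact computation

print(chebyshev_prune(expected=120.0, stddev=4.0, min_sup=95.0, m=5))
# 120 - 5*4 = 100 >= 95, so accepted with guarantee 1 - 1/25 = 0.96
```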

To keep the memory cost low, we use a prefix tree to maintain the itemsets, along with their count support, expected support and probabilistic support. Note that our algorithm does not store the probability density function of each itemset, because the space complexity of one such function is O(n), and many itemsets would result in massive memory usage. Since the probabilistic support is computed only once, the probability density function can be deleted as soon as the probabilistic support is obtained, which significantly improves the performance.


3. Experimental Results

We compared the performance and the accuracy with the minimum probabilistic confidence set to 0.9. Our algorithm was implemented in Python 2.7 under Windows 7 and run on an i7-4790M 3.6 GHz CPU with 4 GB of memory. Two uncertain datasets were used to evaluate our algorithm: one is GAZELLE, which contains a real e-commerce click stream; the other is the synthetic dataset T25I15D320K generated by the IBM generator. We assigned each item a probability drawn from a Gaussian distribution, which is widely accepted in current research on uncertain data [16]. The characteristics of the two datasets are shown in Table 1. Since our sampling method is a framework that can be applied on top of existing algorithms, we employed the state-of-the-art method TODIS [11] as the benchmark algorithm; that is, TODIS is equivalent to our algorithm PFIMSample with a sampling rate of 1.

Table 1. The Characteristics of Uncertain Datasets

Dataset        #Transactions   Avg. transaction length   #Items
GAZELLE        59602           3                         497
T25I15D320K    320000          26                        1000

We first ran the PFIMSample algorithm on the two datasets with different sampling rates. From Figures 2 and 3 we can see that, with the minimum support fixed, the runtime grows with the sampling rate. When the sampling rate is 0.01, the mining cost is very low; the runtime can be up to 100 times better. Furthermore, reducing the minimum support produces the same performance trend, which is more significant on the T25I15D320K dataset, because T25I15D320K is denser than GAZELLE.

Figure 2. Runtime vs. sampling rate (GAZELLE). Figure 3. Runtime vs. sampling rate (T25I15D320K).

Figures 4 and 5 compare the memory usage at different sampling rates. We can see that the memory cost grows only slightly as the sampling rate increases. On the other hand, the memory usage is not related to the minimum support, because we used a relative minimum support. Moreover, the memory usage is low when mining the sparse dataset GAZELLE.

Figure 4. Memory cost vs. sampling rate (GAZELLE). Figure 5. Memory cost vs. sampling rate (T25I15D320K).

3.3. Precision and Recall

We used precision and recall to evaluate the accuracy of our algorithm. For the original mining result D and the result D′ obtained with sampling, we define Precision = |D ∩ D′|/|D| and Recall = |D ∩ D′|/|D′|. The larger the precision and the recall, the higher the accuracy of our algorithm. Table 2 shows the precision and the recall at different sampling rates over the two datasets. As can be seen, when the minimum support is 0.08, our algorithm achieves 100% accuracy; it also achieves more than 90% accuracy on T25I15D320K in most cases. In addition, the accuracy of our algorithm is not related to the sampling rate, since we use random samples.
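The two measures, as defined above, can be computed directly from the result sets; the sets in the example are invented:

```python
def precision_recall(D, D_prime):
    """Precision and recall of a sampled result set D' against the
    original result set D, following the definitions in the text."""
    inter = len(D & D_prime)
    precision = inter / len(D)      # |D ∩ D'| / |D|, as defined above
    recall = inter / len(D_prime)   # |D ∩ D'| / |D'|
    return precision, recall

D = {frozenset({"A"}), frozenset({"B"}), frozenset({"A", "B"})}
Dp = {frozenset({"A"}), frozenset({"A", "B"})}
print(precision_recall(D, Dp))      # (0.666..., 1.0)
```

Note that these definitions mirror the paper's own; the conventional definitions swap the two denominators.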

Table 2. Precision and Recall

Dataset              Support   Precision   Recall
GAZELLE (0.08)       0.01      100%        100%
                     0.02      100%        100%
                     0.03      100%        100%
                     0.04      100%        100%
                     0.05      100%        100%
                     0.06      100%        100%
                     0.07      100%        100%
                     0.08      100%        100%
                     0.09      100%        100%
T25I15D320K (0.08)   0.01      87%         95%
                     0.02      91%         95%
                     0.03      95%         92%
                     0.04      95%         92%
                     0.05      91%         91%
                     0.07      95%         88%
                     0.08      100%        96%
                     0.09      91%         95%

4. Conclusions

This paper made a study on probabilistic frequent itemset mining over uncertain

databases. The proposed algorithm PFIMSample employed the Chebyshev inequation

to estimate the count of frequent items, and thus can partly reduce the computing cost

from O(n(logn)2) to O(n). Moreover, we used the sampling method to improve the

performance with a high accuracy. Our extensive experimental results over two

datasets showed that our algorithm was effective and efficient.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (61100112, 61309030), the National Social Science Foundation of China (13AXW010), and the Discipline Construction Foundation of the Central University of Finance and Economics.

References

[1] B. Babcock, S. Babu, M. Datar, et al. Models and issues in data stream systems. Proceedings of PODS,

2002.

[2] J. Han, H. Cheng, D. Xin, et al. Frequent pattern mining: current status and future directions. Data Mining and Knowledge Discovery, 15(2007), 55-86.

[3] J. Chen, Y. Ke, W. Ng. A survey on algorithms for mining frequent itemsets over data streams.

Knowledge and Information System, 16(2008), 1-27.

[4] C. C. Aggarwal, P. S. Yu. A survey of uncertain data algorithms and applications. IEEE Transaction on

Knowledge and Data Engineering, 21(2009), 609-623.

[5] A. Y. Zhou, C. Q. Jin, G. R. Wang, et al. A survey on the management of uncertain data. Chinese

Journal of Computers, 31(2009).

[6] J. Z. Li, G. Yu, A. Y. Zhou. Challenge of uncertain data management. Chinese Computer

Communications, 5(2009).

[7] J. Xu, N. Li, X. J. Mao, et al. Efficient probabilistic frequent itemsets mining in big sparse uncertain data.

Proceedings of PRICAI, 2014.

[8] Y. Konzawa, T. Amagasa, H. Kitagawa. Probabilistic frequent itemset mining on a gpu cluster. IEICE

Transactions of Information and Systems, E97-D(2014) , 779-789.

[9] Q. Zhang, F. Li, K. Yi, Finding frequent items in probabilistic data. Proceedings of SIGMOD, 2008

[10] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent itemset mining in uncertain

databases. Proceedings of SIGKDD, 2009.

[11] L. Sun, R. Cheng, D. W. Cheung, et al, Mining uncertain data with probabilistic guarantees.

Proceedings of SIGKDD, 2010.

[12] T. Bernecker, H. P. Kriegel, M. Renz, et al, Probabilistic frequent pattern growth for itemset mining in

uncertain databases. Proceedings of SSDM, 2012.

[13] L. Wang, R. Cheng, S. D. Lee, et al, Accelerating probabilistic frequent itemset mining: a model-based

approach. Proceedings of CIKM, 2010.

[14] L. Wang, D. Cheung, R. Cheng, et al. Efficient mining of frequent item sets on large uncertain

databases. IEEE Transaction on Knowledge and Data Engineering, 24(2012), 2170-2183.

[15] … data. Proceedings of ICDM, 2010.

[16] Y. Tong, L. Chen, Y. Cheng, et al. Mining frequent itemsets over uncertain databases. Proceedings of

VLDB, 2012.

[17] P. Tang, E. A. Peterson. Mining probabilistic frequent closed itemsets in uncertain databases.

Proceedings of ACMSE, 2011.

[18] E. A. Peterson, P. Tang. Fast approximation of probabilistic frequent closed itemsets. Proceedings of

ACMSE, 2012.

[19] Y. Tong, L. Chen, B. Ding, Discovering threshold-based frequent closed itemsets over probabilistic

data. Proceedings of ICDE, 2012.

[20] C. Liu, L. Chen, C. Zhang. Mining probabilistic representative frequent patterns from uncertain data.

Proceedings of SDM, 2013.

[21] C. Liu, L. Chen, C. Zhang. Summarizing probabilistic frequent patterns: a fast approach. Proceedings

of SIGKDD, 2013.

[22] B. Pei, S. Zhao, H. Chen, et al. FARP: Mining fuzzy association rules from a probabilistic quantitative

database. Information Sciences, 237(2013), 242-260.

[23] P. Y. Tang and E. A. Peterson. Mining probabilistic frequent closed itemsets in uncertain databases,

Proceedings of ASC, 2011.

Fuzzy Systems and Data Mining II 167

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-167

Priority Guaranteed and Energy Efficient Routing in Data Center Networks

Hu-Yin ZHANG a,b,1, Jing WANG b, Long QIAN b and Jin-Cai ZHOU b

a Shenzhen Research Institute of Wuhan University, Shenzhen, Guangdong, China

b School of Computer Science, Wuhan University, Wuhan, Hubei, China

…considerably large slice of operational expenses. Many energy-saving strategies have been proposed; most of them design the energy-saving model from the point of view of bandwidth or throughput. This paper provides a new perspective on energy saving in data center networks, whose basic idea is to ensure that higher-priority traffic gets shorter routes. It combines bandwidth constraints with the aim of energy saving, and keeps a balance between energy consumption and traffic-priority demand. Simulations show that our routing algorithm can effectively reduce the transmission delay of higher-priority traffic and reduce the power consumption of data center networks.

Introduction

With the change of the traffic model, large-scale data center networks (DCNs) are often deployed in the Fat-Tree architecture as a non-blocking network, which has over-provisioned network resources and inefficient power usage. Thus, the goal of network power conservation is to make the power consumption of networking devices proportional to the traffic load [1]. Many researchers have investigated energy saving for DCNs from different aspects. The article [2] proposed energy-saving routing based on ElasticTree.

In [3], the authors proposed a data center energy-efficient network-aware scheduling.

The article [4] presented an energy efficient routing algorithm with the network load

balancing and energy saving. In [5], the authors proposed a bandwidth guaranteed

energy efficient DCNs scheme from the perspective of routing and flow scheduling.

The article [6] aimed to reduce the power consumption of DCNs from the routing

perspective while meeting the throughput performance requirement.

In DCNs, the network delay is also an important parameter that reflects network performance [7]. Traffic with high priority usually has a strict transmission-delay demand. In this paper, a new energy-efficient routing algorithm is proposed that takes traffic priority into consideration. Its basic idea is to make sure that higher-priority traffic gets shorter routes, to combine this with bandwidth constraints, and to balance energy consumption against traffic-priority demand.

1 Corresponding Author: Hu-Yin ZHANG, Shenzhen Research Institute of Wuhan University, Shenzhen, China; School of Computer Science, Wuhan University, Wuhan, China; E-mail: zhy2536@whu.edu.cn.

168 H.-Y. Zhang et al. / Priority Guaranteed and Energy Efﬁcient Routing in Data Center Networks

Figure 1 shows the Fat-Tree architecture, which contains three tiers of switch modules: the core switches Ck, the aggregation switches Ak and the edge switches Sk, where the index runs from 1 to n, the number of switches in the tier. This is conventionally denoted as a v(c, a, s) network. In order to achieve energy efficiency, we need to use as few links as possible, so that as many switches as possible can work in sleep mode. Eq. (1) describes the minimum link number, which is intended to use a minimum number of switches in the v(c, a, s) network.

R_w = Σ_k R(C_k, A_k, S_k),   i = 1, 2, …, n    (2)

R_w in Eq. (1) is an array: array R sums the nodes and is then given a linear transform in Eq. (2). It expresses the number of active switches in each tier. In array R, Ck, Ak and Sk represent the names of the active switches in each tier, respectively. The problem is how to obtain the optimal array R that guarantees the priorities of the traffic and establishes the fewest links.

In order to establish the fewest links, the bandwidth utilization of the used links needs to be maximized. In array R, higher-priority traffic chooses the shorter routing path. However, we may encounter the problem shown in Figure 2. When traffic 1 (higher priority) uses the path A→B→E, traffic 2 and 3 have no path to use, and failure bandwidth (FB) occurs. If we analyze the traffic requirements globally, optimize the routing and move traffic 1 onto the path A→D→E, then traffic 2 and 3 can both have their paths, and the FB is 0. Although the higher-priority traffic chooses a new route, the number of forwards does not increase, so we can regard this change as causing no increase in transmission delay.


The goal of our priority guaranteed and energy efficient routing (PER) algorithm is to eliminate FB while guaranteeing priority, and to obtain the optimal array R that gives the DCN topology the maximum bandwidth utilization with the minimum number of links; the idle switches are then turned into sleep mode for energy saving.

2. PER Algorithm

The scheme is to compute transmission paths for all flows in the DCN topology and to reduce the energy consumption of the switches in this topology as much as possible.


As shown in Figure 3, the PER algorithm works in the following steps:

• Step 1: according to the priority level, the highest-priority traffic obtains the shortest routing configuration.

• Step 2: update the priority parameter, and then configure the lower-priority traffic.

• Step 3: check whether there is any failure bandwidth; if yes, jump to Step 4, else jump to Step 6.

• Step 4: run the priority-guaranteed optimization algorithm.

• Step 5: is the optimized new route of the higher-priority traffic longer than the existing one? If yes, the higher-priority traffic keeps its existing route and the lower-priority traffic takes the longer path; if no, the optimized new route is adopted. Then repeat Step 3.

• Step 6: judge whether all configurations are completed; if not, repeat Step 3. If all configurations are completed, generate the energy-efficient routing topology, and then turn the idle switches into sleep mode.
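The control flow above can be sketched as follows. This is a rough, runnable illustration of Steps 1-3 only (the re-optimization of Steps 4-5 is omitted): flows are routed in descending priority over links with enough residual bandwidth, and a flow with no feasible path is reported as failure bandwidth. The node names, link capacities and flow tuples are invented:

```python
from collections import deque

def shortest_path(links, src, dst, demand):
    """BFS shortest path using only links with enough residual bandwidth."""
    prev = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path, node = [], dst
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for (a, b), cap in links.items():
            if a == u and b not in prev and cap >= demand:
                prev[b] = u
                q.append(b)
    return None

def per_sketch(links, flows):
    """Steps 1-3 of PER as a sketch: place flows in descending priority on
    shortest feasible paths, consuming link capacity; an unroutable flow
    is failure bandwidth (which would trigger Steps 4-5)."""
    routes, fb = {}, []
    for prio, src, dst, demand in sorted(flows, reverse=True):
        path = shortest_path(links, src, dst, demand)
        if path is None:
            fb.append((prio, src, dst))
            continue
        for a, b in zip(path, path[1:]):
            links[(a, b)] -= demand
        routes[(prio, src, dst)] = path
    return routes, fb

# The Figure 2 scenario: two 2-hop paths from A to E, three equal flows.
links = {("A", "B"): 5, ("B", "E"): 5, ("A", "D"): 5, ("D", "E"): 5}
flows = [(3, "A", "E", 5), (2, "A", "E", 5), (1, "A", "E", 5)]
routes, fb = per_sketch(links, flows)
print(routes)   # flows 3 and 2 are placed
print(fb)       # flow 1 has failure bandwidth
```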

This algorithm selects route paths for flows with different priorities. Each selected path eliminates failure bandwidth and makes the link bandwidth utilization as high as possible. If there are many available paths for a flow, the problem can be converted to an undirected graph G = (S, E), where the weight of a link is its remaining bandwidth. We need to find the shortest path from the source node to the destination node while maximizing link bandwidth utilization; hence, the more bandwidth left on a link, the larger its weight. We use the following path-selection rules:

• Rule 1: set an accessorial vector SA, whose component SA[i][j] represents the weight of the link from source node S_i to node A_j.

• Rule 2: the state of SA: if there is a link from node S_i to A_j, SA[i][j] represents the weight of this link; if there is no link, SA[i][j] = −1. We choose (S_i, A_j) such that SA[i][j] = Min{ SA[i][j] | A_j ∈ V }. If there are links with the same weight, we choose the node with the minimum subscript.

• Rule 3: set another accessorial vector AC, whose component AC[j][k] represents the weight of the link from node A_j to C_k.

• Rule 4: the state of AC: if there is a link from node A_j to C_k, AC[j][k] represents the weight of this link; if there is no link, AC[j][k] = −1. We choose (A_j, C_k) such that AC[j][k] = Min{ AC[j][k] | C_k ∈ V }. If there are links with the same weight, we choose the node with the minimum subscript.

• Rule 5: store the nodes selected by Rules 1 to 4.

• Rule 6: if there is any failure bandwidth, all flows on the links related to the failure bandwidth are reconfigured according to Rules 1 to 4. We first select the route path for the flow that caused the failure bandwidth, and then for the other flows in descending order of priority.

• Rule 7: all the nodes included in the routes generated by Rule 6 are stored in the accessorial array D; we then compare arrays D and R.

• Rule 8: if the higher-priority traffic uses more nodes in array D than in R, the higher-priority traffic preserves its status in array R and R is copied to D; otherwise the higher-priority traffic adopts the status in array D and D is copied to R.
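Rules 1-4 boil down to a best-fit choice among feasible next-hop links. A minimal sketch, assuming the weights are residual bandwidths with −1 marking a missing link (the function name and demand parameter are our own additions for illustration):

```python
def best_fit_next_hop(weights, demand):
    """Among next-hop links with enough residual bandwidth (weight >= demand;
    -1 means no link), pick the one with the minimum weight so that link
    utilization is maximized; ties go to the smallest subscript, as in
    Rules 2 and 4. Returns the chosen index, or None if no link fits."""
    best_j, best_w = None, None
    for j, w in enumerate(weights):
        if w == -1 or w < demand:
            continue                         # missing or insufficient link
        if best_w is None or w < best_w:     # strict < keeps smallest index on ties
            best_j, best_w = j, w
    return best_j

print(best_fit_next_hop([-1, 7, 3, 3, 10], demand=2))   # 2: weight 3, first on tie
```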


When the path selections for all traffic are completed, the higher-priority flows are configured with fewer routing nodes, and the array R stores the switch flag nodes used in the links. Therefore, we can put the idle switches to sleep in order to save data center energy.

3. Evaluations

We evaluate our PER algorithm using Fat-Tree topologies on the Matlab 7.11 platform, and compare the results with random routing without priority guarantees. We use a simulation model with the network v(16, 32, 32), which includes eighty switch nodes. The available bandwidth of each link is randomly generated and does not exceed 10M. We select twelve traffic flows and set their priorities and flow capacities randomly. To simplify the simulation, we assume that the data processing abilities of each layer are the same, and we set the per-hop transmission delay randomly between 30 and 50 ms.

Figure 4 shows the transmission delay of the twelve traffic flows with different priorities; the values are averaged over three runs. The dotted line with blocks shows the transmission delay of random routing, and the solid line with dots represents that of the PER algorithm. From this figure we can see that the transmission delay of the PER algorithm is lower than that of random routing, and that its fluctuation is small.

Figure 5. Active switches (0-100) for the Fat-Tree, Random and PER schemes.


Figure 5 shows the energy consumption of three kinds of topologies based on the network v(16, 32, 32). In the Fat-Tree topology, all eighty switches remain active even if some of them carry no traffic. With random routing, almost half of the switches are used, so nearly half of them can be turned into sleep mode. With PER, because of the increased link-bandwidth utilization, about 75% of the switches can be turned into sleep mode, which greatly reduces the energy consumption.

4. Conclusion

In this paper, we address the power-saving problem in DCNs from a routing perspective. We establish the network model and introduce the priority guaranteed and energy efficient routing problem. Then we propose a routing algorithm that improves the energy consumption of DCNs while guaranteeing traffic priorities. The evaluation results demonstrate that, compared with random routing, our algorithm effectively reduces both the transmission delay of higher-priority traffic and the power consumption of DCNs.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under

Grant No. 61540059, and the Shenzhen science and technology projects under Grant

No. JCYJ20140603152449639.

References

[1] L.A. Barroso, U. Hölzle, The case for energy-proportional computing, Computer, 40(2007):33–37.

[2] B. Heller, S. Seetharaman, P. Mahadevan, et al., ElasticTree: Saving energy in data center networks.

Proc of the 7th USENIX Symp on Networked Systems Design and Implementation (NSDI 10). New

York: ACM, 2010:249–264.

[3] D Kliazovich, P Bouvry, S.U. Khan, DENS: Data Center Energy-Efficient Network-Aware Scheduling,

Cluster Computing, 16(2013):65–75.

[4] S Dong, R Li, X Li, Energy Efficient Routing Algorithm Based on Software Defined Data Center

Network, Journal of Computer Research and Development, 52(2015): 806–812.

[5] T Wang, B Qin, Z Su, Y Xia, M Hamdi, et al., Towards bandwidth guaranteed energy efficient data

center networking, Journal of Cloud Computing, 4(2015):1–15.

[6] M Xu, Y Shang, D Li, X Wang, Greening data center networks with throughput-guaranteed power-aware

routing, Computer Networks, 57(2013):2880–2899.

[7] W. Lao, Z. Li, Y. Bai, Methodology and Realization of Measure on Network Performance Parameter,

Computer Applications & Software, 21(2004).

Fuzzy Systems and Data Mining II 173

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-173

Yield Rate Prediction of a Dynamic Random Access Memory Manufacturing Process Using an Artificial Neural Network

Chun-Wei CHANG and Shin-Yeu LIN1

Department of Electrical Engineering, Chang Gung University, Taiwan

For the dynamic random access memory (DRAM) manufacturing process, we propose a yield-rate predictor using an artificial neural network (ANN). The inputs to the ANN are the machining parameters of each step of the manufacturing process for a DRAM wafer, and the output is the yield rate of the corresponding wafer. In this study, a three-layer feed-forward back-propagation ANN is used and trained with input-output pairs of data collected from a real manufacturing process. We have tested the proposed ANN on five cases, each with a different size of training data set. The test results show that the average of the absolute prediction errors in all five cases is very small, and that as the size of the training data set increases, the prediction accuracy increases and the associated standard deviation decreases.

Keywords. data mining, DRAM, yield analysis, artificial neural network, fault detection.

Introduction

Due to the lengthy manufacturing process of a dynamic random access memory (DRAM) chip [1-2], it would be beneficial if any manufacturing error could be detected before the whole process is completed. To do so, on-line machine-fault

detection should be performed to prevent further damage to the wafers in process [3-5].

In general, a yield rate that is much lower than average indicates a possible fault with

high probability. Therefore, an on-line prediction of the yield rate would be helpful to

the machine-fault detection.

For any integrated circuitry, the machining parameters in each step of the

manufacturing process are usually specified. However, no physical or mathematical

model exists to relate the machining parameters with the yield rate. To cope with this

modeless problem, data mining technique can be used to investigate this relationship by

extracting the information from the manufacturing data [6-7]. Therefore, in this paper,

we propose using an ANN to build up the functional relationship between the

machining parameters and the yield rate, and use the constructed ANN to serve as a

yield rate predictor [8-10]. The training algorithm for the proposed ANN will be

introduced, and the ANN will be trained by real manufacturing data. To investigate the

1 Corresponding Author: Chun-Wei CHANG, Department of Electrical Engineering, Chang Gung University, Kwei-Shan, Tao-Yuan 333, Taiwan; E-mail: shinylin@mail.cgu.edu.tw.

174 C.-W. Chang and S.-Y. Lin / Yield Rate Prediction of a DRAM Manufacturing Process

effect of the size of the data set on training the ANN, the prediction accuracy of the ANN trained with different sizes of data sets is also investigated in this paper.

This paper is organized in the following manner. Section 1 presents the proposed

ANN. Section 2 presents the test results of the proposed ANN. Section 3 draws a

conclusion.

1. Construction of ANN

There are two parts for constructing an ANN as a yield rate predictor. The first part is

to collect the data of machining parameters and the corresponding yield rate of DRAM

wafers to serve as a training data set. The second part is using the training data set to

train the ANN.

There are hundreds to thousands of processing steps for manufacturing a DRAM chip.

Each DRAM wafer may repeatedly visit the same machine but with different setup of

machining parameters. To train the ANN, a pair of input and output data for each wafer

is collected. The collected input data are the machining parameters, which consist of

the following types, average thickness of oxide coating, range of thickness of oxide

coating, average Nitride thickness, range of Nitride thickness, polish time of chemical

mechanical planarization, photo dose, photo focus, etc. The output data is the yield rate

of the wafer, e. g. 90%. Therefore, the input-output pair of data is formed by a multiple

input data and a single output data, and the collected input-output pairs of data will

serve as the training data set for the ANN.

Let x = [x₁, …, x_N]ᵀ, where x₁, …, x_N represent the N machining parameters. Let y(x) represent the yield rate of the wafer, which is a function of the vector of machining parameters x. Let M denote the number of input-output pairs in the collected training data set. We employ a feed-forward back-propagation ANN that consists of an input layer, one hidden layer and an output layer [11]. Figure 1 shows the three-layer ANN consisting of N input neurons, q hidden-layer neurons and one output neuron, where ω_{i,j}, i = 1, …, q, j = 1, …, N, and β_k, k = 1, …, q, represent the arc weights.

The N neurons in the input layer correspond to x, and the single output neuron is for y(x). The input-layer neurons directly distribute each component of x to the neurons of the hidden layer. The hyperbolic tangent sigmoid function shown in Eq. (1) is used as the activation function of the hidden-layer neurons:

tanh(x) = (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ)    (1)



Figure 1. A three-layer feed-forward back propagation ANN

The procedure to train the ANN with the training data set, i.e. the M input-output pairs of collected data, is stated as follows. For a given input xᵢ to the ANN of Figure 1, let the corresponding output of the ANN be denoted by ŷ(xᵢ | ω, β), which is calculated by the following formula:

ŷ(xᵢ | ω, β) = Σ_{ℓ=1}^{q} β_ℓ tanh( Σ_{j=1}^{N} ω_{ℓ,j} x_{ij} )    (2)

where ω and β collect the arc weights of the ANN, and x_{ij} is the jth component of xᵢ. The training problem for the considered ANN is to find the vectors of arc weights ω and β that minimize the following mean square error (MSE) problem:

min_{ω,β} (1/M) Σ_{i=1}^{M} { y(xᵢ) − ŷ(xᵢ | ω, β) }²    (3)

We employ the Levenberg-Marquardt algorithm [12] as the iterative training algorithm to solve (3). Training stops when either of the following two conditions occurs: (i) the sum of the mean squared errors, i.e. the objective value of the MSE problem, is smaller than 0.01, or (ii) the number of epochs exceeds 1000.
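As an illustration of Eqs. (2)-(3), the sketch below implements the forward pass and fits the arc weights by plain stochastic gradient descent, a simple stand-in for the Levenberg-Marquardt algorithm used in the paper; the toy data set and all hyperparameters are invented:

```python
import math
import random

def forward(x, w, b):
    """Eq. (2): y_hat = sum_l b[l] * tanh(sum_j w[l][j] * x[j])."""
    return sum(b[l] * math.tanh(sum(w[l][j] * x[j] for j in range(len(x))))
               for l in range(len(b)))

def train(data, q, epochs=2000, lr=0.05, seed=1):
    """Minimize the MSE of Eq. (3) by per-sample gradient descent."""
    rng = random.Random(seed)
    n = len(data[0][0])
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(q)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(q)]
    for _ in range(epochs):
        for x, y in data:
            h = [math.tanh(sum(w[l][j] * x[j] for j in range(n))) for l in range(q)]
            err = sum(b[l] * h[l] for l in range(q)) - y   # y_hat - y
            for l in range(q):
                grad_w = err * b[l] * (1.0 - h[l] ** 2)    # backprop through tanh
                b[l] -= lr * err * h[l]
                for j in range(n):
                    w[l][j] -= lr * grad_w * x[j]
    return w, b

# Toy "yield rate" data: a smooth one-dimensional target the network can fit.
data = [([x / 5.0], 0.9 * math.tanh(x / 5.0)) for x in range(-5, 6)]
w, b = train(data, q=3)
mse = sum((forward(x, w, b) - y) ** 2 for x, y in data) / len(data)
print(mse)   # small after training
```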

2. Test Results

In this section, the prediction accuracy of the trained ANN is investigated. In addition,

the relationship between the size of the training data set and the prediction accuracy is also investigated. Therefore, five test cases with various sizes of training data sets are set up. In all test cases, the number of machining parameters is N = 78. The employed three-layer ANN consists of 78 input neurons, 150 (= q) hidden-layer neurons and one


output neuron. Exceeding 1000 epochs is chosen as the termination criterion for training the ANN. The value of M, which is the size, i.e. the number of

input-output pairs of training data set, is set to M =50 for case 1, 150 for case 2, 400

for case 3, 700 for case 4 and 1000 for case 5. For each case, we collect 2M pairs of

input-output data from real manufacturing process and separate them into two sets, the

training and testing data sets. The training data set is used to train the employed

three-layer ANN, and the testing data set is utilized to test the prediction accuracy of

the trained ANN.

The prediction accuracy of the trained ANN is defined by the average of the percentage of the absolute error between the actual and the predicted yield rate, which is denoted by e̅ and can be calculated by the following equation:

e̅ = (1/M) Σ_{i=1}^{M} | (y(xᵢ) − ŷ(xᵢ | ω, β)) / y(xᵢ) | × 100%    (4)

The associated standard deviation is

σ_e = sqrt( (1/M) Σ_{i=1}^{M} (eᵢ − e̅)² )    (5)

where eᵢ = | (y(xᵢ) − ŷ(xᵢ | ω, β)) / y(xᵢ) | × 100%.
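Eqs. (4) and (5) can be computed directly from the test pairs; a small sketch with made-up yield rates:

```python
import math

def prediction_error_stats(actual, predicted):
    """Mean absolute percentage error (Eq. 4) and its standard
    deviation (Eq. 5) over the test pairs."""
    errs = [abs((y - yh) / y) * 100.0 for y, yh in zip(actual, predicted)]
    e = sum(errs) / len(errs)
    sigma = math.sqrt(sum((ei - e) ** 2 for ei in errs) / len(errs))
    return e, sigma

# Invented actual vs. predicted yield rates (percent):
e, s = prediction_error_stats([90.0, 80.0, 85.0], [88.2, 84.0, 85.0])
print(round(e, 2), round(s, 2))
```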

For each of the five cases, after the ANN is trained with the corresponding training data set, we test it with the corresponding testing data set. The prediction accuracy e̅ of the trained ANN in all five cases is presented in Table 1, and the associated standard deviation σ_e is reported in Table 2. From Table 1, we see that e̅ = 4.72 for case 5, and that as the size of the training data set increases, the prediction accuracy increases. From Table 2, we see that σ_e = 4.73 for case 5, and that as the size of the training data set increases, σ_e decreases, which implies that the prediction accuracy becomes more stable. Therefore, from the results presented in Tables 1 and 2, we see that a larger training data set enhances the prediction accuracy of the ANN, up to the size that causes overtraining. To give more insight into the prediction results of the trained ANN, a histogram of the number of tested input-output pairs with respect to the percentage of prediction error, defined as (y(xᵢ) − ŷ(xᵢ | ω, β)) / y(xᵢ) × 100%, is presented in Figure 2.

Table 1. Prediction accuracy of the trained ANN for the five cases

case 1 2 3 4 5

Size, M 50 150 400 700 1000


Table 2. Standard deviation of the prediction accuracy of the trained ANN for the five cases

case        1      2     3     4     5
Size, M    50    150   400   700  1000
σ_e     36.00  15.13  9.08  5.55  4.73

Figure 2. Histogram of the number of tested input-output pairs versus the percentage of prediction error (y-axis: number of input-output pairs, 0 to 180; x-axis: percentage of prediction error, -30% to +30%).

From Figure 2, we see that most of the tested input-output pairs have very small prediction errors, which confirms that the proposed ANN can serve as a good yield rate predictor for the DRAM manufacturing process.

3. Conclusion

In this paper, a three-layer feed-forward ANN trained with back-propagation is presented and used as a predictor for the yield rate of a DRAM manufacturing process. The proposed ANN is trained and tested using real manufacturing data. The test results reveal that the prediction errors are very small and that, as the size of the training data set increases, the prediction accuracy of the ANN increases while the associated standard deviation decreases. Therefore, the presented ANN is qualified to serve as a yield rate predictor for the future purpose of fault detection.

Acknowledgments

This research work is supported in part by Chang Gung Memorial Hospital under grant

BMRP29.


References

[1] K. Chandrasekar, S. Goossens, C. Weis, M. Koedam, B. Akesson, N. Wehn and K. Goossens, Exploiting

expendable process-margins in DRAMs for run-time performance optimization, Design, Automation &

Test in Europe Conference & Exhibition, 2014, 1-6.

[2] P. S. Huang, M. Y. Tsai, C. Y. Huang, P. C. Lin, L. Huang, M. Chang, S. Shih and J. P. Lin, Warpage,

stresses and KOZ of 3D TSV DRAM package during manufacturing process, 14th International

Conference on Electronic Materials and Packaging, 2012, 1-5.

[3] S. Hamdioui, M. Taouil and N. Z. Haron, Testing open defects in memristor-based memories, IEEE

Trans. on Computers, 64(2015), 247-259.

[4] R. Guldi, J. Watts, S. PapaRao, D. Catlett, J. Montgomery and T. Saeki, Analysis and modeling of

systematic and defect related yield issues during early development of a new technology, Advanced

Semiconductor Manufacturing Conference and Workshop, 4(1998), 7-12.

[5] L. Shen and B. F. Cockburn, An optimal march test for locating faults in DRAMs, Records of the 1993

IEEE International Workshop on Memory Testing, 1993, 61-66.

[6] A. Purwar and S. K. Singh, Issues in data mining: a comprehensive survey, IEEE International

Conference on Computational Intelligence and Computing Research, 2014, 1-6.

[7] J. Han and M. Kamber. Data mining concepts and techniques. 2nd ed. Morgan Kaufmann Publishers,

2006.

[8] B. Dengiz, C. Alabas-Uslu and O. Dengiz, Optimization of manufacturing systems using a neural

network metamodel with a new training approach, Journal of the Operational Research Society,

60(2009), 1191-1197.

[9] N. Alali, M. R. Pishvaie and V. Taghikhani, Neural network meta-modeling of steam assisted gravity

drainage oil recovery processes, Journal of Chemistry & Chemical Engineering, 29(2010), 109-122.

[10] T. Chen, H. Chen and R. Liu, Approximation capability in C(Rn) by multilayer feed-forward networks

and related problems, IEEE Transactions on Neural Networks, 6(1995), 25-30.

[11] J. A. Anderson. An Introduction to Neural Networks. MIT Press, Boston, USA, 1995.
[12] B. M. Wilamowski and H. Yu, Improved computation for Levenberg-Marquardt training, IEEE Trans. on Neural Networks, 21(2010), 930-937.

Fuzzy Systems and Data Mining II 179

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-179

Mining Probabilistic Frequent Itemsets with Exact Methods

Hai-Feng LI 1 and Yue WANG

School of Information, Central University of Finance and Economics,

Beijing, China, 100081

lenging problem. The state-of-the-art algorithm requires O(n log² n) time to conduct the mining. We focus on this problem and design a framework that can discover the probabilistic frequent itemsets with traditional exact frequent itemset mining methods; thus, the time complexity can be reduced to O(n). In this framework, we supply a minimum confidence to convert the uncertain database to an exact database; furthermore, a sampling method is used to find a reasonable minimum confidence so that the accuracy is guaranteed. Our experiments show that our method significantly outperforms the existing algorithm.

Keywords. Uncertain Database; Exact Database; Probabilistic Frequent Itemset

Mining; Exact Frequent Itemset Mining; Data Mining

Introduction

Frequent itemset mining is one of the important techniques in data mining; it discovers patterns from databases to support commercial decisions. Recently, new applications have been developed for web sites, the Internet and wireless networks, which generate much uncertain data; that is, each data item is attached with a probability indicating its existence [3]. Table 1 shows an example of an uncertain database with 4 items {a, b, c, d}. In such cases, the traditional exact frequent itemset mining algorithms studied in recent years [1] are no longer effective, since the new data feature brings new challenges; thus, new methods need to be designed to handle this data environment.

The existing uncertain frequent itemset mining methods can be split into two categories: one is based on the expected support [2]; the other discovers the probabilistic frequent itemsets according to the definition of probabilistic support [4]. The probabilistic frequent itemsets, in comparison to the expected frequent itemsets, better represent the probability of the itemsets; thus, this mining problem has received more attention. Nevertheless, the mining is hard, since converting the uncertain database to an exact database is an NP-hard problem. The state-of-the-art method requires O(n log² n) time and O(n) space to compute the probabilistic support of a single itemset. Clearly, when the database size n is large, the mining cost is huge.

1 Corresponding Author: Hai-Feng Li, School of Information, Central University of Finance and Economics,

180 H.-F. Li and Y. Wang / Mining Probabilistic Frequent Itemsets with Exact Methods

Table 1. An example of an uncertain database

ID  Uncertain Transaction
1   a:0.6 b:0.4 d:1
2   a:0.8 c:0.6
3   b:1 c:0.9
4   a:0.8 b:0.8 c:0.6 d:0.6
5   d:0.7

In this paper, we focus on this mining problem and present an approximate method to convert the uncertain database to an exact database so that the runtime can be reduced. The rest of the paper is organized as follows. Section 1 presents the preliminaries and the challenge of the problem. Section 2 introduces our method. Section 3 evaluates the performance with our experimental results. Finally, Section 4 concludes the paper.

1.1. Preliminaries

Let Γ be the set of items; |Γ| denotes the size of Γ. A subset X ⊆ Γ is an itemset. Suppose each item x_t (0 < t ≤ |X|) in X is annotated with an existence probability p(x_t); we call X an uncertain itemset, denoted X = {x_1, p(x_1); x_2, p(x_2); ...; x_|X|, p(x_|X|)}, and the probability of X is p(X) = Π_{i=1}^{|X|} p(x_i). An uncertain transaction UT is an uncertain itemset with an ID. An uncertain database UD is a collection of uncertain transactions UT_s (0 < s ≤ |UD|). If X ∈ UT_s, we use p(X, UT_s) to denote the probability that X occurs in UT_s. As a result, in UD, X occurs t times with probability

p_t(X, UD) = Σ_{S ⊆ UD, |S| = t} Π_{UT_s ∈ S} p(X, UT_s) · Π_{UT_s ∉ S} (1 − p(X, UT_s)).

The list {p_1(X, UD), p_2(X, UD), ..., p_|UD|(X, UD)} is the probability density function. Given an itemset X, the number of times it occurs in an uncertain database is called the support of X, denoted Λ(X). Consequently, the probability that X occurs at least i times, P(Λ(X) ≥ i), is the sum of {p_i(X, UD), p_{i+1}(X, UD), ..., p_|UD|(X, UD)}.

Given a minimum probabilistic confidence τ and an uncertain database UD, itemset X is a probabilistic frequent itemset iff its probabilistic support Λ_τ^P(X) ≥ λ, in which Λ_τ^P(X) is the maximal support whose probability is no less than τ:

Λ_τ^P(X) = max{ i | P(Λ(X) ≥ i) ≥ τ }   (1)

In this paper, we discover all the probabilistic frequent itemsets from the uncertain database for a given minimum support λ and minimum probabilistic confidence τ.
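For intuition, the probabilistic support of Eq. (1) can be computed exactly by a small dynamic program over the per-transaction occurrence probabilities of an itemset (a sketch of the quadratic-time baseline the paper cites, not of the authors' linear-time framework; function names are illustrative):

```python
def support_distribution(probs):
    """DP over transactions: dist[t] = probability that the itemset occurs exactly t times."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for t, q in enumerate(dist):
            new[t] += q * (1 - p)      # itemset absent from this transaction
            new[t + 1] += q * p        # itemset present in this transaction
        dist = new
    return dist

def probabilistic_support(probs, tau):
    """Largest i such that P(support >= i) >= tau, i.e. Eq. (1)."""
    dist = support_distribution(probs)
    best, tail = 0, sum(dist)          # tail = P(support >= 0) = 1
    for i in range(1, len(dist)):
        tail -= dist[i - 1]            # now tail = P(support >= i)
        if tail >= tau:
            best = i
    return best
```

For probabilities [0.5, 0.5] the support distribution is [0.25, 0.5, 0.25], so with τ = 0.7 the probabilistic support is 1.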


Table 2. Runtime and memory cost under different minimum probabilistic confidences τ

Dataset                                   τ:  0.009   0.09    0.9    0.1   0.01  0.001
GAZELLE (λ=0.08)      Runtime Cost (Sec)      34.5    35.4   34.8   35.2   35.6   34.7
                      Memory Cost (MB)        66.8    67.4   67.4   67.1   67.4   66.9
T25I15D320K (λ=0.1)   Runtime Cost (Sec)      1979    1931   1882   1894   1873   1989
                      Memory Cost (MB)        1275    1275   1275   1275   1275   1275

To address this problem, much research has been conducted. Zhang et al. first introduced the concept of probabilistic frequent items [4] and employed the dynamic programming (DP) technique to perform the mining, which was improved by Bernecker et al. [5] using the Apriori rule for further pruning. With this method, the time complexity is O(n²) and the space complexity is O(n). Sun et al. improved the method by regarding the probability computation as the convolution of two vectors and thus used the divide-and-conquer (DC) method [6] to conduct the mining, in which the fast Fourier transform reduces the computing complexity from O(n²) to O(n log² n). The probabilistic frequent itemset and the expected frequent itemset were proved to be related in [7] based on the standard normal distribution. Tong et al. surveyed all the methods in [8].

As can be seen, even the most efficient method for computing the probabilistic support has a significantly high cost, which reduces the effectiveness of the mining method in real applications. We develop a novel method that does not try to improve the mining algorithm itself but instead designs a framework for mining probabilistic frequent itemsets with traditional exact frequent itemset mining methods. In this framework, we build a relationship between uncertain data and exact data with a supplied parameter, called the minimum confidence.

With the minimum confidence, we can convert the uncertain database to an exact one as follows. We scan the uncertain database; if the probability of an item is smaller than the minimum confidence, we consider it as not existing in the exact database, and otherwise as existing. The reasoning behind this is an intuitive consideration: an item with a small probability contributes little to an effective probability of high occurrence counts. Once an exact database is generated, we can employ a traditional frequent itemset mining algorithm to discover the results. The pseudocode is shown in Algorithm 1. As an example, when we set the minimum confidence to 0.5, the uncertain database in Table 1 is converted to the database in Table 3, in which all the items with probability smaller than 0.5 are removed directly.

In this paper, we ignore τ for two reasons. On the one hand, Tong et al. [8] showed experimentally that τ has little impact on the mining results; we also conducted experiments with the state-of-the-art algorithm TODIS, whose results are shown in Table 2. As can be seen, when we fix the minimum support, the runtime cost and the memory cost remain almost unchanged no matter how τ changes. On the other hand, we employ a novel framework to convert the uncertain database to an exact one, over which the traditional mining methods can be used; thus τ is not useful and can be ignored accordingly.


Table 3. The Database Converted from the Uncertain Database when the Minimum Confidence is 0.5

ID Transaction

1 ad

2 ac

3 bc

4 abcd

5 d

Algorithm 1. Converting an uncertain database to an exact database

Require: UD: an initial uncertain database;
D: the converted exact database;
T: a transaction in D;
the minimum confidence;
λ: minimum support;
1: for each uncertain transaction UT_i in UD do
2:   for each uncertain item UI in UT_i do
3:     if UI.prob ≥ the minimum confidence then
4:       add UI to transaction T_i;
5:   add T_i to D;
6: perform the exact frequent itemset mining algorithm with λ;
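The conversion step of the pseudocode can be sketched as follows (a minimal illustration using the example database of Table 1; the mining step itself would be delegated to any exact frequent itemset miner):

```python
def convert(uncertain_db, min_conf):
    """Keep an item in a transaction iff its probability is at least the minimum confidence."""
    exact_db = []
    for transaction in uncertain_db:
        exact_db.append({item for item, p in transaction.items() if p >= min_conf})
    return exact_db

# the uncertain database of Table 1, as item -> probability maps
ud = [{'a': 0.6, 'b': 0.4, 'd': 1.0},
      {'a': 0.8, 'c': 0.6},
      {'b': 1.0, 'c': 0.9},
      {'a': 0.8, 'b': 0.8, 'c': 0.6, 'd': 0.6},
      {'d': 0.7}]
```

With a minimum confidence of 0.5, `convert(ud, 0.5)` reproduces the exact database of Table 3.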

Analysis: Our method is linear in the database size n; that is, the conversion from uncertain data to exact data needs O(n) time. A further advantage is that the conversion can be performed while the data is read into memory, so its cost is almost negligible. On the other hand, since the final mining is performed over the exact database, the mining speed is much improved: in comparison to mining the uncertain database directly, the per-itemset time complexity is reduced to O(n). Suppose the number of itemsets that need to be computed is m; then our method has time complexity O(mn), whereas the most efficient method that directly discovers the probabilistic frequent itemsets needs O(mn log² n). Clearly, when the database size n is large, the mining speed is improved significantly.

Even though the performance is improved, the mining results are approximate. The minimum confidence is the key parameter that determines how approximate the mining results are; consequently, how to decide it is the main remaining problem. Table 4 reports the precision and recall of our method when we set the minimum support to 0.1, 0.08 and 0.06, and the minimum confidence to 0.9, 0.8 and 0.7. As can be seen, the precision and the recall reach their highest values at a specific minimum confidence. That is to say, if we find this specific minimum confidence, the accuracy will be high.

To address this problem, we employ a sampling method to find this parameter. Before converting the uncertain database, we take samples from it to form a sub-database, which is converted first and used to determine the minimum confidence; then the mining is conducted over the entire database.


Table 4. Precision and recall of our method under different minimum supports and minimum confidences

                           Precision                Recall
Data          Minsup    0.9    0.8    0.7       0.9    0.8    0.7
GAZELLE       0.1      100%   100%   100%      100%    75%    75%
              0.08      80%   100%   100%      100%   100%   100%
              0.06      85%   100%   100%      100%    70%    70%
T25I15D320K   0.1       11%    88%   100%      100%    72%    56%
              0.08      25%    95%   100%      100%    74%    64%
              0.06      24%    98%   100%      100%    81%    70%

Table 5. Characteristics of the datasets

uncertain data   data size   avg. size   min. size   max. size   item count   mean   variance   item corr.
T25I15D320K      320,002     26          1           67          994          0.87   0.27       38
GAZELLE          59,602      3           2           268         497          0.94   0.08       166

3. Experiments

We compared our method with the state-of-the-art algorithm UMiner [6]. The minimum confidence is the main parameter in our evaluations. The method was implemented in Python 2.7 running on Microsoft Windows 7. The experimental computer has a 3.60 GHz Intel Core i7-4790M CPU and 12 GB memory. We employed 2 datasets for the evaluation: one created with the IBM data generator and one real-life dataset. The data characteristics are presented in Table 5. Given the item count u and the average transaction size v, we estimate the approximate correlation among transactions as u/v.

We present the runtime cost of our method in comparison to the UMiner algorithm. As can be seen in Figure 1 (minsup = {0.1, 0.08, 0.06}), with the minimum confidence set from 0.1 to 0.9, the runtime cost was significantly lower than that of UMiner. The smaller the minimum confidence, the higher the mining cost; thus, even at the smallest minimum confidence 0.1, the highest mining cost still achieves a hundred-fold speedup over the GAZELLE dataset and a 30-fold speedup over the T25I15D320K dataset. This shows that our method is more efficient over sparse datasets. Moreover, we present the memory cost of our method. In Figure 2, the memory cost was not affected by the minimum support. When the minimum confidence was small, the memory usage was high, but it was still smaller than that of the UMiner algorithm.

4. Conclusions

In this paper, we studied probabilistic frequent itemset mining over uncertain databases and proposed a novel method. We did not directly improve the current mining algorithms but converted the uncertain databases to exact ones, with a sampling method used to find a reasonable parameter for accuracy guarantee. With such a method, many traditional efficient algorithms over exact databases can be employed directly for probabilistic frequent itemset mining. Our experiments showed that our method is efficient.

Acknowledgement

61309030), Beijing Higher Education Young Elite Teacher Project(YETP0987), Disci-

pline Construction Foundation of Central University of Finance and Economics, Key

project of National Social Science Foundation of China(13AXW010).

References

[1] J.Han, H.Cheng, D.Xin, and X.Yan, Frequent pattern mining: current status and future directions, Data

Mining and Knowledge Discovery,Vol.15(2007),55-86

[2] C.K.Chui, B.Kao, and E.Hung, Mining Frequent Itemsets from Uncertain Data, Proceedings of

PAKDD’2007

[3] C.C.Aggarwal, and P.S.Yu. A survey of uncertain data algorithms and applications. Transaction of

Knowledge and Data Mining, Vol.21(2009), 609-623

[4] Q.Zhang, F.Li, and K.Yi, Finding Frequent Items in Probabilistic Data, Proceedings of SIGMOD’2008


[5] T.Bernecker, H.P.Kriegel, M.Renz, F.Verhein, and A.Zueﬂe, Probabilistic Frequent Itemset Mining in

Uncertain Databases, Proceedings of SIGKDD’2009

[6] L.Sun, R.Cheng, D.W.Cheung, and J.Cheng, Mining Uncertain Data with Probabilistic Guarantees,

Proceedings of KDD’2010

[7] T.Calders, C.Garboni, and B.Goethals. Approximation of Frequentness Probability of Itemsets in Uncer-

tain Data, Proceedings of ICDM’2010

[8] Y.Tong, L.Chen, Y.Cheng, and P.S.Yu. Mining Frequent Itemsets over Uncertain Databases, Proceed-

ings of VLDB’2012

[9] P.Tang, and E.A.Peterson. Mining Probabilistic Frequent Closed itemsets in Uncertain Databases, Pro-

ceedings of ACMSE’2011

186 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-186

Performance Degradation Analysis Method Using Satellite Telemetry Big Data

Feng ZHOU a, De-Chang PI a,1, Xu KANG a and Hua-Dong TIAN b

a College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu, China
b China Academy of Space Technology, Beijing, China

modes, and complex telemetry big data, which make it difficult to evaluate their performance degradation. In this paper, a novel data mining analysis method is proposed to analyze a satellite's telemetry big data, in which the sample entropy is calculated to characterize states and the support vector data description is utilized to analyze the satellite performance degradation process. The experimental results show that the proposed method can generally describe the performance degradation process of satellites. Meanwhile, it also provides an important approach for the ground station monitor to analyze the performance of satellites.

vector data description

Introduction

With more and more satellites being sent into space these years, the ground in-orbit

managements have to handle such challenges as satellite’s high control precision, vari-

ous working modes, and high complexity. As advanced technologies and new materials

are utilized in satellites [1, 2], the sudden failure is not the primary failure mode for

most satellite failures, which is replaced by performance degradation. The theory of

analyzing satellite performance degradation only focuses on the overall performance of

equipment, regardless of failure modes, which is different from analyzing sudden fail-

ures.

In 2001, the University of Wisconsin and the University of Michigan, together with 40 other industry partners, united to establish the Intelligent Maintenance Systems (IMS) research center under the U.S. National Science Foundation. Since then, many methods of performance degradation assessment have been proposed, such as the pattern discrimination model (PDM) based on a cerebellar model articulation controller (CMAC) neural network [3], self-organizing map (SOM) and back-propagation neural network methods [4], and the hidden Markov model (HMM) and hidden semi-Markov model (HSMM) [5]. However, these methods are deficient in some respects. For example, the results of the CMAC assessment method are greatly influenced by the parameter setting

1 Corresponding Author: De-Chang PI, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, 29 Yudao Street, Nanjing, Jiangsu, 210016, China. E-mail: dc.pi@nuaa.edu.cn.

F. Zhou et al. / Performance Degradation Analysis Method Using Satellite Telemetry Big Data 187

and the assessment results of the SOM, neural network and HMM methods cannot directly reflect the degradation degree. In order to accommodate the characteristics of assessment for different key components, the analysis theory of performance degradation has developed from a single degradation variable toward more diverse, practical directions. Although some new theories and methods have emerged, research on the performance degradation of satellites is still limited. M. Tafazoli [6] studied in-orbit failures of more than 130 different spacecraft and revealed that spacecraft are vulnerable to failures occurring in key components. MAW [7] analyzed the space radiation environment of thermal coatings and proposed degradation models for the optical properties of thermal coatings. However, these methods mainly focus on failure data and also require relevant experience.

The conventional analysis methods for satellite performance degradation have shortcomings such as experimental difficulty and high cost. Satellite telemetry big data contain monitoring information, abnormal states, the space environment, and other information reflecting the operational status and payload of satellites. A novel analysis method for satellite performance degradation based on telemetry big data is proposed in this paper. The method uses data mining techniques and provides a quantitative description of the satellite performance degradation process.

Most recently presented performance degradation methods are based on physical rules or models [8, 9]; these methods need to understand the internal structure of the satellite, which is difficult for the analyst. In contrast, our proposed method uses the data sampled during satellite operation to analyze the performance degradation process without needing to determine the relationships among equipment accurately. What is more, our proposed method studies the characteristics of historical data, summarizes the regularities of change, and analyzes the performance degradation process automatically. To the best of our knowledge, a similar approach to the performance degradation of satellites has not appeared yet. Furthermore, it can also be extended to failure prediction.

1. Related Concepts

Sample entropy (SamEn) is an improvement of the approximate entropy (ApEn) proposed by Pincus [11]. The improved algorithm is able to quantify the complexity of a nonlinear time series.

For a data series X_N = {x(1), x(2), ..., x(N)}, where N is the length of the series, two parameters are defined: m, the embedded dimension of the vectors to be formed, and r, the threshold that serves as a noise filter. The steps to calculate SamEn are as follows:

1) N − m + 1 patterns (vectors) are generated, each with m dimensions. The pattern is represented as

X_m(i) = [x(i), x(i+1), ..., x(i+m−1)],  1 ≤ i ≤ N − m + 1   (1)


2) The distance between two patterns X_m(i) and X_m(j) is defined as the maximum absolute difference of their corresponding elements, using Eq. (2):

d[X_m(i), X_m(j)] = max_{k=0,...,m−1} |x(i+k) − x(j+k)|,  j = 1, ..., N − m + 1, j ≠ i   (2)

3) A pattern X_m(j) matches the pattern X_m(i) when their distance does not exceed r. Let N_m(i) denote the number of patterns matching X_m(i), and let C_i^m(r) = N_m(i) / (N − m). The probability that two sequences match for m points is achieved by using Eq. (3):

Φ^m(r) = (1 / (N − m + 1)) Σ_{i=1}^{N−m+1} C_i^m(r)   (3)

4) When the dimension expands to m + 1, steps 1-3 are repeated to find Φ^{m+1}(r). The theoretical value of SamEn is defined as follows:

SamEn(m, r) = lim_{N→∞} { −ln[ Φ^{m+1}(r) / Φ^m(r) ] }   (4)

Good statistical validity can be achieved with m = 2 and r = 0.1 to 0.25 × std(X), where std(X) denotes the standard deviation of X = {x(1), x(2), ..., x(N)}.
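The four steps above can be sketched directly (one common SamEn variant; the defaults m = 2 and r = 0.2·std(X) follow the recommendation above):

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """Sample entropy of a 1-D series, following steps 1-4 / Eqs. (2)-(4)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()                  # noise-filter threshold
    def match_rate(m):
        # step 1: build the N-m+1 patterns of length m
        pats = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
        n = len(pats)
        matches = 0
        for i in range(n):
            # step 2: Chebyshev distance of pattern i to all patterns
            d = np.max(np.abs(pats - pats[i]), axis=1)
            # step 3: count matches within r, excluding the self-match
            matches += int(np.sum(d <= r)) - 1
        return matches / (n * (n - 1))
    # step 4: compare match rates at dimensions m and m+1
    return -np.log(match_rate(m + 1) / match_rate(m))
```

A perfectly periodic series yields a value near zero, while an irregular series yields a larger value, matching SamEn's role as a complexity measure.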

Support Vector Data Description (SVDD) [12] is inspired by the support vector classifier. The method is robust against outliers in the training set and is capable of tightening the description by using negative examples.

Let X = {x_1, x_2, ..., x_n} be the samples of the target class. A hypersphere that contains all or most of these samples is sought; the hypersphere is characterized by its center a and radius R. If the hypersphere covers all the training samples of the target class, the empirical error is zero, and the structural error is defined as ε(a, R) = R².

Since the distance from each x_i to the center a should not be larger than the radius R for all samples of the target class X, the constraint of the minimization problem can be described as ||x_i − a||² ≤ R².

To account for the possibility of outliers in the training set, the distance between x_i and the center a need not be strictly smaller than R, but larger distances should be penalized. Therefore, slack variables ξ_i are introduced, and the minimization problem is transformed into


min ε(R, a, ξ) = R² + C Σ_{i=1}^{N} ξ_i   s.t.  ||x_i − a||² ≤ R² + ξ_i,  ξ_i ≥ 0,  i = 1, 2, ..., N   (5)

The penalty factor C makes a trade-off between the volume and the errors. The minimization problem in Eq. (5) can be solved through the Lagrangian of Eq. (6):

L(R, a, α_i, ξ_i) = R² + C Σ_i ξ_i − Σ_i α_i { R² + ξ_i − (x_i·x_i − 2 a·x_i + a·a) } − Σ_i γ_i ξ_i,  α_i ≥ 0, γ_i ≥ 0   (6)

In Eq. (6), α_i and γ_i are the Lagrange multipliers. L should be minimized with respect to R, a and ξ_i, and maximized with respect to α_i and γ_i. Setting the corresponding partial derivatives to zero yields the constraints of Eq. (7):

Σ_i α_i = 1,   a = Σ_i α_i x_i,   C − α_i − γ_i = 0   (7)

Substituting Eq. (7) into Eq. (6) gives the dual problem:

max L = Σ_i α_i (x_i · x_i) − Σ_{i,j} α_i α_j (x_i · x_j)   (8)

Thus, with the Gaussian kernel, the optimization problem can be further transformed into Eq. (9):

max L = 1 − Σ_{i,j} α_i α_j K_G(x_i, x_j; σ)   s.t.  0 ≤ α_i ≤ C,
where K_G(x, y; σ) = exp( −||x − y||² / σ² )   (9)

Eq. (7) shows that the center of the hypersphere is a linear combination of the objects. Only the objects x_i with α_i > 0 are needed in the description; these objects are therefore called the support vectors of the description (SVs). To test an object z, the distance to the center of the hypersphere and the radius R are respectively calculated by Eq. (10):

d² = ||z − a||² = K_G(z, z) − 2 Σ_i α_i K_G(z, x_i) + Σ_{i,j} α_i α_j K_G(x_i, x_j)

R² = ||x_sv − a||² = 1 − 2 Σ_i α_i K_G(x_i, x_sv) + Σ_{i,j} α_i α_j K_G(x_i, x_j)   (10)

The test object z is accepted when this distance is not greater than the radius (i.e., d ≤ R).
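Given trained multipliers α_i, Eq. (10) reduces to kernel evaluations only. A minimal sketch (the kernel width sigma and the toy data are illustrative, not from the paper):

```python
import numpy as np

def k_gauss(x, y, sigma):
    """Gaussian kernel K_G(x, y; sigma) of Eq. (9)."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / sigma ** 2)

def svdd_distance(z, svs, alphas, sigma):
    """Distance from a test object z to the hypersphere center a, per Eq. (10)."""
    cross = sum(a * k_gauss(z, x, sigma) for a, x in zip(alphas, svs))
    pair = sum(ai * aj * k_gauss(xi, xj, sigma)
               for ai, xi in zip(alphas, svs) for aj, xj in zip(alphas, svs))
    return np.sqrt(k_gauss(z, z, sigma) - 2 * cross + pair)

def svdd_radius(svs, alphas, sigma):
    """Radius R: the distance from a support vector to the center, per Eq. (10)."""
    return svdd_distance(svs[0], svs, alphas, sigma)
```

A test object z is then accepted as belonging to the target class when `svdd_distance(z, svs, alphas, sigma) <= svdd_radius(svs, alphas, sigma)`.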


Definition 1 (Performance Eigenvector)

The SamEn of a time period is taken as its performance feature, and the vector composed of the performance features of the parameters within the same time period is called the performance eigenvector.

In this study, the parameters are not limited to those of the objective equipment; they also include a number of closely related equipment parameters. As the parameter selection requires specialized knowledge, it is conducted based on domain and expert knowledge.

Definition 2 (Health Model)

With the SVDD method, the model obtained by training on the performance eigenvectors of the satellite in the healthy status is called the health model (model).

According to the theory of SVDD, the model described in Definition 2 is composed of the support vectors of the healthy state vectors (model.SV), the corresponding coefficients, the number of support vectors (model.len), the center of the hypersphere (model.a) and the radius (model.R).

Definition 3 (Performance Degradation Degree)

Here, dec denotes the distance between the performance eigenvector of the satellite and the center of the hypersphere. The performance degradation degree deg, which reflects the "health condition" [13], is defined as the difference between dec and the radius of the hypersphere model.R, that is, deg = dec − model.R (see Figure 1).

A performance degradation process of the objective equipment may be occurring when the value of deg is larger than 0. When the value increases monotonically, the speed of the performance degradation process of the objective equipment increases accordingly. As the degree cannot be negative, we set deg = 0 when dec − model.R < 0.

Figure 1. Principle of the performance degradation degree

Figure 1 shows the principle of the performance degradation degree. However, the model cannot contain all the health status features of the satellite, because the operating modes of the satellite are complex and the training sets in the healthy status of each operating mode are limited. A satellite may remain in the healthy status under other operating modes, even when deg is positive.

Figure 2 shows the overall framework of the analysis of satellite performance degradation presented in this study, which has four main steps.

Step 1. Select parameters of the satellite according to expert knowledge. Then, the median filter method is used to reduce the noise in the satellite telemetry big data so as to generate a new, clean dataset.

Step 2. Extract the performance features from the parameters selected in Step 1 according to Definition 1, and compose the final set of performance eigenvectors.

Step 3. Select the performance eigenvectors in the healthy status as the training set, and build a health model with the SVDD method.

Step 4. To measure the degradation status of a new performance eigenvector, calculate the performance degradation degree according to Definition 3 and the model obtained in Step 3.

Figure 2. The overall analysis framework: satellite telemetry data → parameter selection (guided by expert knowledge) → telemetry data processing (median filter) → sample entropy extraction, yielding the eigenvectors for analysis and the eigenvectors in healthy states → support vector data description → health model → performance degradation degree.

The telemetry big data of one satellite is used as the experimental data; it was recorded from 2011-05-01 00:00:00.0 to 2011-12-29 18:16:59.987 and comprises 14 million data frames that contain several failures and performance degradation information. In our experiments, seven important parameters in this dataset are selected by expert knowledge.

The telemetry big data is stored in Oracle 11g, and the algorithms are coded in Java. The operating system used is Windows Server 2008 R2 Standard, on an Intel(R) Xeon(R) eight-core E5606 processor with 8 GB RAM.

(1) The outliers caused by decoding or other errors are removed according to the

ranges of the seven parameters. And further, the median filter method in every 30s is

used to reduce the noise in the dataset. Finally, a new dataset is achieved.

(2) The values of the time series are normalized into the range [-1, 1] for each parameter, and each time series is equally divided into 800 groups. The performance features of

192 F. Zhou et al. / Performance Degradation Analysis Method Using Satellite Telemetry Big Data

each group are extracted by Definition 1. Finally, seven performance feature sequences of length 800 are obtained. The performance eigenvector is composed of the features of the seven parameters in the groups with the same index.
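Definition 1 itself is not reproduced in this excerpt; the method's flowchart and conclusions indicate that the features are based on sample entropy. A minimal sketch of the [-1, 1] normalization and the standard SampEn(m, r) statistic follows (m = 2 and r = 0.2 are common defaults, not necessarily the paper's values):

```python
import math

def normalize(series):
    """Scale a series into [-1, 1], as in step (2)."""
    lo, hi = min(series), max(series)
    return [2 * (x - lo) / (hi - lo) - 1 for x in series]

def sample_entropy(series, m=2, r=0.2):
    """Standard SampEn(m, r): negative log of the conditional probability
    that two subsequences matching for m points (Chebyshev distance <= r)
    also match for m + 1 points. Self-matches are excluded."""
    n = len(series)

    def count_matches(length):
        templates = [series[i:i + length] for i in range(n - m)]
        hits = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    hits += 1
        return hits

    b, a = count_matches(m), count_matches(m + 1)
    return -math.log(a / b) if a > 0 and b > 0 else float("inf")

# A strictly periodic signal is perfectly regular (entropy 0);
# a scrambled one is less predictable, so its entropy is higher.
regular = [0.0, 1.0] * 50
irregular = [((i * 7919) % 101) / 101 for i in range(100)]
```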

(1) Performance eigenvectors under healthy status are selected as the training data; the SVDD method is used with V = 1 in this experiment, and the health model of the satellite is established.

(2) The remaining dataset is used as test data to verify the obtained health model,

and the degradation degree is calculated according to Definition 3. Figure 3 shows the

final results.

The degradation degrees are unsteady: the curve is not smooth but fluctuates. This is mainly due to the recognition accuracy of SVDD and cyclical factors in the original data, which do not affect the overall view of the satellite's degradation process. To reduce the interference of these factors, a wavelet denoising algorithm is applied, and the denoised sequence is shown in Figure 3. Overall, the average degradation degree presents an increasing trend. Given the long period, accidental factors cannot influence the degradation degree throughout. Therefore, we conclude from Definition 3 that the satellite has entered the performance degradation state.
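The paper does not state which wavelet basis or thresholding rule it uses. Purely as an illustration of the idea, a one-level Haar transform with soft thresholding (arbitrary threshold value) smooths such a fluctuating sequence:

```python
import math

def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising: transform, soft-threshold the
    detail coefficients, invert. Assumes an even-length signal."""
    s2 = math.sqrt(2.0)
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / s2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / s2 for i in range(half)]
    # Soft thresholding shrinks small (noise-dominated) detail coefficients.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s2, (a - d) / s2])
    return out

# Small high-frequency wiggles are flattened; the level is preserved:
denoised = haar_denoise([1.0, 1.02, 0.98, 1.0], threshold=0.05)
```

With `threshold=0.0` the transform is inverted exactly, which is a quick sanity check of the implementation.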

[Figure 3. Degradation degree of each group and the wavelet denoised sequence (x-axis: group number, 1-800; y-axis: degradation degree, 0 to 0.35).]

Aerospace experts confirm that two major satellite failures did occur, for unknown reasons, from late July to late August (between the 246th and 370th groups), and these two failures correspond to the two nearby peaks. This supports the correctness of the proposed definitions, in particular explaining the degradation peak and the high degradation level after it. In conclusion, the proposed method can effectively describe the performance degradation process of a satellite.

Moreover, no comparable data-driven method for satellite performance degradation has been reported; the proposed method can also be used for failure prediction in a variety of engineering applications, such as aircraft engines.


4. Conclusions

A method for analyzing satellite performance degradation from telemetry big data is proposed in this paper, a problem for which existing studies are limited. The experimental analysis shows that the proposed method can extract effective state information from the parameters and provide a quantitative description of satellite performance degradation. Moreover, analyzing satellite performance degradation with telemetry big data is significant for in-orbit research and satellite management.

Our definitions may have some limitations; for example, the degradation degree in the experiment fluctuates rather than remaining stable. The sample entropy algorithm may also take considerable time on massive data with redundant parameters, which we will improve in future work.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant No. U1433116).

References

[1] Z.Z. Zhong, D.C. Pi. Forecasting Satellite Attitude Volatility Using Support Vector Regression with Particle Swarm Optimization. IAENG International Journal of Computer Science, 41(2014), 153-162.
[2] F. Zhou, D.C. Pi. Prediction Algorithm for Seasonal Satellite Parameters Based on Time Series Decomposition. Computer Science, 43(2016), 9-12 (in Chinese).
[3] J. Lee. Measurement of machine performance degradation using a neural network model. Computers in Industry, 30(1996), 193-209.
[4] R. Huang, L. Xi, et al. Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods. Mechanical Systems and Signal Processing, 21(2007), 193-207.
[5] X.S. Si, W. Wang, C.H. Hu, et al. Remaining useful life estimation – A review on the statistical data driven approaches. European Journal of Operational Research, 213(2011), 1-14.
[6] M. Tafazoli. A study of on-orbit spacecraft failures. Acta Astronautica, 64(2009), 195-205.
[7] W. Ma, Y. Xuan, Y. Han, et al. Degradation Performance of Long-life Satellite Thermal Coating and Its Influence on Thermal Character. Journal of Astronautics, 2(2010), 43-45.
[8] G. Jin, D.E. Matthews, Z. Zhou. A Bayesian framework for on-line degradation assessment and residual life prediction of secondary batteries in spacecraft. Reliability Engineering & System Safety, 113(2013), 7-20.
[9] X. Hu, J. Jiang, D. Cao, et al. Battery Health Prognosis for Electric Vehicles Using Sample Entropy and Sparse Bayesian Predictive Modeling. IEEE Transactions on Industrial Electronics, 63(2015), 2645-2656.
[10] S.M. Pincus. Assessing serial irregularity and its implications for health. Annals of the New York Academy of Sciences, 954(2001), 245-267.
[11] D. Weinshall, A. Zweig, et al. Beyond novelty detection: Incongruent events, when general and specific classifiers disagree. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(2012), 1886-1901.
[12] G. Yan, F. Sun, H. Li, et al. CoreRank: Redeeming "Sick Silicon" by Dynamically Quantifying Core-Level Healthy Condition. IEEE Transactions on Computers, 65(2016), 716-729.

194 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-194

Investment Strategy of Stock Based on

Sector Rotating

Li-Min HE a, Shao-Dong CHEN a, Zhen-Hua ZHANG b,1, Yong HU c and Hong-Yi JIANG a
a School of Finance, Guangdong University of Foreign Studies, China
b School of Economics and Trade, Guangdong University of Foreign Studies, Guangdong, China
c Institute of Big Data and Decision Making, Jinan University, Guangdong, China

Abstract. This study first proposes the Meta-Investment Strategy, derived from the concepts of Meta-Search on the Web and Meta-Cognition in psychology. We liken the enormous volume of web information to all A shares in China, the process of searching for information to stock selection, and search engines to equity funds. Based on sector rotation theory and a decision tree model, through the construction of an indicator system and a statistical model, stock selection rules can be extracted from fund information. After classifying the period from 2016.02 to 2016.04 as recovery, we selected the finance industry. By importing 12 stock indicators of all the component stocks in the finance industry as input variables, and whether a stock is heavily held by equity funds as the target variable, a decision tree model is constructed. Finally, by entering data from the last quarter of 2015, predictive classification results are obtained. Results show that the Meta-Investment Strategy outperformed the CSI300 and the CSI300 Finance Sector index (000914) and obtained significant excess return from 2016.02.01 to 2016.04.30.

Keywords. data mining, stock selection model

Introduction

In each surge of the Chinese stock market, there are always hot industries that periodically lead the upward trend. If investors can seize these fleeting investment opportunities in hot industries, their portfolios can earn excess returns. Sector rotation has thus become one of the most important tools in stock market investment research.

Sector rotation refers to the phenomenon that, in each phase of the business cycle and the stock market cycle, different industries take turns outperforming the market. Research on sector rotation abroad is more mature than domestic research. It originated from the famous "Investment Clock" [1], which classified the business cycle into four phases and summarized the performance of different industries in each. Sassetti and Tani [2] outperformed market returns by using three market-timing techniques on 41

1 Corresponding Author: Zhen-Hua Zhang, School of Economics and Trade, Guangdong University of Foreign Studies, Guangzhou 510006, China; E-mail: zhangzhenhua@gdufs.edu.cn.

L.-M. He et al. / A Decision Tree Model for Meta-Investment Strategy of Stock 195

funds of the Fidelity Select Sector family over the period 1998 to 2003. Domestic research on sector rotation focuses on the phenomenon itself and its underlying causes, including the business cycle, monetary cycle, industrial cycle and behavioral finance.

However, only a few studies probe into sector rotation as an investment

strategy. Peng and Zhang [3] empirically analyzed the sector rotation effect in Chinese

stock market and proved the feasibility of a sector investment strategy. By adopting an association rules algorithm, many strong association rules of the stock market were mined from a massive amount of data [4]; in that research, manufacturing and petrochemical industry stock indexes (the cores of the association rules) are closely related to the other sector indexes (except finance, real estate, food & beverage and media).

In addition, because the Chinese capital market is immature, irrational investments contribute to stock market instability. This leads to a divergence between market performance and economic fundamentals. In this case, stock funds, as representatives of professional investors, can often forecast the direction of the financial market. That is why we put forward the concept of "meta-investment".

We first present the concept of Meta-Investment based on Meta-Search and Meta-Cognition. Then, by fusing Meta-Investment with the sector rotation strategy, we apply this concept to stock investment according to the investment results of funds and institutions. To obtain comprehensible rules, we adopt a decision tree model to construct the final investment strategy. Simulation results show the advantages of the proposed method.

Yang [5] proposed that the essence of sector rotation is an economic phenomenon: the factors that influence the business cycle also induce sector rotation in the capital market. These factors include investment, monetary shocks, external shocks and consumption of durable goods. In his study, by introducing the phases of the business cycle as dummy variables into the classical CAPM model, the sector rotation strategy gains 0.2% excess Jensen's alpha. Dai and Lin [6] put forward the inner logic linking sector rotation and the business cycle: the business cycle is determined by external shocks, while the industrial structure decides the internal form of the business cycle. The process is shown in Figure 1.

[Figure 1. Business cycle → financial situation of different industries → sector rotation.]


Monetary policy is an important contributory factor for the stock market. In the long run, the performance of the stock market rests on the real economy; in the short run, however, the change in liquidity resulting from shifts in monetary policy can influence the stock market. The interpretation of sector rotation based on monetary policy is that different industries have different sensitivities to liquidity. Conover, Jensen, Johnson and Mercer [7] used the Fed discount rate as an indicator of monetary policy to build a sector rotation strategy based on the monetary environment. After classifying the monetary phases, the liquidity sensitivity of different industries was tested. Subsequently, cyclical industries sensitive to liquidity were invested in during monetary easing, while noncyclical industries were invested in during monetary tightening. This strategy gained excess return.

A further interpretation rests on the profit transmission relationship among different industries. An investment logic is thereby formed to gain excess return: selling industries that outperformed the market earlier and buying industries that have begun to outperform it more recently. Chen used the DAG method to conduct an empirical analysis of the relationships among industry price indexes, and proposed three explanations for sector rotation: associations among industries formed by the business cycle, upstream-downstream relationships, and investment characteristics [8].

1.4. Conclusion

From the perspective of the real economy, sector rotation originates from changes in the business cycle and monetary shocks, while the industrial structure determines its expression form: differences in income elasticity of demand, cost structure [9], sensitivity to liquidity and profit transmission relationships among industries decide the form that sector rotation takes. Moreover, sector rotation can be interpreted from a behavioral finance perspective, which views it as market speculation. The proportion of retail investors in the Chinese capital market is relatively high, so there are many noise traders (according to Shiller, they pursue fashion and fanaticism and are inclined to overreact to changes in stock prices). Meanwhile, informed traders among institutional investors band together to lure retail traders and gain excess profit by manipulating the stock market [10].

2. Meta-Investment Strategy

The Meta-Investment Strategy derives from Meta-Search and Meta-Cognition. A Meta-Search Engine is a search tool that uses other search engines' data to produce its own results from the Internet [11]. Meta-search engines take input from a user and simultaneously send out queries to third-party search engines for results.


Sufficient data are thereby gathered, formatted by rank and presented to the user.

Meta-Cognition is "cognition about cognition" or, more informally, "thinking about thinking", a term introduced by the American developmental psychologist Flavell [12], who defined it as knowledge about cognition and control of cognition. It comes from the root word "meta", meaning beyond. Meta-Cognition can take many forms, including knowledge about when and how to use particular strategies for learning or problem solving. It generally has two components: knowledge about cognition and regulation of cognition.

Meta-Memory, defined as knowing about memory and mnemonic strategies, is an

especially important form of Meta-Cognition. Differences in Meta-Cognitive

processing across cultures have not been widely studied, but could provide better

outcomes in cross-cultural learning between teachers and students.

Some evolutionary psychologists hypothesize that Meta-Cognition is used as a

survival tool, which would make Meta-Cognition the same across cultures. Writings on

Meta-Cognition can be traced back at least as far as On the Soul and the Parva

Naturalia of the Greek philosopher Aristotle.

As representatives of professional investors, stock funds can discover the intrinsic value of investment targets before the broader market does. Therefore, by applying the investment results of stock funds, investing by "standing on the shoulders of giants" is a brand-new idea. The Meta-Investment Strategy in this study is based on funds: it likens the enormous volume of web information to all A shares, stock selection to the search process, and equity funds to search engines. Through the construction of an indicator system and statistical modeling, the stock selection rules of stock funds can be extracted for portfolio construction.

Decision Tree C5.0 Algorithm

An increasing number of data mining and machine learning methods have been applied to finance. Many stock selection models exist, such as Neural Networks, Random Forests, Support Vector Machines (SVM), Genetic Algorithms (GA), Rough Set Theory and Concept Lattices.

The aim of this research is to probe into the Meta-Investment Strategy (based on sector rotation theory) by searching for a proper data mining or machine learning algorithm. To realize this goal, the comprehensibility of the investment strategy must first be considered; therefore, algorithms that can extract understandable rules are the main approach in this study.

However, Neural Networks and Random Forests are more suitable for large data samples, and Neural Networks cannot be used for rule extraction. SVM is applicable to relatively small samples, but rules are likewise difficult to extract from it. In summary, Neural Networks, SVM and Genetic Algorithms (GA) are suitable for prediction rather than rule extraction. Thus, Decision Tree, Rough Set and Concept Lattice methods better fit the purpose of this research.


Some researchers have applied decision trees [13, 14, 15] and random forests [16] to investment decisions. For example, Hu and Luo [14] applied a decision tree model to sector selection, Sorensen, Miller et al. [15] utilized a decision tree approach for stock selection, and Liu et al. [16] proposed a random forest model for bulk-holding stock forecasting.

However, no available research currently applies stock funds' investment results directly to investment practice. Although there is related research on bulk-holding stocks [16], it was not directly combined with investment practice. Most importantly, because the Meta-Investment Strategy is first launched in this study, no specialized algorithm for it exists at present. After comparison, the C5.0 decision tree, from which understandable rules are easy to extract, is preferred.

Secondly, the conditional attributes in the data set are continuous. Applying Rough Sets or Concept Lattices for rule extraction requires a discretization process, and hence a proper discretization model. The C5.0 decision tree, which needs no separate discretization step, is comparatively easier to implement.

Moreover, traditional methods for extracting comprehensible rules, which extract information directly from massive original data, are difficult to apply here for several reasons: (1) massive data, large numbers of indicators and scattered information make it difficult to extract rules; (2) implicit investment rules vary across periods because of changing financial conditions and policies, so prediction accuracy is limited and rules are likely to contradict each other; (3) operational speed is relatively slow when coping with massive data.

We aim to use the Meta-Investment Strategy and a rule extraction algorithm to solve the aforesaid problems; relevant research in this field is limited. Because the strategy is built on existing investment results, accuracy is improved, and the data size and conflicting information are relatively smaller, which makes the extracted rules more reasonable. Therefore, C5.0 is chosen for rule extraction.

The decision tree model thus serves both modeling and rule extraction.

For one thing, the decision tree model is a supervised learning method: each training example is a pair consisting of an input object and a target variable, and by analyzing the training data, a set of inference rules is produced for mapping new examples. For another, the goal of this study, namely extracting stock funds' stock selection rules, accords with the output of a decision tree, which is an inference rule set.

This study uses the C5.0 decision tree model and SPSS Modeler for rule extraction. The splitting criterion is the normalized information gain (difference in entropy): the attribute with the highest normalized information gain is chosen to make the decision. C5.0 introduces a boosting algorithm to enhance accuracy [13, 14, 15].
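The splitting criterion can be illustrated as follows; the momentum values and labels below are hypothetical toy data, and the gain-ratio form shown is the standard C4.5-style normalization rather than C5.0's exact internals:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Normalized information gain (gain ratio) for a binary split of a
    continuous attribute at `threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(labels) - remainder           # raw information gain
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical momentum values and heavily-held labels (toy data):
momentum = [-0.10, -0.08, 0.05, 0.12, 0.30, 0.40]
held = [0, 0, 1, 1, 1, 0]
score = gain_ratio(momentum, held, threshold=-0.0665)
```

The candidate threshold with the highest gain ratio becomes the split point, which is how cut values such as -0.0665 appear in the extracted rules.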

Based on sector rotation theory and the C5.0 decision tree model, the modeling process is shown in Figure 2.


[Figure 2. Modeling process: selection of industry → indicator system construction (target variable: whether heavily held by stock funds) → data back-testing.]

Common approaches to classifying the business cycle include the four-stage method, the six-stage method and so on. The Merrill Lynch Investment Clock divided the business cycle into four phases using OECD "output gap" estimates and CPI inflation data.

Zhang & Wang [17] proposed that because traditional “output gap” and CPI inflation

data were quarterly released, it’s difficult to identify the economic inflection point.

Therefore, monthly figures including Macroeconomic Prosperity Index and Consumer

Price Index (The same month last year=100), shown in Figure 3, can be used as the

main indicators for classification of business cycle.

[Figure 3. Macroeconomic Prosperity Index and Consumer Price Index (same month last year = 100). Source: CSMAR.]

For these reasons, this study adopts the four-stage method with the Macroeconomic Prosperity Index and the Consumer Price Index (same month last year = 100) (Table 1), by which the business cycle is divided into four stages: recovery, overheat, stagflation and recession. The classification results are shown in Table 2.

Table 1. Classification of the Four Stages in a Business Cycle

Indicator \ Phase: Recession | Recovery | Overheat | Stagflation
Consumer Price Index (same month last year = 100): falling | falling | rising | rising

Table 2. Classification Results of the Business Cycle

2009.03-2009.07  Recovery     67.93%
2009.08-2010.02  Overheat    -13.34%
2010.03-2010.08  Stagflation -12.67%
2010.09-2010.12  Overheat      8.47%
2011.01-2011.07  Stagflation  -6.82%
2011.08-2012.07  Recession   -21.65%
2012.08-2012.10  Recovery     -4.40%
2012.11-2013.07  Overheat      1.52%
2013.08-2013.10  Stagflation   0.64%
2013.10-2015.01  Recession    44.00%
2015.02-2016.01  Stagflation -12.16%
2016.02-2016.04  Recovery      8.81%

Finance is one of the most recommended industries to invest in during the recovery stage, being strongly focused on by three of the four major securities firms (Guotai Junan Securities, Shenyin & Wanguo Securities, and Orient Securities) (Table 3).

Table 3. Industry Selection of Different Securities

Guotai Junan Securities
  Recovery: Energy, Finance, Consumer Discretionary
  Overheat: Energy, Materials, Finance
  Stagflation: Telecom, Consumer Goods, Health Care
  Recession: Health Care, Utilities, Consumer Goods
Shenyin & Wanguo Securities
  Recovery: Nonferrous Metals, Real Estate, Finance, Information Technology
  Overheat: Nonferrous Metals, Mining, Real Estate, Ferrous Metals
  Stagflation: Agriculture & Fishing, Health Care, Network Equipment, Electrical Components
  Recession: Utilities, Health Care, Finance, Transportation
Orient Securities
  Recovery: Food & Beverage, Nonferrous Metals, Real Estate, Restaurant & Food Services, Tourism, Finance
  Overheat: Mining, Nonferrous Metals, Transportation, Ferrous Metals
  Stagflation: Health Care, Food & Beverage, Machinery, Utilities, Construction Materials
  Recession: Finance, Ferrous Metals, Chemicals, Real Estate, Food & Beverage
Guoxin Securities
  Recovery: Real Estate, Transportation, Mining, Restaurant & Food Services, Nonferrous Metals
  Overheat: Agriculture, Home Appliances, Mining, Nonferrous Metals, Machinery, Trading and Retailing
  Stagflation: Utilities, Transportation, Health Care, Food & Beverage
  Recession: Real Estate, Transportation, Home Appliances, Electrical Components, Nonferrous Metals


The finance industry is sensitive to economic fluctuation; therefore, it is chosen as the sample for investigating the Meta-Investment Strategy.

Since 2009, mainstream securities firms have studied the investment clock in China. Typical investigations include Guotai Junan Securities (2009), Shenyin & Wanguo Securities (2009), Orient Securities (2009) and Guoxin Securities (2010). Their methodologies, including the classification of the domestic business cycle and the statistical treatment of different industries, are similar despite different industry classification benchmarks and time ranges.

The Chinese capital market is immature, and the financial market changes greatly under different policies in different periods. For the period from 2016.02.01 to 2016.04.30, data are relatively comprehensive and timely, so the extracted rules are more likely to reflect the implicit rules of the Chinese capital market. In addition, there are 51 stocks in the finance industry at present; choosing data from before 2015 would greatly reduce the data size. For example, Guoxin Securities (002736) went public in December 2014, while Orient Securities (600958), Guotai Junan Securities (601211), Dongxing Securities (601198) and Shenwan Hongyuan Group (000166) went public in 2015. In conclusion, the chosen timeframe reflects three considerations: sector rotation theory, data size and timeliness.

It is important to note that this study judges the period 2016.02.01-2016.04.30 to be recovery. The training samples are accordingly confined to the first three quarters of 2015. Financial and technical indicators are imported as input variables; because of the time lags of financial indicators, whether a stock in the finance industry is heavily held in the next quarter is set as the target variable. Classification rules are produced through the C5.0 decision tree. The data of the last quarter of 2015 are then imported for classification and prediction, and a portfolio is constructed with each chosen stock weighted equally. Finally, the performance of this portfolio is back-tested from 2016.02.01 to 2016.04.30.

Table 3 summarizes the findings of the four studies mentioned above.

Since the research period (2016.02.01 to 2016.04.30) is judged to be recovery, the finance industry is chosen according to the conclusion above.

All the input variables and target variable are shown in Table 4.

We manually chose input variables from four dimensions (profitability, operating capacity, technical factors and indicators per share) according to financial statement theory and previous research [18-21].

In this study, the sample size for model construction is 149; samples are divided into a training set (70%) and a test set (30%). According to the Industry Classification of the China Securities Regulatory Commission (CSRC), the number of stocks in China's financial industry is about 50. Because the data used for model construction are confined to the first three quarters of 2015, 149 samples in total are available after excluding invalid ones.

This study does not adopt a traditional stock selection model. Instead, we combine the sector rotation strategy and the Meta-Investment Strategy. Therefore, after classifying the business cycle and choosing the finance industry and the research period (the first three quarters of 2015 for model construction), the size of the data to be processed and the noise in the data are greatly reduced.

Table 4. Description of Variables

Profitability (input variables, metric):
  ROA (TTM) [ROATTM]
  EBIT Margin (TTM) [EBITMarginTTM]
  Cash to Total Profit Ratio (TTM) [CPRTTM]
Indicators Per Share (input variables, metric):
  EPS (TTM) [EPSTTM]
  Net Cash Flow Per Share (TTM) [NCFPSTTM]
  Net Cash Flow from Operating Activities (TTM) [NCFOATTM]
  Net Cash Flow from Investing Activities (TTM) [NCFIATTM]
Operational Ability (input variables, metric):
  Price-Earnings Ratio (PE TTM) [PERTTM]
  Price-Sales Ratio (PS TTM) [PSRTTM]
  Price to Cash Flow (PCF TTM) [PCFTTM]
Technical Indicators (input variable, metric):
  Prior Three-Month Momentum [Momentum]
Target variable (nominal, "Yes" = 1, "No" = 0):
  Whether It Is Heavily Held by Equity Funds (>0.02% of net value of stock funds) [HH]

In the training set, there are 12 input variables and 1 target variable ("Whether It Is Heavily Held by Stock Funds"), so 13 indicators in total are imported during model construction. When the model is applied, the 12 input variables are imported and the target variable is forecast. A note on the target variable: it indicates whether the market value of a stock held by public stock funds is greater than 2% of the net assets of all public stock funds.
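The labeling rule for the target variable amounts to a one-line threshold test (a sketch; the variable names are illustrative, and the 2% cutoff follows the note above):

```python
def is_heavily_held(value_held, funds_net_asset, cutoff=0.02):
    """Target label HH: 1 if the market value of the stock held by public
    stock funds exceeds `cutoff` (2%) of the net assets of all public
    stock funds, else 0."""
    return 1 if value_held > cutoff * funds_net_asset else 0

# Hypothetical figures: 3.0 held against total fund net assets of 100.0
label = is_heavily_held(3.0, 100.0)
```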

The research period of this study is confined to the first three quarters of 2015. Because twelve of the input variables come from lagging financial statements, this study sets the rules shown in Table 5.

Table 5. Usage of Different Types of Reports

Seasonal Report of the 1st quarter Holdings of equity funds on 2015.06.30 Training Sample

Semi-annual Report Holdings of equity funds on 2015.09.30 Training Sample

Seasonal Report of the 3rd quarter Holdings of equity funds on 2015.12.31 Training Sample


By importing 12 stock indicators of all the component stocks in the finance industry in the first three quarters of 2015 as input variables, and whether each stock is heavily held by stock funds in the next quarter as the target variable, a set of inference rules is generated as follows. Detailed rules are shown in the appendix.

Rule 1 - estimated accuracy 89.22% [boost 96.1%]
NCFIATTM <= -4.903720 [Mode: 1] => 1.0
NCFIATTM > -4.903720 [Mode: 0]
    Momentum <= -0.0665 [Mode: 0] => 0.0
    Momentum > -0.0665 [Mode: 1]
        Momentum <= 0.3515 [Mode: 1]
            PCFTTM <= 3.331410 [Mode: 0]
                PCFTTM <= -178.095000 [Mode: 1] => 1.0
                PCFTTM > -178.095000 [Mode: 0] => 0.0
            PCFTTM > 3.331410 [Mode: 1] => 1.0
        Momentum > 0.3515 [Mode: 0] => 0.0 …
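The printed branch of Rule 1 can be transcribed directly as nested conditionals (a sketch; only the branch quoted above is encoded, and the branches elided by the ellipsis are omitted):

```python
def rule1_classify(ncfiattm, momentum, pcfttm):
    """Transcription of the printed Rule 1 branch:
    1 = predicted heavily held by equity funds, 0 = not."""
    if ncfiattm <= -4.903720:
        return 1
    if momentum <= -0.0665 or momentum > 0.3515:
        return 0
    # Here -0.0665 < momentum <= 0.3515:
    if pcfttm > 3.331410 or pcfttm <= -178.095000:
        return 1
    return 0
```

The first branch mirrors the example discussed next in the text: a strongly negative NCFIATTM alone classifies the stock as heavily held.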

Hence, we extract some rules and explain them.

For example, the first rule: NCFIATTM <= -4.903720 [ Mode: 1 ] => 1.0.

It means that stocks whose Net Cash Flow from Investing Activities (trailing twelve months) is less than -4.903720 are chosen (1.0). In commercial bank management, banks have a fixed demand for asset allocation; in China, the main investment activity of commercial banks is purchasing treasury bonds. Because of the expansion of a bank's assets, the more negative the Net Cash Flow from Investing Activities (NCFIA), the faster the expansion. For example, suppose a bank has assets of RMB 100 yuan in 2015, of which 30% is allocated to one-year treasury bonds, and assets of RMB 120 yuan in 2016, again with 30% in one-year treasury bonds, at an annual rate of return of 3%. In its financial statements, the net cash flow from investing activities is then about -6 (-120 × 0.3 + 100 × 0.3, before the 3% interest received on the maturing bonds). A minus sign means capital outflow, while a positive sign means capital inflow.

The second rule: Momentum > -0.0665 [Mode: 1]. It means that stocks whose three-month momentum is greater than -0.0665 tend to be chosen (1.0). In short-term investment there is a "momentum effect": the rate of return of a stock tends to follow its original trend. From the above explanation of the two most important rules, we can see that the extracted rules are reasonable. The remaining rules, shown in the appendix, can be explained similarly.

4. Results Analysis

The CSI300 index reflects the performance of 300 stocks traded on the Shanghai and Shenzhen stock exchanges; therefore, it can be used as the performance benchmark.


By importing the 12 financial indicators for the last quarter of 2015 (up to 2015.12.31) and the prior three-month momentum before 2016.02.01 of all the component stocks in the finance industry, classification results are produced. From the stocks classified as "1", those with a confidence level higher than 90% are chosen. A portfolio is then constructed with each chosen stock weighted equally, and its performance is back-tested from 2016.02.01 to 2016.04.30. The classification results and the performance of the portfolio are shown in Table 6.

Table 6. Classification Results

Code    Name                                             Classified value (heavily held)  Confidence
000001  Ping An Bank                                     1  1
000776  Guangfa Securities                               1  1
002142  Ningbo Bank                                      1  1
600000  Shanghai Pudong Development Bank (SPDB)          1  1
600016  Minsheng Bank (CMBC)                             1  1
600036  China Merchants Bank (CMB)                       1  1
601166  Industrial Bank (CIB)                            1  1
601198  Dongxing Securities                              1  1
601318  Ping An Insurance (Group) Company of China       1  1
601328  Bank of Communications                           1  1
601818  China Everbright Bank Company                    1  1

The results below (Figure 4 and Table 7) show that the Meta-Investment Strategy outperformed both the CSI300 and the CSI300 Finance Sector index (000914), yielding significant excess return with a winning rate of 68.97% from 2016.02.01 to 2016.04.30.

Figure 4. Performance


Table 7. Back-test Performance

Period                                                              2016.02.01-2016.04.29
Cumulative Return of CSI300 (399300)                                9.12%
Cumulative Return of CSI300 Finance Sector (000914)                 9.45%
Cumulative Return of Portfolio Based on Meta-Investment Strategy    12.53%
Winning Rate (Ratio of Days outperforming CSI300 to Total Days)     68.97%
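The two metrics in the table above can be computed mechanically from daily return series. The sketch below uses made-up daily returns purely for illustration; the series names and numbers are not the paper's data.

```python
# Illustrative sketch (hypothetical data, not the study's): given daily returns
# of the equally weighted portfolio and of the CSI300 benchmark, compute the
# two metrics reported in Table 7 -- cumulative return and winning rate.

def cumulative_return(daily_returns):
    """Compound daily simple returns into a cumulative return."""
    total = 1.0
    for r in daily_returns:
        total *= 1.0 + r
    return total - 1.0

def winning_rate(portfolio_returns, benchmark_returns):
    """Fraction of days on which the portfolio outperformed the benchmark."""
    wins = sum(1 for p, b in zip(portfolio_returns, benchmark_returns) if p > b)
    return wins / len(portfolio_returns)

# Hypothetical daily return series for demonstration only.
portfolio = [0.010, -0.002, 0.005, 0.003]
csi300 = [0.008, 0.001, 0.004, -0.001]

print(cumulative_return(portfolio))   # compounded portfolio return
print(winning_rate(portfolio, csi300))  # share of outperforming days
```

Over the actual back-test window, the same computation on the real daily series would reproduce the 12.53% cumulative return and 68.97% winning rate reported above.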

In this study, the 149 samples are divided into a training set (70%) and a test set (30%). The number of boosting trials is five; boosting is used to amplify the effective sample size and enhance accuracy.

After selecting the training and test sets and running five iterations, the overall accuracy reaches 96.1%. It should be noted that the samples used in each iteration differ to some extent: the first model is built on equal-probability sampling of the training set, the second model is mainly based on the samples the first model classified incorrectly, the third model focuses on the samples the second model classified incorrectly, and so forth. Therefore, the estimated accuracy differs among rules.
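The reweighting idea described above can be sketched as follows. This is a minimal AdaBoost-style illustration with a trivial one-feature threshold "stump" standing in for the C5.0 trees actually used in the study; the toy data and all names are hypothetical.

```python
# Minimal sketch of boosting: after each round, misclassified samples gain
# weight so the next model concentrates on them. The "model" is a threshold
# stump on one feature -- a stand-in, not the study's C5.0 decision trees.
import math

def train_stump(xs, ys, weights):
    """Pick the (threshold, sign) minimising weighted error, labels in {-1, +1}."""
    best = None
    for t in sorted(set(xs)):
        for sign in (+1, -1):
            preds = [sign if x >= t else -sign for x in xs]
            err = sum(w for p, y, w in zip(preds, ys, weights) if p != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best  # (weighted error, threshold, sign)

def boost(xs, ys, rounds=5):
    n = len(xs)
    weights = [1.0 / n] * n  # round 1: equal-probability sampling
    models = []
    for _ in range(rounds):
        err, t, sign = train_stump(xs, ys, weights)
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        models.append((alpha, t, sign))
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        for i, (x, y) in enumerate(zip(xs, ys)):
            pred = sign if x >= t else -sign
            weights[i] *= math.exp(-alpha * y * pred)
        total = sum(weights)
        weights = [w / total for w in weights]
    return models

def predict(models, x):
    score = sum(alpha * (sign if x >= t else -sign) for alpha, t, sign in models)
    return 1 if score >= 0 else -1

# Toy data: one feature value per sample, class labels in {-1, +1}.
xs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
ys = [-1, -1, -1, 1, 1, 1]
models = boost(xs, ys, rounds=5)
print([predict(models, x) for x in xs])
```

Because the sample weights differ from round to round, each round's model is fit to an effectively different sample, which is why the estimated accuracy differs among the extracted rules.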

It is also necessary to explain that the purpose of setting "Whether It Is Heavily Held by Equity Funds" as the target variable is not to forecast the bulk-holding stocks of stock funds. Instead, our purpose is to apply the investment results of stock funds, extract their principles and rules, and invest by "standing on the shoulders of giants". Therefore, this study compares the cumulative return of the portfolio with the cumulative returns of the CSI300 Finance Sector index and the CSI300 to test the effects of the extracted rules and the stock selection model.

5. Conclusions

In order to extract comprehensive rules and select proper investment strategies, this study is based on the investment results of stock funds.

There is mounting evidence in the literature that the sector rotation phenomenon tracks the economic cycle. Armed with this evidence, we investigate the nature of the sector rotation strategy from three aspects (business cycle, monetary shock and lead-lag relationship). We conclude that sector rotation originates from changes in the business cycle and from monetary shocks, while industrial structure determines the form in which sector rotation is expressed.

Furthermore, this study is the first to propose the Meta-Investment Strategy, an extension of the concepts of meta-cognition and meta-search engines. The Meta-Investment Strategy is based on stock funds. To facilitate understanding, we compare the enormous body of web information to all A shares, the process of searching for information to stock selection, and search engines to equity funds. Through the construction of an indicator system and statistical modeling, the stock selection rules of funds can be extracted for portfolio construction.

Finally, we combine sector rotation theory with the decision tree model. After classifying the period from 2016.02 to 2016.04 as recovery, we selected indicators of the finance industry as input variables and whether a stock is heavily held by stock funds as the target variable, and constructed the decision tree model. Subsequently, by entering data for the last quarter of 2015, the predictive classification results are obtained. The results show that the Meta-Investment Strategy outperformed the CSI300 and the CSI300 Finance Sector index (000914) and obtained significant excess return from 2016.02.01 to 2016.04.30.

However, due to limitations of time, energy and data resources, the back-testing does not include the other three phases of the economy, namely overheat, stagflation and recession. Follow-up studies will consider loosening the restrictions on the research period and industries. Moreover, the decision tree model in this study is static. A dynamic decision tree model, by which training samples can be increased and the validity of the inference rules enhanced, will be constructed in follow-up studies.

This study did not choose a traditional stock selection model, which usually selects stocks from massive data and requires complex data processing operations because of noisy data. Instead, the stock selection model in this study can be seen as a secondary filter: its stock screening process is based on stock funds' investment results. This has various advantages; for example, it is easy to operate, the amount of computation is relatively small, and stock selection rules can be extracted directly.

This study applied public stock funds' investment results directly to investment practice for the first time. By choosing "Whether It Is Heavily Held by Stock Funds" as the target variable and building a stock selection model, a portfolio was constructed whose rate of return outperformed the average market rate of return. In this way, our results show that stock funds' investment results can be used for portfolio construction and portfolio optimization.

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (No.

71271061), the National Students Innovation Training Program of China (No.

201511846058), Student Science and Technology Innovation Cultivating Projects &

Climbing Plan Special Key Funds in Guangdong Province (No. pajh2016b0174),

Philosophy and Social Science Project (No. GD12XGL14) & the Natural Science

Foundations (No. 2014A030313575, 2016A030313688) & the Soft Science Project

(No. 2015A070704051) of Guangdong Province, Science and Technology Innovation

Project of Education Department of Guangdong Province (No. 2013KJCX0072),

Philosophy and Social Science Project of Guangzhou (No. 14G41), Special Innovative

Project (No. 15T21) & Major Education Foundation (No. GYJYZDA14002) & Higher

Education Research Project (No. 2016GDJYYJZD004) & Key Team (No. TD1605) of

Guangdong University of Foreign Studies.


208 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-208

Blurred Boundaries of Next Generation

Computing Era

Hyun-A PARK 1

Department of Medical Health Sciences, KyungDong University,

815 Gyeonhwon-ro, MunMak-eub, WonJu-City, Kangwon-do, Korea

Abstract. This paper deals with the security problems of next generation computing environments. As the method, a Virtualized Security Defense System (VSDS) is proposed, applied to the web application 'Trello' to build online patient networks. It deals with the following problems: (1) blurred security boundaries between attackers and protectors, (2) group key management, (3) secret collaborative work and sensitive information-sharing among group members, (4) privacy preservation, and (5) rendering of a 3D image (member indicator, for a high level of security). Consequently, although the current IT paradigm is becoming more complicated, overlapped and virtualized, VSDS makes it possible to share information securely through collaborative work.

Keywords. Blurred Security Boundaries, Virtualized Security Defense System, PatientsLikeMe, Trello, Group key, Reversed hash key chain, VR/AR, Member indicator, Pseudonym

Introduction

0.1. Computing Environments for the Next Generation and Problem Identification

Recent advances in information technology have produced new types of IT-enabled product and service innovations in our daily lives. The important features of these innovative technologies are highly advanced wireless techniques such as the mobile internet, SNS, cloud, and big data technologies in networked collaborative computing environments.

Currently, the IT paradigm, which has shifted from wired to wireless and on to integrated information environments, has blurred the information boundaries between attackers and protectors. One of the most important problems here is that, although information sharing has greatly increased through collaborative work, virtualized IT resources and overlapping trust boundaries have created a security dilemma about what information boundaries and what kinds of information to protect [1]. In traditional IT environments, as reflected in research on security information systems and mobile applications, security had a clear objective, and the parties were divided into two groups, attackers and protectors, with security specialists responsible for preventing attacks and threats from outsiders using their knowledge of the security architecture. At present, by contrast, the changing IT paradigm has made the information boundaries between attackers and protectors blurred.

1 Corresponding Author: Hyun-A Park, Department of Medical Health Sciences, KyungDong University,

H.-A. Park / VSDS for Blurred Boundaries of Next Generation Computing Era 209

The characteristics of the next generation computing era (IT paradigm) can be summarized as follows: (1) the increase of collaborative work through network connections, (2) the increase of information sharing in the information-oriented society, (3) the blurring of the security boundaries to protect, caused by virtualized IT resources and migration policies, and (4) the increase of 3D data, as in VR/AR.

Therefore, to solve the problem of blurred security boundaries, this paper proposes the Virtualized Security Defense System (VSDS), built on the web application 'Trello' to construct online patient networks very similar to 'PatientsLikeMe'.

The solution to the problem is a security defense system for next generation computing environments, with the web application 'Trello' as its platform. With Trello, an online patient network is constructed very similar to 'PatientsLikeMe', because 'PatientsLikeMe' is very difficult to use for patients in non-English-speaking regions and for patients suffering from diseases it does not cover. Hence, by using 'Trello', VSDS is extended to people (e.g., researchers) interested in the same diseases and to patients who use any language, including English, and who are struggling with any disease. The main methods are as follows.

1. The proposed system, VSDS (Virtualized Security Defense System), is a new concept in security solutions. Its goal is to address the problem of blurred security boundaries, which it does by constructing a 'virtualized' security solution for next generation computing environments. As its methods, it mainly uses cryptographic techniques and a member indicator based on 3D video image technology for virtualized IT resources. In particular, the member indicator is a new security solution reflecting the characteristics of next generation computing.

2. VSDS should be a secure and efficient group key management system, because information sharing has increased and will keep increasing even across blurred security boundaries. As the method, each member's group key is built on a reversed one-way hash chain. By the one-wayness property of the hash function [2], VSDS can guarantee Forward Secrecy, Backward Accessibility and Group Key Secrecy [3], which are the security requirements of the group information-sharing system VSDS.

2-1. The security requirements of Forward Secrecy and Backward Accessibility should be satisfied. In VSDS, a leaving member cannot learn the next group key (Forward Secrecy), but a joining member can obtain all the previous keys and information (Backward Accessibility), by the properties of the reversed hash key chain. Therefore, VSDS is suitable for secret collaborative work and sensitive information-sharing among group members.

2-2. VSDS does not need re-keying on membership changes. The principle of each member's group key generation is as follows: a fixed fundamental group key is assigned to each group, and random numbers are newly generated for each user and each session. After applying the group key and the random numbers to the hash function, respectively, reversedly and repeatedly, the hashed group key and random numbers are combined according to the developed equation algorithm. In total, five sub-group keys are made as

each member’s group session key.

Hence, every member has different group keys for each session. Nevertheless, every result of authentication (including encryption/decryption) is the same as the fundamental group key's result under the computation of the developed protocol (equation algorithm).

One of the most important points is that, because the fundamental group key and all other group keys yield the same result value, no re-keying process is needed whenever membership changes.

3. VR/AR technique: a new concept, the 3D Video Image Mobile Security Technology Solution, is proposed. As a member indicator, the 3-dimensional realistic models decided at registration time must be rendered in the log-in process for the user to be authenticated as legitimate [4].

4. VSDS preserves privacy. (1) Anonymity and pseudonymity: a pseudonym is used in every session; although perfect anonymity cannot be provided, pseudonymity can. (2) Unlinkability: in every session users log in with different pseudonyms (Pd) and use different encryption keys (each member's group key); consequently, VSDS achieves a level of security similar to one-time encryption. (3) Unobservability: all information is encrypted, and the pseudonym is changed every session by the reversed hash chain [5].

5. Access control by cryptographic techniques and the VR/AR technique.

6. VSDS is scalable to other group project systems on the web. The application scenario here concerns patient networks on the web, but VSDS is extendable to other secure group projects.

Among the research related to the main methods, cryptographic techniques (especially group key management systems) and a member indicator as a VR/AR technique, only the research area of group key management systems is reviewed as related work, because the VR/AR technique is applied only as a new concept of security solution.

The research areas about group keys are varied, such as group key agreement, exchange, revocation, and multicast/broadcast, yet this work focuses only on group key application for multi-user settings. In particular, VSDS differs slightly from general group key schemes in that its security requirement is not Backward Secrecy (a joining member cannot know the previous keys and information) but Backward Accessibility (a joining member can know all the previous keys and information). This is because the goal of the application environment is sharing all information with the present group members. That is why VSDS is related to the research area of search schemes for the multi-user setting.

According to [6], Park et al.'s privacy-preserving keyword-based retrieval protocols for dynamic groups [7] are the first work on the multi-user setting in secret search schemes. In [7], Park et al. generate each member's group session key based on a reversed hash key chain, and their scheme also satisfies backward accessibility. Other research based on the reversed hash key chain includes [3] and [8]. [3] proposed two practical approaches, addressing efficiency and group search in the cloud datacenter, and defined the group search secrecy requirements including backward accessibility. [8] suggested a protocol for encrypting designated messages for a designated decryptor, so that a server sees only the corresponding message in the cloud service system, based on onion modification and a reversed hash key chain.

As for multi-user research not based on the reversed hash key chain, there are the following works. [6] proposed common secure indices that let multiple users securely obtain a group's encrypted documents without re-encrypting them, based on keyword fields, dynamic accumulators, Paillier's cryptosystem and blind signatures; the authors formally defined the common secure index for conjunctive keyword-based retrieval over encrypted data (CSI-CKR) and its security requirements. The next year, they proposed another scheme for keyword-field-free conjunctive keyword searches on encrypted data in the dynamic group setting [9], thereby solving the open problem posed by Golle et al. In [10], Kawai et al. showed a flaw in Yamashita and Tanaka's scheme SHSMG and suggested a new concept of Secret Handshake scheme, monotone condition Secret Handshake with Multiple Groups (mc-SHSMG), for members to authenticate each other under a monotone condition. [11] suggested a new, effective fuzzy keyword search in a multi-user system over encrypted cloud data; it supports differential searching privileges based on attribute-based encryption and edit distance, achieving optimized storage and representation overheads.

In this paper, VSDS generates group session keys for each user, each composed of five sub-keys, by using a reversed hash key chain and random numbers. Under the developed encryption/decryption algorithm, the group key requires no re-keying process whenever membership changes happen.

1.2. Application

'PatientsLikeMe' is an online patient network; however, VSDS is not applied to the 'PatientsLikeMe' website directly. The substantial platform for VSDS is the web application 'Trello'. The intention is that the proposed system VSDS, applied to Trello with cryptographic and security techniques, can accomplish the goal and functional roles of PatientsLikeMe. Hence, we need to know both websites.

Trello. Trello is a web-based project management application. The basic service is free, except for the Business Class service. Projects are represented by boards containing lists (corresponding to task lists); lists contain cards, which progress from one list to the next. Users and boards are grouped into organizations. Trello's website can be accessed from most mobile web browsers. Trello supports various kinds of work, such as real estate management, software project management, school bulletin boards, and so on [12].

PatientsLikeMe. This online patient network has the goal of connecting patients with one another, improving their outcomes, and enabling research. PatientsLikeMe started the first ALS (amyotrophic lateral sclerosis) online community in 2006. Thereafter, the company began adding other communities such as organ transplantation, multiple sclerosis (MS), HIV, and Parkinson's disease. Today the website covers more than 2,000 health conditions. The approach is scientific literature review and data-sharing with patients to identify outcome measures, symptoms and treatments through answering questions [13].


Using the web project application 'Trello', the security defense system VSDS is constructed for patients with any disease all over the world, just like 'PatientsLikeMe'. The reasons are: (1) PatientsLikeMe does not deal with all kinds of disease; although the company began adding other communities such as MS and Parkinson's disease, many other patients want to join such a website and be helped more easily. (2) PatientsLikeMe supports only English, so patients in non-English-speaking regions find it very difficult to sign in and use. The system is for group members who want to get help through information-sharing. The information in scope covers health conditions and patient profiles. Most of this sensitive information may be shared, but some secret personal data in the patient profile must not be revealed to anyone. One more important point is that the system is a virtualized SDS using 3D image rendering for next generation computing.

The details are as follows: a board is assigned to one group; a list containing cards is assigned to a user; each member uploads his/her conditions or information to a card, and the information is then shared.

VSDS has three parties: Users, the SM (Security Manager), and the VSDS Server. The SM is a kind of client that is granted the special role of security manager. The SM is assumed to be a TTP (trusted third party) and is located in front of the VSDS server. The SM controls the group key and key-related information, all sensitive information, and all other events, with powerful computational and storage abilities. Fig. 1 shows the system configuration of VSDS. Every user must first register at the SM; thereafter, in every session, they must get through the authentication process before starting any actions. When some information is shared with other patients (that is, when a shared card is generated), the card is encrypted with the group's encryption key. Only legitimate users (who registered at the SM and have stored the information given by the SM for authentication on their devices) can pass the authentication process and learn the shared information. In the last step of log-in, a 3D model image is rendered; this image can be said to be a member indicator.

2.1. Notations

• $K_G$ : the fundamental group key of group G
• $m$ : the number of group G's members; $j$ : session number; $i$ : each member of group G
• $km_i^j$ : the group session key for member $i$ in the $j$-th session
• $K_{i,1}^j, K_{i,2}^j, K_{i,3}^j, K_{i,4}^j, K_{i,5}^j$ : the five subkeys of $i$'s group key $km_i^j$
• $\alpha_i^j$ : the random number of member $i$ in the $j$-th session
• $pd_i^j$ : the pseudonym of member $i$ in the $j$-th session
• $h(\cdot)$ : hash function; $f(\cdot)$ : pseudorandom function
• $C, E$ : encryption function; $D$ : decryption function
• $V_i^q$ : the video image information for member $i$ to render at the $q$-th session; $R_V$ : a rendered image of $V$

2.2.1. Group members' group keys.

We assume that there are $m$ members of the group $G$; the group key of $G$ is $K_G$, and the group keys of each member $i$ are $km_i^j$ $(1 \le i \le m, 1 \le j \le q)$, where $j$ is a session number and $q$ is the last session. Each member $i$'s group key $km_i^j$ consists of five subkeys $K_{i,1}^j, K_{i,2}^j, K_{i,3}^j, K_{i,4}^j, K_{i,5}^j$. We generate a random number $\alpha_i^q$ for these subkeys. Therefore, the last-session group key of user $i$ is $km_i^q$ =

$K_{i,1}^q = h(K_G)\,\alpha_i^q$,
$K_{i,2}^q = h(K_G)\,f_{K_G}(K_G)\,(1 - \alpha_i^q)$,
$K_{i,3}^q = g^{f_{K_G}(K_G)}$,
$K_{i,4}^q = -(h(K_G) + \alpha_i^q)$,
$K_{i,5}^q = f_{K_G}(K_G)\,\alpha_i^q$

We assume the total number of sessions is $q$. For every member $i$, we generate a different random number $\alpha_i^q$ $(1 \le i \le m)$ for the last session. We then apply the hash function to $\alpha_i^q$ repeatedly, $(q-1)$ times, to generate the random numbers of all sessions as follows:

$\alpha_i^q$ (randomly generated)
$h(\alpha_i^q) = \alpha_i^{q-1}$
$h(\alpha_i^{q-1}) = \alpha_i^{q-2} = h^2(\alpha_i^q)$
$h(\alpha_i^{q-2}) = \alpha_i^{q-3} = h^3(\alpha_i^q)$
.........
$h(\alpha_i^4) = \alpha_i^3 = h^{q-3}(\alpha_i^q)$
$h(\alpha_i^3) = \alpha_i^2 = h^{q-2}(\alpha_i^q)$
$h(\alpha_i^2) = \alpha_i^1 = h^{q-1}(\alpha_i^q)$

Therefore, the first session's random number of member $i$ is $\alpha_i^1$, and the $s$-th session's random number of member $i$ is $\alpha_i^s$, with $h(\alpha_i^{s+1}) = \alpha_i^s = h^{q-s}(\alpha_i^q)$. To derive member $i$'s group session keys, $\alpha_i^j$ is changed to $\alpha_i^{j+1}$ $(1 \le j \le q-1)$ in the member's group key. With these different random numbers, we can make all group keys different for each member and each session.

The one-way hash function $h()$ plays an important role in the group information-sharing system VSDS. A one-way hash key chain is generated by randomly selecting the last value, to which the one-way hash function $h()$ is applied repeatedly; the initially selected value is thus the last value of the key chain. A one-way hash chain has two properties: 1. Anyone can deduce an earlier value $k_i$ from a later value $k_j$ of the chain by computing $h^{j-i}(k_j) = k_i$. 2. An attacker cannot find a later value $k_j$ from the latest released value $k_i$, because that would require inverting $h$ in $h^{j-i}(k_j) = k_i$. These two properties make it possible that a leaving member cannot compute new keys after leaving the group, while a newly joining member can obtain all previous keys and information by applying the current key to the hash function $h()$ repeatedly.
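The two chain properties can be sketched concretely. This is an illustrative implementation, not the paper's code: SHA-256 stands in for $h$, and the chain length and variable names are assumptions.

```python
# Sketch of the reversed one-way hash key chain, with SHA-256 standing in
# for h. The chain is generated backwards from a randomly chosen last value
# alpha_q, and keys are released forwards: alpha_1 in session 1, then
# alpha_2, and so on.
import hashlib
import os

def h(value: bytes) -> bytes:
    return hashlib.sha256(value).digest()

def make_chain(q: int, alpha_q: bytes):
    """Return [alpha_1, ..., alpha_q], built by hashing alpha_q (q-1) times."""
    chain = [alpha_q]
    for _ in range(q - 1):
        chain.append(h(chain[-1]))  # alpha_j -> alpha_{j-1}
    chain.reverse()                 # index 0 now holds alpha_1
    return chain

q = 10
chain = make_chain(q, os.urandom(32))

# Property 1 (backward accessibility): from a later value alpha_s, anyone
# can recompute every earlier value by repeated hashing.
s = 7
value = chain[s - 1]                # alpha_s, the current session's value
for _ in range(s - 1):
    value = h(value)                # s-1 hashes reach alpha_1
assert value == chain[0]

# Property 2 (forward secrecy): computing alpha_{s+1} from alpha_s would
# require inverting SHA-256, which is infeasible, so a leaving member
# cannot derive any future session value.
print("chain of", q, "session values verified")
```

A joining member who receives only the current session's value can therefore recover the entire history, while a leaving member is cut off from all future values.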

In this scheme, each member's pseudonyms are generated with the reversed hash key chain in the same way as the group session keys. Thus each member also has $q$ pseudonyms, denoted $pd_i^j$ (for each member $i$, $1 \le j \le q$):

$pd_i^q$ (randomly generated)
$h(pd_i^q) = pd_i^{q-1}$
$h(pd_i^{q-1}) = pd_i^{q-2} = h^2(pd_i^q)$
$h(pd_i^{q-2}) = pd_i^{q-3} = h^3(pd_i^q)$
.........
$h(pd_i^4) = pd_i^3 = h^{q-3}(pd_i^q)$
$h(pd_i^3) = pd_i^2 = h^{q-2}(pd_i^q)$
$h(pd_i^2) = pd_i^1 = h^{q-1}(pd_i^q)$

We assume that the encryption method for a message $M$ with the group key $K_G$ is $C = g^{h(K_G)f(K_G)}M$. For simplicity, we write $K_{i,1}^q, K_{i,2}^q, K_{i,3}^q, K_{i,4}^q, K_{i,5}^q$ as $K_1, K_2, K_3, K_4, K_5$ and $f_{K_G}(K_G)$ as $f(K_G)$. Then the encryption method with each member's group key $km_i^j$, for example in the last session (i.e., $j = q$, $km_i^q$), is as follows:

$C = E_{km_i^q}(M) = K_3^{K_1}\,g^{K_2}\,M = (g^{f(K_G)})^{h(K_G)\alpha_i^q}\,g^{h(K_G)f(K_G)(1-\alpha_i^q)}\,M = g^{h(K_G)f(K_G)}M$.

We can check that the result of encryption with the group key $K_G$ is the same as with each member's group key $km_i^j$, that is, with $K_1, K_2, K_3, K_4, K_5$.

The decryption method with the group key $K_G$ is $D = C \cdot g^{-h(K_G)f(K_G)} = M$. Then the decryption method with each member's group key $km_i^q$ in the last session is:

$D = C \cdot K_3^{K_4}\,g^{K_5} = g^{h(K_G)f(K_G)}M \cdot (g^{f(K_G)})^{-(h(K_G)+\alpha_i^q)} \cdot g^{f(K_G)\alpha_i^q} = g^{h(K_G)f(K_G) - f(K_G)h(K_G) - f(K_G)\alpha_i^q + f(K_G)\alpha_i^q} \cdot M = M$.

We can also check that the result of decryption with the group key $K_G$ is the same as with each member's group key $km_i^j$. Because of these properties of the developed encryption and decryption algorithm, VSDS needs no re-keying process whenever membership changes happen.
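The algebraic identities above can be sanity-checked numerically. The sketch below works in the multiplicative group $\mathbb{Z}_p^*$ under stated assumptions: the prime, generator, and the integers standing in for $h(K_G)$, $f_{K_G}(K_G)$ and $\alpha$ are illustrative choices, not parameters from the paper.

```python
# Numeric check of the five-subkey encryption/decryption identity in the
# multiplicative group Z_p^* (p prime, generator g). h_KG and f_KG stand in
# for h(K_G) and f_{K_G}(K_G); all concrete values are illustrative only.
import secrets

p = 2**127 - 1  # a Mersenne prime; exponents are reduced mod p - 1
g = 3

h_KG = secrets.randbelow(p - 1)   # stand-in for h(K_G)
f_KG = secrets.randbelow(p - 1)   # stand-in for f_{K_G}(K_G)
alpha = secrets.randbelow(p - 1)  # the member's session random number
M = 123456789                     # the message, encoded as a group element

# Member i's five subkeys for one session:
K1 = h_KG * alpha
K2 = h_KG * f_KG * (1 - alpha)
K3 = pow(g, f_KG, p)              # K3 = g^{f(K_G)}
K4 = -(h_KG + alpha)
K5 = f_KG * alpha

# Encryption with the member key: C = K3^{K1} * g^{K2} * M = g^{h f} * M,
# i.e. identical to encrypting with the fundamental group key K_G.
C = pow(K3, K1 % (p - 1), p) * pow(g, K2 % (p - 1), p) * M % p
assert C == pow(g, (h_KG * f_KG) % (p - 1), p) * M % p

# Decryption with the member key: D = C * K3^{K4} * g^{K5} = M,
# since the alpha terms cancel in the exponent.
D = C * pow(K3, K4 % (p - 1), p) * pow(g, K5 % (p - 1), p) % p
assert D == M
print("subkey identity holds; membership changes need no re-keying")
```

Because the identity holds for every $\alpha$, the SM can hand a new member a fresh $(\alpha, K_1, \dots, K_5)$ set without touching the fundamental key or the keys of other members, which is exactly the no-re-keying property claimed above.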


cryption with the secret key k.

At the registration stage, the SM assigns $j$ 3-dimensional real models ($3 \le j \le t$, where $t$ depends on the conditions and policy of the system) to each member $i$ of each group $A$, and the members keep the given $j$ 3D real models for later authentication. Every group has its own particular $j$ models (3D-shaped objects), respectively. $V_{A,i}^s$ denotes the video image information for the 3D model of member $i$ in group $A$ at the $s$-th session. In every session, the SM selects one of the group's 3D models and challenges the group member with $V_{A,i}^s$. The member then renders the 3D real model for $V_{A,i}^s$.

2.5.1. Registration.

As the first process, every user must register at the Security Manager (SM). In this registration stage, pseudonyms, group members' group keys, group session keys, and other information including member indicators are generated for each user to use the system safely.

Each user is then given some information by the SM, stores it on his or her own device such as a smartphone or PC, and keeps the video image information $V_{A,i}^s$ for the $j$ 3D real models. The information given to each member $i$ is: $h(E_{km_i^1}(pd_i^1 \| V))$, $pd_i^1$, $km_i^1$, $\{h(E_{km_i^j}(pd_i^j)), (1 \le j \le q)\}$.

The SM should also store some information for each member: $\alpha_i^q$ and the pseudonym hash key chain values $\{h(pd_i^j), pd_i^j, (1 \le j \le q)\}$.
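The client-side values in the first log-in message can be sketched as follows. The paper does not fix concrete primitives at this level, so SHA-256 playing $h$ and HMAC-SHA256 playing the keyed pseudorandom function $f$ are assumptions; likewise, in the paper the SM decrypts $f_{pd}(km)$ to recover $km$, whereas HMAC is not invertible, so this sketch only illustrates the keyed binding and has the SM recompute rather than decrypt.

```python
# Sketch of member i forming the first-session log-in message
# {1(s), h(pd_i^1), f_{pd_i^1}(km_i^1), h(E_{km_i^1}(pd_i^1 || V_i^1))}.
# Assumptions: SHA-256 plays h; HMAC-SHA256 plays the pseudorandom
# function f keyed with pd_i^1; the ciphertext hash is a value stored
# verbatim at registration. All byte strings are random placeholders.
import hashlib
import hmac
import os

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def f(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

pd_1 = os.urandom(16)            # pseudonym pd_i^1, issued by the SM
km_1 = os.urandom(32)            # serialized group session key km_i^1
stored_ct_hash = os.urandom(32)  # placeholder for h(E_{km_i^1}(pd_i^1 || V_i^1))

login_message = {
    "session": 1,                 # the "1(s)" session marker
    "h_pd": h(pd_1),              # lets the SM look up alpha_i^q and pd_i^1
    "f_pd_km": f(pd_1, km_1),     # key material bound to the pseudonym
    "ct_hash": stored_ct_hash,    # for the SM's hash comparison check
}

# SM side: having looked up pd_i^1 (and recomputed km_i^1 from alpha_i^q),
# it can recompute both values and compare them in constant time.
assert login_message["h_pd"] == h(pd_1)
assert hmac.compare_digest(login_message["f_pd_km"], f(pd_1, km_1))
```

Because $h(pd_i^1)$ is the lookup key, the SM never sees the pseudonym in the clear on the wire, which matches the unlinkability goal stated earlier.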

Fig. 2 shows the whole process of VSDS from Registration.

1. With the stored value pdi1 , km1i , a member i computes f pd 1 (km1i ), h(pdi1 ), then sends

i

the below information in 1 to SM, where h(Ekm1 (pdi1 ||Vi1 )) is also the stored value

i

at registration time. Because km1i is the member i’s group key in the ﬁrst session,

K1 1

Ekm1 (pdi1 ||Vi1 ) means C = Ekm1 (pdi1 ||Vi1 ) = K31 1 gK2 (pdi1 ||Vi1 )

i i

1 1

= (g f (KG ) )h(KG )αi gh(KG ) f (KG )(1−αi ) (pdi1 ||Vi1 ) = gh(KG ) f (KG ) (pdi1 ||Vi1 ). Here, for simplic-

ity, K11 , K21 , K31 are denoted as the member i’s subkeys for its group key km1i in the ﬁrst

session. K41 , K51 are also the subkeys for km1i .

2. After receiving the information from the member i, SM checks 1(s), h(pd_i^1) and finds the corresponding values α_i^q, pd_i^1 in its storage. Then, with pd_i^1, SM decrypts D(f_{pd_i^1}(km_i^1)) and gets km_i^1. SM applies the hash function to the found value α_i^q repeatedly, (q−1) times. If it obtains the result α_i^1, then SM computes km_i^1:
K_1^1 = h(K_G)α_i^1, K_2^1 = h(K_G)f_{K_G}(K_G)(1−α_i^1), K_3^1 = g^{f_{K_G}(K_G)}, K_4^1 = −(h(K_G)+α_i^1), K_5^1 = f_{K_G}(K_G)α_i^1.
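The repeated hashing above is the reversed hash key chain at the heart of VSDS: SM keeps α_i^q and derives the s-th session value as α_i^s = h^{q−s}(α_i^q), so a later value can be checked against an earlier one with a single hash. A minimal sketch follows; the hash function (SHA-256) and the seed value are assumptions, since the paper does not fix them:

```python
import hashlib

def h(x: bytes) -> bytes:
    """One step of the hash chain (SHA-256 assumed)."""
    return hashlib.sha256(x).digest()

def chain_value(seed: bytes, times: int) -> bytes:
    """h^times(seed): apply the hash function repeatedly."""
    v = seed
    for _ in range(times):
        v = h(v)
    return v

q = 10
alpha_q = b"hypothetical alpha_i^q seed"   # stored by SM at registration
alpha_1 = chain_value(alpha_q, q - 1)      # session 1: hashed (q-1) times
alpha_2 = chain_value(alpha_q, q - 2)      # session 2: hashed (q-2) times
# reversed chain: the later session value hashes to the earlier one
assert h(alpha_2) == alpha_1
```

The same relation is what allows the verification h(pd_i^2) = pd_i^1 for the pseudonym chain in step 4.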

216 H.-A. Park / VSDS for Blurred Boundaries of Next Generation Computing Era

[Figure 2 (protocol diagram). Registration is followed by the First Session. In the Log-in Stage, the user computes f_{pd_i^1}(km_i^1), h(pd_i^1) and sends {1(s), h(pd_i^1), f_{pd_i^1}(km_i^1), h(E_{km_i^1}(pd_i^1||V_i^1))} to SM; SM finds α_i^q, pd_i^1, decrypts km_i^1, computes h^{q−1}(α_i^q) = α_i^1 and the subkeys km_i^1 = {K_1^1, K_2^1, K_3^1, K_4^1, K_5^1}, verifies the key and hash values, computes α_i^2, km_i^2, and returns f_{pd_i^1}(km_i^2, pd_i^2), f_{pd_i^2}(pd_i^1||V_i^1); the user decrypts these, verifies h(pd_i^2) = pd_i^1, renders R(V_i^1) at a card, and SM verifies the card against RV_i^1. In the Action Stage, member i uploads C_i^1 = E_{km_i^1}(M) = (K_{i,3}^1)^{K_{i,1}^1} · g^{K_{i,2}^1} · M = g^{h(K_G)f(K_G)} M to the VSDS server, and another member j downloads and decrypts it: D = C_i^1 · (K_{j,3}^1)^{K_{j,4}^1} · g^{K_{j,5}^1} = M. The 2nd Session repeats the same flow with the session-2 values (km_i^2, pd_i^2, V_i^2), and its Action Stage is the same as in the 1st Session.]

Figure 2. The Whole Process of VSDS.


Then, SM verifies whether the decrypted km_i^1 equals the computed km_i^1 or not. Again, SM computes h(E_{km_i^1}(pd_i^1||V_i^1)) with km_i^1 and checks whether it is the same as the received value h(E_{km_i^1}(pd_i^1||V_i^1)) or not. Here, E_{km_i^1}(pd_i^1||V_i^1) has the same meaning as in step 1 above.

3. SM computes α_i^2 by applying the hash function to α_i^q (q−2) times, and then computes km_i^2:
K_1^2 = h(K_G)α_i^2, K_2^2 = h(K_G)f_{K_G}(K_G)(1−α_i^2), K_3^2 = g^{f_{K_G}(K_G)}, K_4^2 = −(h(K_G)+α_i^2), K_5^2 = f_{K_G}(K_G)α_i^2.
SM computes and sends f_{pd_i^1}(km_i^2, pd_i^2), f_{pd_i^2}(pd_i^1||V_i^1). Here, pd_i^2 is the stored value.

4. With the value pd_i^1, the member i decrypts the received value: D(f_{pd_i^1}(km_i^2, pd_i^2)) = km_i^2, pd_i^2. With the obtained values km_i^2, pd_i^2, the group member i computes h(E_{km_i^2}(pd_i^2)) and verifies whether this is the same as the received h(E_{km_i^2}(pd_i^2)). Because km_i^2 is the member i's group key, the encryption method is also the same as in step 1. Then, i hashes the value pd_i^2 and verifies h(pd_i^2) = pd_i^1. If the verifications are successful, km_i^2 and pd_i^2 are accepted as the new group key and pseudonym.

5. With this pd_i^2, the group member i also decrypts D(f_{pd_i^2}(pd_i^1||V_i^1)) = pd_i^1||V_i^1. With the decrypted V_i^1, i renders R(V_i^1) and then uploads the image of R(V_i^1) to a card.
6. SM verifies whether the rendered card image R(V_i^1) is the same as RV_i^1 (the 3D real model) or not. At the first session's verification, the member indicator's authentication is processed. If SM's verification is successful, the member i can begin to act (log-in allowed). The actions are uploading, reading (decryption) and downloading.

[The First Session, Action Stage]
7. A member i encrypts a message M: C_i^1 = E_{km_i^1}(M) = (K_{i,3}^1)^{K_{i,1}^1} · g^{K_{i,2}^1} · M = g^{h(K_G)f(K_G)} M. Then, the member i uploads C_i^1 to his card.
8. Another member j downloads the encrypted message C_i^1 from a VSDS board (server).
9. The member j decrypts C_i^1 with his first group session key: D = C_i^1 · (K_{j,3}^1)^{K_{j,4}^1} · g^{K_{j,5}^1} = g^{h(K_G)f(K_G)} M · (g^{f(K_G)})^{−(h(K_G)+α_j^1)} · g^{f(K_G)α_j^1} = g^{h(K_G)f(K_G) − f(K_G)h(K_G) − f(K_G)α_j^1 + f(K_G)α_j^1} · M = M.
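The exponent cancellation in step 9 can be checked numerically. The toy sketch below works in Z_p* with invented constants standing in for h(K_G), f_{K_G}(K_G) and the α values (all hypothetical; a real deployment needs a proper group and key sizes). It only demonstrates that the subkey algebra, K_3^{K_1}·g^{K_2} on encryption and K_3^{K_4}·g^{K_5} on decryption, cancels so that any member's session subkeys recover M:

```python
p = 2**61 - 1        # a Mersenne prime, used as a toy modulus (assumption)
g = 5                # toy base
n = p - 1            # exponents are reduced mod p-1 (Fermat's little theorem)

h_KG = 123456789     # stands in for h(K_G)
f_KG = 987654321     # stands in for f_{K_G}(K_G)

def subkeys(alpha):
    """Session subkeys K1..K5 for one member, as defined in step 2."""
    K1 = (h_KG * alpha) % n
    K2 = (h_KG * f_KG * (1 - alpha)) % n
    K3 = pow(g, f_KG, p)
    K4 = (-(h_KG + alpha)) % n
    K5 = (f_KG * alpha) % n
    return K1, K2, K3, K4, K5

M = 42424242                                   # toy message, M < p
K1, K2, K3, _, _ = subkeys(1111)               # member i encrypts
C = pow(K3, K1, p) * pow(g, K2, p) * M % p     # C = g^{h(K_G) f(K_G)} * M
_, _, K3j, K4j, K5j = subkeys(2222)            # member j (different alpha) decrypts
D = C * pow(K3j, K4j, p) * pow(g, K5j, p) % p  # the exponents cancel
print(D == M)
```

Note that member j never needs member i's α value: the −f(K_G)α_j and +f(K_G)α_j terms cancel for whatever α_j the decryptor holds, which is exactly why no re-keying is needed.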

[The Second Session]
From the second session on, most processes are similar to the first session. As the session changes, the corresponding pseudonym keys and group session keys also change. As for the video image information V for the 3D real model, a member sends the information V^1 kept from the first session to SM, and then SM challenges the member with the newly selected information V^2 in the third step. Lastly, the member renders the 3D real model R(V^2) at his card. The Action stage is also similar to the first session.
From the third session on, all processes follow the same path as the second session.

3. Discussion

3.1. Efﬁciency

3.1.1. Strength.

In secure group information-sharing communication, 'Group Re-Keying' is an important task whenever a user joins or leaves the group. The group keys need to be updated to maintain forward and backward secrecy [14]. However, in the proposed system VSDS, by the computation of the developed protocol, every authentication result is the same as the authentication result with the fundamental group key. Therefore, VSDS does not need to perform re-keying for membership changes.

3.1.2. Weakness.

In the last step of the first session's authentication (steps 5, 6), the 3-dimensional image V_i^s is rendered. R(V) plays the role of a member indicator, which is decided by SM in advance; its purpose is improving security. If a 3-dimensional image is inefficient in the real world, a 2-dimensional image is recommended.

However, Google's project 'Tango' has recently been showcased with an indoor mapping and VR/AR platform [4]. 'Tango' technology makes it possible for a mobile device to measure the physical world. Tango-enabled devices (smartphones, tablets) are used to capture the dimensions of physical space to create 3-D representations of the real world. 'Tango' gives the Android device platform a new ability for spatial perception. Therefore, it can be said that the proposal of VSDS is timely, keeping abreast of 'Tango''s AR/VR techniques for mobile devices.

3.2. Security

VSDS is a reversed hash key chain based group-key management system. Message confidentiality is one of the most important features in secure information sharing for group members. The group key security requirements are:
1. Group Key Secrecy: it should be computationally infeasible for a passive adversary to discover any secret group key.
2. Forward Secrecy: any passive adversary with a subset of old group keys cannot discover any subsequent (later) group key.
3. Backward Secrecy: any passive adversary with a subset of subsequent group keys cannot discover any preceding (earlier) group key.
4. Key Independence: any passive adversary with any subset of group keys cannot discover any other group key [3, 15].

However, a group-key based information sharing and service system does not follow all of these requirements, because a new joiner to the group should be able to search all of the previous information in order to be helped. Namely, backward secrecy is not eligible as a security requirement of VSDS. The system VSDS satisfies Group Information-sharing Secrecy as follows:

1. Forward Secrecy: For any group G^T and a dishonest participant p ∈ G_j^T, the probability that the participant p can generate a valid group key and pseudonym for the (j+1)-th authentication is negligible when the participant knows the group key km_i^j and pseudonym pd_i^j, where p ∉ G_{j+1}^T and 0 < j < q. It means that all members leaving a group should no longer have access to any of the subsequent information or documents of the group.
2. Backward Accessibility: For any group G^T and a dishonest participant p ∈ G_j^T, the probability that the participant p can generate a valid group key and pseudonym for the (j−l)-th authentication is 1 − η(n)² when the participant knows the group key km_i^j and pseudonym pd_i^j, where p ∉ G_{j−l}^T and 0 < l < j. Namely, all members joining a group can access all of the previous information of the group.
² The term negligible function refers to a function η : N → R such that for any c ∈ N, there exists n_c ∈ N such that η(n) < 1/n^c for all n ≥ n_c [16].


3. Group Key Secrecy: For any group G^T and a dishonest participant p who knows a set of initial knowledge, the group fundamental key K_{G^T}, and one member i's group key km_i^1, the probability that the participant p can correctly guess the encrypted information message M of group G^T at the j-th session is negligible. It must be computationally infeasible for a dishonest participant p to know or correctly guess the contents of the encrypted message, even if a leaving member or another member of the group reveals his group keys.

4. Conclusion

VSDS was proposed for patients all over the world who want to get help and share information on services such as the web site 'PatientsLikeMe'. This system guarantees security and privacy, because most health and private information is sensitive. Therefore, VSDS can safely scale to other groups' project applications. Moreover, it is firmly believed that the identified problems between the next generation's collaborative computing and security, together with the approaches to them, should be managed as Integrated Security Management (ISM).

References

[1] H.A. Park, Secure Chip Based Encrypted Search Protocol in Mobile Office Environments, International Journal of Advanced Computer Research, 6(24), 2016.
[2] Y. Hu, A. Perrig, D.B. Johnson, Efficient security mechanisms for routing protocols, in: Proceedings of the Network and Distributed System Security Symposium (2003), 57-73.
[3] H.A. Park, J.H. Park, and D.H. Lee, PKIS: Practical Keyword Index Search on Cloud Datacenter, EURASIP Journal on Wireless Communications and Networking, 2011(1), 84(2011), 1364-1372.
[4] G. Sterling, Google to showcase Project Tango indoor mapping and VR/AR platform at Google I/O, http://searchengineland.com/google-showcase-project-tango-indoor-mapping-vrar-platform-google-io-249629, 2016.
[5] H.A. Park, J. Zhan, D.H. Lee, PPSQL: Privacy Preserving SQL Queries, in: Proceedings of ISA (2008), Taiwan, 549-554.
[6] P. Wang, H. Wang, and J. Pieprzyk, Common Secure Index for Conjunctive Keyword-Based Retrieval over Encrypted Data, SDM 2007, LNCS 4721 (2007), 108-123.
[7] H.A. Park, J.W. Byun, D.H. Lee, Secure Index Search for Groups, TrustBus 05, LNCS 3592 (2005), 128-140.
[8] H.A. Park, J.H. Park, J.S. Kim, S.B. Lee, J.K. Kim, D.G. Kim, The Protocol for Secure Cloud-Service System, in: Proceedings of NISS (2012), 199-206.
[9] P. Wang, H. Wang, and J. Pieprzyk, Keyword Field-Free Conjunctive Keyword Searches on Encrypted Data and Extension for Dynamic Groups, CANS 2008, LNCS 5339 (2008), 178-195.
[10] Y. Kawai, S. Tanno, T. Kondo, K. Yoneyama, N. Kunihiro, K. Ohta, Extension of Secret Handshake Protocols with Multiple Groups in Monotone Condition, WISA 2008, LNCS 5379 (2009), 160-173.
[11] J. Li, X. Chen, Efficient multi-user keyword search over encrypted data in cloud computing, Computing and Informatics 32 (2013), 723-738.
[12] http://lifehacker.com/how-to-use-trello-to-organize-your-entire-life-1683821040
[13] https://www.patientslikeme.com/
[14] R.V. Rao, K. Selvamani, R. Elakkiya, A secure key transfer protocol for group communication, Advanced Computing: An International Journal, 3 (2012), 83-90.
[15] A. Gawanmeh, S. Tahar, Rank Theorems for Forward Secrecy in Group Key Management Protocols, in: Proceedings of the 21st AINAW (2007), 18-23.
[16] D. Boneh, B. Waters, Conjunctive, Subset, and Range Queries on Encrypted Data, in: Proceedings of the 4th TCC (2007), 535-554.

220 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-220

Implicit Feature Identification in Chinese Reviews Based on Hybrid Rules

Yong WANG1, Ya-Zhi TAO, Xiao-Yi WAN and Hui-Ying CAO

Key Laboratory of Electronic Commerce and Logistics of Chongqing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Abstract. In most existing text-mining schemes for customer reviews, explicit features are usually considered while implicit features are ignored, which probably leads to incomplete or incorrect results. In fact, it is necessary to consider implicit features in customer review mining. Focusing on the identification of implicit features, a novel scheme based on hybrid rules is proposed, which mixes statistical rules, dependency parsing and conditional probability. Explicit product features are first extracted by the FP-tree method and clustered. Then, association pairs are obtained based on the dependency parsing method and the product of frequency and PMI. Finally, implicit features are identified by considering the association pairs and the conditional probability of verbs, nouns and emotional words. The proposed scheme is tested on a public cellphone reviews corpus. The results show that our scheme can effectively find implicit features in customer reviews. Therefore, our research can obtain more accurate and comprehensive results from customer reviews.

Keywords. … extraction, conditional probability

Introduction

Product reviews are important in two ways: on one hand, potential consumers can decide whether to buy a product after reading its reviews; on the other hand, reviews are helpful for manufacturers to improve product design and quality. However, it is impossible for people to read all reviews by themselves, because the amount of reviews is huge. So, review mining has emerged as the times require and has become a significant application field. Feature identification, comprising explicit feature identification and implicit feature identification, is a core step in review mining. If a feature appears in a review directly, it is defined as an explicit feature. Similarly, if a feature does not appear in a review but is implied by other words, it is defined as an implicit feature [1]. A sentence which contains explicit features is defined as an explicit sentence, and a sentence which contains an implicit feature is defined as an implicit sentence. Wang et al. [2] counted the Chinese reviews they crawled and discovered that at least 30 percent of the sentences are implicit sentences. Thus, it can be seen that implicit features play a significant role in review mining.

1 Corresponding Author: Yong WANG, Chongqing University of Posts and Telecommunications, No.2 Chongwen Road, Nan'an District, Chongqing City, China; E-mail: wangyong1@cqupt.edu.cn.

Y. Wang et al. / Implicit Feature Identiﬁcation in Chinese Reviews Based on Hybrid Rules 221

In recent years, some scholars have been studying implicit feature extraction. In most proposals, implicit features are identified on the basis of emotional words. Qiu et al. [3] proposed a novel approach to mine implicit features based on the k-means clustering algorithm and F2 statistics. Hai et al. [4] identified implicit features via co-occurrence association rule (CoAR) mining. Zeng et al. [5] proposed a classification-based method for implicit feature identification. Zhang et al. [6] used an explicit multi-strategy property extraction algorithm and similarity to detect implicit features. What's more, Wang et al. [7] proposed a hybrid association rule mining method to detect implicit features.

To identify implicit features, we propose a novel scheme based on hybrid rules, which consists of three different methods. Compared with previous research results, the presented scheme has two advantages: (1) considering semantic association degree and statistical association degree together, we can get more accurate <feature cluster, emotional word> association pairs; (2) in Chinese reviews, some emotional words, such as "good" or "bad", can modify more than one feature, so it is not accurate to consider only the association between emotional words and features. To solve this problem, the association between verbs, nouns and features is also considered.

1. Scheme Design

Figure 1 depicts the framework of our scheme, which is composed of several parts.

1.1. Explicit feature extraction

In this stage, explicit features are extracted. The detailed steps are as follows:
Word segmentation and POS (part-of-speech) tagging are done for the reviews via ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). Then, nouns and noun phrases from the annotated corpus of comments are stored in a transaction file.
Frequent itemsets obtained by the FP-tree method are regarded as candidate explicit features I0.
Candidate explicit features I1 are obtained after pruning all single words in I0.
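As a rough illustration of these two steps, the sketch below extracts frequent itemsets by brute-force counting (the paper uses the FP-tree method; the transactions, the support threshold, and reading "pruning single words" as dropping size-1 itemsets are assumptions here):

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for FP-growth: count itemsets of size 1 and 2."""
    counts = Counter()
    for nouns in transactions:
        items = sorted(set(nouns))
        for k in (1, 2):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}

# one transaction = the nouns / noun phrases of one review (invented data)
reviews = [["screen", "battery"], ["screen", "price"], ["screen", "battery"]]
I0 = frequent_itemsets(reviews, min_support=2)    # candidate features I0
I1 = {s: c for s, c in I0.items() if len(s) > 1}  # prune single words -> I1
```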


A rule list that contains frequent words which are not features is established. This rule is used to filter I1 to get candidate explicit features I2. The rule covers:
appellation nouns, such as "friend", "classmate", etc.;
colloquial nouns, such as "machine", etc.;
product names, such as "cellphone", "computer", etc.;
abstract nouns, such as "reason", "condition", etc.;
collective nouns, such as "people", etc.

The PMI algorithm [8] is used to measure the value between the product and each feature in I2. The final explicit features are obtained after filtering out the features with a PMI value smaller than a threshold. The PMI value is calculated as follows:

PMI(feature) = log2( hit("product" and "feature") / (hit(product) · hit(feature)) )   (1)

where hit(x) is the number of pages returned by the Baidu search engine when using x as a keyword; the threshold is set as -3.77, which is determined from experimental sample data.
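Eq. (1) together with the threshold filter can be sketched as follows; the hit counts below are invented placeholders for the Baidu page counts described above:

```python
import math

def pmi(hit_joint, hit_product, hit_feature):
    """Eq. (1): PMI from search-engine page counts."""
    return math.log2(hit_joint / (hit_product * hit_feature))

THRESHOLD = -3.77   # experimentally chosen threshold from the paper

# feature -> (hit(product and feature), hit(product), hit(feature)), made up
candidates = {"screen": (400, 1000, 5), "warranty": (2, 1000, 80)}
I_final = [f for f, (j, hp, hf) in candidates.items()
           if pmi(j, hp, hf) >= THRESHOLD]
# "screen" survives (PMI about -3.64); "warranty" is filtered out
```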

The similarity of each pair of features is calculated with Tongyici Cilin [9]. Features are clustered into one group if their similarity value is 1. Once explicit feature clusters are obtained, one feature is chosen as the representative feature of the cluster it belongs to.

1.2. Explicit association pairs <explicit feature cluster, emotional word> extraction

In this stage, feature clusters and emotional words are composed into association pairs from the two aspects of semantics and statistics. The detailed contents are as follows:
Extracting emotional words in explicit sentences: adjectives, whose POS tags are "/a" or "/an", are extracted from explicit sentences as emotional words.
Calculate frequency*PMI between emotional words and explicit feature clusters. The frequency*PMI formula is as follows:

frequency*PMI(f, w) = P_{f&w} · log2( P_{f&w} / (P_f · P_w) )   (2)

where w is the emotional word, f is the feature cluster, P_f is the probability of the feature f's occurrence in explicit sentences, and P_w is the analogous probability for the emotional word w. The formulas for P_f and P_{f&w} are as follows:

P_f = Σ_{i=1}^{n} P_{f_i}   (3)

P_{f&w} = Σ_{i=1}^{n} Co_occurrence(f_i, w) / R   (4)

where n is the number of features in a feature cluster, f_i is the i-th feature in the feature cluster f, Co_occurrence(f_i, w) is the number of co-occurrences of f_i and w in explicit sentences, and R is the number of explicit sentences.
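Eqs. (2)-(4) combine into one scoring routine. In this sketch the co-occurrence counts, the per-feature probabilities and P_w are invented placeholders:

```python
import math

def frequency_pmi(cluster, word, co_occurrence, feat_prob, R, P_w):
    """Eqs. (2)-(4): frequency*PMI between feature cluster f and emotional word w."""
    P_f = sum(feat_prob[fi] for fi in cluster)                          # Eq. (3)
    P_fw = sum(co_occurrence.get((fi, word), 0) for fi in cluster) / R  # Eq. (4)
    if P_fw == 0:
        return float("-inf")   # pairs that never co-occur rank last
    return P_fw * math.log2(P_fw / (P_f * P_w))                         # Eq. (2)

co = {("screen", "clear"): 12, ("display", "clear"): 3}   # co-occurrence counts
probs = {"screen": 0.10, "display": 0.02}                 # P_{f_i} values
score = frequency_pmi(["screen", "display"], "clear", co, probs, R=500, P_w=0.06)
```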

Syntax analysis tools are used to obtain all dependency relationships in the sentences. If an "nsubj" relationship exists between a feature and an emotional word, there is a modification relation between them. If a feature in a feature cluster has a modification relation with an emotional word, we consider that the feature cluster has a modification relation to the emotional word.
Setting a threshold p: the association pairs with a frequency*PMI value larger than p, or with a frequency*PMI value smaller than p but an existing modification relation, are chosen as final association pairs. In this paper, p is set to -0.00009.

1.3. Implicit feature identification

Different from these works, we identify implicit features by considering emotional words, verbs and nouns. The detailed steps are as follows:

Elements of the implicit sentence are analyzed and two judgments are made. The first judgment is whether emotional words are in the implicit sentence. The second judgment is whether verbs or nouns are in the implicit sentence. There are four types in terms of the two judgments: Y1 represents that association pairs are found via emotional words in the implicit sentence, and N1 is the opposite of Y1; Y2 represents that verbs or nouns are in the implicit sentence, and N2 is the opposite of Y2.

Y1N1
Step 1: emotional words in the implicit sentence are extracted, and then the candidate association pairs containing these emotional words are obtained. The feature clusters in the candidate association pairs are treated as candidate feature clusters.
Step 2: verbs and nouns in the implicit sentence are extracted and treated as the notional words set. Then, we calculate each candidate feature cluster's conditional probability under the condition of these words. We define the calculation formula as follows:

P(f | word_j) = Co_occurrence(f, word_j) / count(word_j)   (5)

where word_j is the j-th word in the notional words set and f is a candidate feature cluster. We then define f's average conditional probability as follows:

T(f) = Σ_{j=1}^{v} P(f | word_j) / v   (6)

where v is the number of words in the notional words set.
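Eqs. (5) and (6) can be sketched directly; the co-occurrence and word counts below are invented:

```python
def avg_conditional_probability(cluster, notional_words, co_occurrence, word_count):
    """Eq. (6): average of P(f | word_j) (Eq. (5)) over the notional words set."""
    probs = [co_occurrence.get((cluster, w), 0) / word_count[w]   # Eq. (5)
             for w in notional_words]
    return sum(probs) / len(probs)                                # Eq. (6)

co = {("battery", "charge"): 8, ("battery", "last"): 5}   # Co_occurrence(f, word_j)
counts = {"charge": 20, "last": 25, "use": 40}            # count(word_j)
T = avg_conditional_probability("battery", ["charge", "last", "use"], co, counts)
# T = (8/20 + 5/25 + 0/40) / 3 = 0.2
```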

Step 3: we define the score of each candidate feature cluster as a weighted combination of the above quantities, where α is a weight coefficient set as 0.7 after several experiments. Then the representative feature of the feature cluster that is in the association pair with the highest score is chosen as the implicit feature.

Y1N2
Step 1 is the same as the first step of Y1N1.
Step 2: the representative feature of the feature cluster which is in the candidate association pair with the highest frequency*PMI value is chosen as the implicit feature.

Y2N1
Step 1: verbs and nouns in the implicit sentence are extracted and treated as the notional words set. Then, we use Eqs. (5) and (6) to calculate every explicit feature cluster's average conditional probability under the condition of these words.
Step 2: the representative feature of the feature cluster in the association pairs set with the highest score is chosen as the implicit feature.

Y2N2
The implicit feature cannot be identified.

2. Experiment Evaluation

Six hundred reviews about one kind of cell phone were downloaded from a public website called Datatang.com. In order to evaluate the performance of the scheme, the data set was manually annotated. In the data set, there are 1870 explicit sentences and 413 implicit sentences. Three traditional metrics, precision, recall and F-measure, are used to evaluate the performance of the scheme.
89 explicit product features are obtained by the method described in Section 1.1. The top 5 features most concerned by customers are shown in Table 1. The precision of this method is 70.8%, the recall is 73.3% and the F-measure is 72%. 1285 association pairs are extracted from explicit sentences by the approach described in Section 1.2. Five association pairs are shown in Table 2. As seen from the table, the performance of the approach is good.

Table 1. Top 5 product features results

rank  feature        PMI       frequency
1     intelligence    0.0       14
2     software       -0.10005   42
3     number         -0.44418   30
4     screen         -0.6529    194
5     price          -0.79837   34

Table 2. Association pairs


Implicit features are identified by the approach described in Section 1.3; Table 3 shows a partial result. Comparing with Ref. [4] on the same data, the results are given in Table 4.

Table 3. Partial results of implicit feature identification

Implicit sentence                        implicit feature
900 Ma is difficult to meet the needs    battery
too expensive                            price
very slow and very troublesome           reaction
Very beautiful                           appearance
shape looks like hard                    appearance

Table 4. Comparative results

Evaluation index   our scheme   Ref. [4]
precision          67.49%       41.55%
recall             65.86%       37.53%
F-measure          66.67%       39.44%

It can be seen from the above tables that the proposed algorithm is far superior to the algorithm in [4]. Our scheme can better meet the needs of practical applications. The algorithm proposed in this paper takes both statistical analysis and semantic analysis into account, which can find more associations between emotional words and explicit feature clusters. The research in [4] only focused on mining product features from the statistical point of view. Therefore, our method has more advantages in performance.

3. Conclusion

Implicit features in customer reviews have an important effect on text mining results, and are also an important factor for customers or enterprises making a wise decision. In this paper, we proposed a scheme combining several rules to extract implicit features, from word segmentation to identification. Compared with conventional methods, our scheme not only obtains the association between emotional words and product features based on statistics and semantics, but also considers the effect of emotional words, verbs and nouns on the final results. Experimental results show that our scheme lays a good basis for the application of network reviews.

Acknowledgments

This work was supported by the Natural Science Foundation of CQ CSTC (cstc2015jcyjA40025), the Social Science Planning Foundation of Chongqing (2015SKZ09), and the Social Science Foundation of CQUPT (K2015-10).

References

[1] B. Liu, M. Hu, J. Cheng. Opinion observer: analyzing and comparing opinions on the web. In: Proceedings of the 14th International Conference on World Wide Web (WWW'05), ACM, New York, NY, USA, 2005, 342-351.
[2] H. Xu, F. Zhang, W. Wang. Implicit feature identification in Chinese reviews using explicit topic mining model. Knowledge-Based Systems, 76(2014): 166-175.
[3] Y.F. Qiu, X.F. Ni, L.S. Shao. Research on extracting method of commodities implicit opinion targets. Computer Engineering and Applications, 51(2015): 114-118.
[4] Z. Hai, K. Chang, J.-j. Kim. Implicit feature identification via co-occurrence association rule mining. In: Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science, 6608(2011), 393-404.
[5] L. Zeng, F. Li. A Classification-Based Approach for Implicit Feature Identification. In: Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. Springer Berlin Heidelberg, 2013: 190-202.
[6] L. Zhang, X. Xu. Implicit Feature Identification in Product Reviews. New Technology of Library and Information Service, 2015, (12): 42-47.
[7] W. Wang, H. Xu, and W. Wan. Implicit feature identification via hybrid association rule mining. Expert Systems with Applications, 40(2013): 3518-3531.
[8] K.W. Church, et al. Word association norms, mutual information and lexicography. In: Proceedings of the 27th Annual Conference of the Association for Computational Linguistics, New Brunswick, NJ: Association for Computational Linguistics, 1989: 76-83.
[9] J.L. Tian, W. Zhao. Words Similarity Algorithm Based on Tongyici Cilin in Semantic Web Adaptive Learning System. Journal of Jilin University (Information Science Edition), 28(2011): 602-608.

226 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-226

Characteristics Analysis and Data Mining of Uncertain Influence Based on Power Law

Ke-Ming TANG a, Hao YANG a,b,1, Qin LIU c, Chang-Ke WANG c, and Xin QIU a

a School of Information Engineering, Yancheng Teachers University, Yancheng, Jiangsu, China
b School of Software and TNList, Tsinghua University, Beijing, China
c School of Foreign Language, Yancheng Teachers University, Yancheng, Jiangsu, China

Abstract. Research on traditional cascade events, such as avalanches and the sandpile model, has only examined the power-law distribution over the whole time span. In fact, the speed of virus propagation differs in each time period. In this paper, through empirical observations we find that the number of infected people behaves as a power law for Guinea, Liberia and Sierra Leone, respectively, over different time periods. Thus, a government could use the different power-law exponents of the number of infected people, which characterize the spread of the disease in different periods, to pace the manufacturing of drugs.

Introduction

Cascading events, in which an event undergoes a chain reaction and often gives rise to catastrophes or disasters, have attracted much attention for a long time. Examples include snow avalanches [8], landslide avalanches [9], and failures induced by cascades in power grids [2, 6, 7].

The Ebola epidemic wreaking havoc in West Africa has had a global ripple effect. Given the characteristics of Ebola, the disease has alarmed the global public health community and caused panic among some segments of the population. The ongoing Ebola epidemic in West Africa has affected the United States and other Western countries, and analysis of the propagation characteristics suggests that the phenomenon of "avalanche" could take place all over the world in the absence of effective methods of eradicating Ebola.

Research on traditional cascade events, such as avalanches and the sandpile model, has only examined the power-law distribution over the whole time span. It means that large-scale avalanches occur occasionally in the process of evolution, while various small-scale avalanches appear more and more, and their number satisfies the power-law distribution. In fact, the speed of virus propagation is different in each time

1 Corresponding Author: Hao YANG, School of Information Engineering, Yancheng Teachers University, Yancheng, Jiangsu, China; School of Software and TNList, Tsinghua University, Beijing, China; E-mail: classforyc@163.com.

K.-M. Tang et al. / Characteristics Analysis and Data Mining of Uncertain Inﬂuence 227

period. For Guinea, Liberia and Sierra Leone, we make empirical observations about the spread of the disease in different time periods.

1. The measurement of the Spread of the Disease Based on Power Law Model

We suppose the sand number increases with n. One possibility is that the number of sands per unit length of the cycles, λ, is constant. Put k/2 sands (k = 2πλ·ctg θ is an even number) on the 0-th cycle, and k sands on the 1st cycle. The number of sands on the n-th cycle should then be nk. Likewise, we assume a falling sand makes an inelastic collision with the resting sands, in which case the sands slide together after the collision. We arrange the sliding sands so that (n² − n + 1) sands evenly meet 2n resting sands on the n-th cycle. There are (n² − n + 1)k sands in the n-th generation (b_n = (n² − n + 1)k, d_n = 0), and then N(t) ~ n(t)² ~ t^{2×2} ~ t⁴ [16].

We model the population with susceptible people (S), latent people (L), infected people (I) and dead people (D). The transformation among the four node types is shown in Figure 1.

Now we can study the equations which describe how the virus spreads, on the basis of the related knowledge in this paper. According to the sandpile model's analysis, the equations for the epidemic trend are as follows (with B the power exponent governing the growth of the number of infected people N_I(t) and B′ that of the number of dead people N_D(t); A is a constant related to B on the log-log axes, and A′ is a constant related to B′):

d log N_I(t) / d log t = B,   d log N_D(t) / d log t = B′   (1)

N_I(t) = t^B · 10^A,   N_D(t) = t^{B′} · 10^{A′}   (2)

2. Model Evaluations

We collected data about the number of all cases and the number of people infected from a website (http://www.cdc.gov/vhf/ebola/outbreaks/2014-west-africa/whats-new.html). In our experiments, a linear function was fitted to the linear ranges of the log-log plotted distributions to estimate the value of the power exponent. Figures 2-4 show the distributions of I for Guinea, Liberia and Sierra Leone respectively (with values of the Pearson correlation coefficient R and standard deviation SD). Our method considers the number of infected people from February 4, 2014 to March 25, 2015.
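The fitting procedure can be sketched as follows; the data here is synthetic (an assumed pure power law standing in for the case counts, not the CDC series used in the paper):

```python
import numpy as np

# Synthetic stand-in for a case-count series obeying I(t) = 10^A * t^B
t = np.arange(1.0, 101.0)
A_true, B_true = 0.5, 1.8
I = 10 ** A_true * t ** B_true

# Fit a line to the log-log data: the slope estimates the power exponent B,
# the intercept estimates A (cf. Tables 3-4)
B_fit, A_fit = np.polyfit(np.log10(t), np.log10(I), 1)
R = np.corrcoef(np.log10(t), np.log10(I))[0, 1]  # Pearson R on the log-log scale
```

On real, noisy counts the fit would be restricted to the linear range of the log-log plot, as the paper does for each time window.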

The log-log plots of the numbers of people infected and dead are demonstrated below.

Figure 2. Log-log plot of the number of the people infected for Guinea (panels: (a) 0<t<100, (b) 100<t<229, (c) 229<t<318).

Figure 3. Log-log plot of the number of the people infected for Liberia (panels: (a) 2<t<100, (b) 100<t<267, (c) 267<t<318).

Figure 4. Log-log plot of the number of the people infected for Sierra Leone (panels: (a) 64<t<271, (b) 271<t<318).

Figure 5. Log-log plot of the number of the people dead for Guinea (panels: (a) 0<t<100, (b) 100<t<318).

Figure 6. Log-log plot of the number of the people dead for Liberia (panels: (a) 0<t<100, (b) 100<t<318).

Figure 7. Log-log plot of the number of the people dead for Sierra Leone.

Table 1. Values of R (Pearson correlation coefficient) and SD (standard deviation) for the log-log plots of the number of the people infected for the three countries.

        Guinea                      Liberia                     Sierra Leone
t       0-100    100-229  229-318   2-100    100-267  267-318   64-271   271-318
R       0.96003  0.98086  0.93949   0.88703  0.9855   0.98225   0.97084  0.98431
SD      0.06164  0.06282  0.01599   0.18816  0.10405  0.00326   0.15811  0.00396

Table 2. Values of R (Pearson correlation coefficient) and SD (standard deviation) for the log-log plots of the number of the people dead for the three countries.

        Guinea             Liberia            Sierra Leone
t       0-100    100-318   2-100    100-318   64-318
R       0.95263  0.9909    0.79757  0.94552   0.96776
SD      0.06817  0.03521   0.15197  0.17199   0.15719

Figures 2-4 show the distributions of the numbers of people infected for the three countries, and Figures 5-7 show the distributions of the numbers of people dead. We use R and SD (the Pearson correlation coefficient and standard deviation listed in Tables 1-2) to illustrate the feasibility of our model. If 0.95 is taken as a minimal reliable value of R, we can state a power law for the numbers of infected and dead people.

Through the above analysis, we obtain power-law relations for the numbers of people infected and dead changing over time. The values of A and B are shown in Tables 3-4 respectively. Of course, the correlation between these values and the number of people is not straightforward; we simply report the numerical results.

Table 3. Values of A and B of the people infected for the three countries.

        Guinea                       Liberia                        Sierra Leone
t       0-100    100-229   229-318   2-100     100-267   267-318    64-271    271-318
A       0.36666  3.1281    0.99661   0.82688   5.1708    0.73092    4.14169   1.01616
B       1.3384   -4.46411  0.53758   -0.18821  -8.61879  1.87354    -6.20484  1.336


Table 4. Values of A and B of the people dead for the three countries.

        Guinea             Liberia            Sierra Leone
t       0-100    100-318   2-100    100-318   64-318
A       0.37032  1.92976   0.45961  3.71858   3.59267
B       1.61866  -1.52369  0.4246   -5.38156  -5.30488

3. Conclusion

According to our model, the transmission speed of the virus is slow at the beginning, but it accelerates after a period, which draws people's attention to the virus so that relevant measures can be taken to prevent the spread; the speed then decreases relatively. Our power law model is shown to be reasonable by using a simplified sandpile model and analyzing the empirical data. The data of latent people could not be collected, so we only analyze the data of infected and dead people in the model. As our experiments show, it is a realistic, sensible and useful model, and it can be applied to help eradicate Ebola.

Acknowledgements

This work is supported by the National High Technology Research and Development

Program (863 Program) of China (2015AA01A201), National Science Foundation of

China under Grant No. 61402394, 61379064, 61273106, National Science Foundation of

Jiangsu Province of China under Grant No. BK20140462, Natural Science Foundation of

the Higher Education Institutions of Jiangsu Province of China under Grant No.

14KJB520040, 15KJB520035, China Postdoctoral Science Foundation funded project

under Grant No. 2016M591922, Jiangsu Planned Projects for Postdoctoral Research

Funds under Grant No. 1601162B, JLCBE14008, and sponsored by Qing Lan Project.

References

[1] P. Bak, C. Tang, K. Wiesenfeld, Self-organized criticality, Physical Review A, 38 (1988): 364.

[2] M. L. Sachtjen, B.A. Carreras, V.E. Lynch, Disturbances in a power transmission system, Physical Review

E, 61(2000): 4877.

[3] A. E. Motter, Cascade control and defense in complex networks, Physical Review Letters, 93(2004)

098701.

[4] J. Wang, L.-L. Rong, L. Zhang, Z. Zhang, Attack vulnerability of scale-free networks due to cascading failures, Physica A, 387 (2008): 6671.

[5] S.V. Buldyrev, R. Parshani, G. Paul, et al. Catastrophic cascade of failures in interdependent networks,

Nature, 464(2010): 1025.

[6] R. Parshani, S.V. Buldyrev, S. Havlin, Interdependent networks: reducing the coupling strength leads to a

change from a first to second order percolation transition, Physical Review Letters, 105(2010): 048701.

[7] T. Zhou, B.H. Wang, Maximal planar networks with large clustering coefficient and power-law degree distribution, Chinese Physics Letters, 22 (2005): 1072.

[8] K. Lied, Avalanche studies and model validation in Europe, European research project SATSIE (EU Contract no. EVG1-CT2002-00059), 2006.

[9] D.A. Noever, Himalayan sandpiles, Physical Review E, 47(1993): 724.

232 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-232

Hazardous Chemicals Accident Prediction Based on Accident State Vector Using Multimodal Data

Kang-Wei LIU b,1, Jian-Hua WAN a and Zhong-Zhi HAN a

a School of Geosciences, China University of Petroleum, Qingdao 266580, China

b Sinopec Safety Engineering Institute, Qingdao, Shandong 266071, China

Explosions, fires, leaks and poisoning incidents occur occasionally in the hazardous chemicals industry, so it is particularly important to forecast hazardous chemical accidents and develop appropriate safety measures. In view of the analysis and summary of previous methods, an improved Hazardous Chemicals Accident Prediction method based on the accident state vector is proposed in this paper. It defines the accident state vector using multimodal data such as authoritative data, accident reports, webpages, images, video, speech, etc. The multimodal data is collected by a web crawler built with open-source tools; the crawler is an Internet bot which systematically browses known hazardous chemical accident websites for the purpose of collecting multimodal accident data. As mentioned before, the multimodal data comes in multiple formats. In order to define the accident state vector easily, we divide the multimodal data into three dimensions based on the principle of accident causes: human factors, physical state factors and environmental factors. According to the geometrical distribution characteristics of support vectors, the samples most likely to become support vectors can be selected from the incremental samples to form a boundary vector set by adopting a vector-distance pre-extraction method, on which support vector training and the accident prediction model are built. The validity of the predictive model is ensured because the various factors causing accidents are fully considered by the accident state vector, and the advantages of support vector machines in machine learning on high-dimensional, multi-factor, large-sample datasets are exhibited. Experimental verification on collected hazardous chemicals accidents has shown that the hazardous chemical accident prediction method proposed in this paper can effectively accumulate accident history information, possesses a higher learning speed, and is of positive significance for the safe development of the hazardous chemicals industry.

Keywords. accident prediction, accident state vector

Introduction

The hazardous chemicals industry is a high risk industry, which has some perilous characteristics: high temperature and high pressure, inflammable and explosive materials, poisonous and harmful substances, continuous operation, long industrial chains with wide coverage, etc. At present, the safety production situation of hazardous

1 Corresponding Author: Kang-Wei LIU, Engineer of Sinopec Safety Engineering Institute, No. 339 Songling Road, Qingdao, Shandong, China; E-mail: liukw.qday@sinopec.com.

K.-W. Liu et al. / Hazardous Chemicals Accident Prediction Based on Accident State Vector 233

chemicals is very grim, with all kinds of explosion, fire, leakage and poisoning accidents occurring from time to time. According to statistics, there are more than 96000 chemical enterprises in China, of which more than 24000 are dangerous chemicals production enterprises, and the species of chemicals number more than 100000; more than 4600 chemical accidents have occurred in nearly a decade. As devices become large-scale and intensive, material and economic losses occur whenever an accident happens, and death and disability losses in particular lead to the loss of healthy life. Therefore, it is particularly important to forecast hazardous chemical accidents and develop appropriate safety measures on this basis.

Accident prediction forecasts the security of the forecasting object based on known information and data, as shown in Figure 1. Accident prediction methods have gradually become a hot topic among scholars in recent years, as the change trend and hidden security dangers of accidents can be analyzed through such methods. According to incomplete statistics, there are now more than 300 kinds of forecasting methods, and the development of modern forecasting methods is often accompanied by cross-analysis and mutual penetration of all kinds of forecasting methods, so it is difficult to classify them absolutely. The current common accident prediction methods can be summarized into six types: situational analysis methods, regression prediction methods, time-series prediction methods, Markov chain prediction methods, grey prediction methods and nonlinear prediction methods. The establishment and algorithmic improvement of models tend to be the emphasis of traditional accident prediction methods, while the collection and sorting of prior accident data is frequently overlooked. Limited by the difficulty of prior data collection and the complexity of models, accident prediction models are usually based on a number of factors artificially chosen for their strong causal relationship to hazardous chemical accidents, such as the number of accidents, the death toll and the amount and type of hazardous chemicals, ultimately leading to incomplete and inaccurate accident forecasting results.

Figure 1. Establishment of the accident prediction model (prior data of accidents feeds the accident prediction model).

Support Vector Machine (SVM) was developed by Vapnik and co-workers [1]. It is an excellent machine learning method, and SVMs have empirically been shown to give good generalization performance on a wide variety of problems. SVM is an implementation of statistical learning theory which pursues not only accuracy on the training sample but also low complexity of the learning space: it adopts a compromise between spatial complexity and sample learning precision, so that the resulting models possess good generalization ability on unknown samples.

In view of the analysis and summary of previous methods, an improved Hazardous Chemicals Accident Prediction method based on Support Vector Machines is proposed in this paper. It defines the accident state vector from the three dimensions of human factors, physical state factors and environmental factors, based on the principle of accident causes. According to the geometrical distribution characteristics of support vectors, the samples most likely to become support vectors can be selected from the incremental samples to form a boundary vector set by adopting a vector-distance pre-extraction method, on which support vector training and the accident prediction model are built. The validity of the predictive model is ensured because the various factors causing the accident are fully considered by the accident state vector, and the advantages of support vector machines in machine learning on high-dimensional, multi-factor, large-sample datasets are exhibited.

1. Overview of SVM

The core of SVM is finding a hyperplane that separates a set of positive examples from a set of negative examples with maximum margin [1,2,3]. The training of a Support Vector Machine can be reduced to maximizing a convex quadratic program subject to linear constraints. Given a training sample

{(x_i, y_i) | i = 1, ..., l; x_i ∈ R^n, y_i ∈ {+1, −1}},

for the linearly separable condition, the goal of SVM is to find a hyperplane

⟨w, x⟩ + b = 0

which divides the sample set exactly. But there is always more than one such hyperplane. The hyperplane which has the largest margin to the two kinds of samples, the optimum classification hyperplane, attains the best generalization capacity. The optimum hyperplane is determined only by the samples closest to it and does not depend on the other samples. These samples are the so-called support vectors; this is the origin of the term "support vector" [4,5,6,7].
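As a concrete illustration, the maximum-margin hyperplane can be approximated by subgradient descent on the regularized hinge loss (a minimal sketch; the toy data, learning rate and iteration count are assumptions, not values from the paper):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Primal linear SVM: minimize lam/2*||w||^2 + mean(max(0, 1 - y(<w,x>+b)))
    by subgradient descent. Returns the hyperplane parameters (w, b)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1              # samples inside or beyond the margin
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / len(X)
        grad_b = -y[viol].sum() / len(X)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

# Toy separable sample {(x_i, y_i)} with labels in {+1, -1}
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, 1, 1])
w, b = train_linear_svm(X, y)
# The samples with y*(<w,x>+b) close to 1 are the support vectors
```

The dedicated solvers used later in the paper (LibSVM) solve the dual quadratic program instead, but the resulting hyperplane is likewise determined only by the support vectors.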

2. Accident State Vector

The accident causation theory is used to illustrate the causes of accidents, the development process and the accident consequences, so that the occurrence and development of accident phenomena can be analyzed definitely. It is the accident mechanism and model extracted from the essence of a large number of typical accidents, which reflects the regularity of accidents and provides a scientific and complete theoretical basis for accident prediction and prevention, besides improving safety management work owing to its capacity for quantitative and qualitative analysis of accident causes.

In accordance with the accident causation theory, the insecure elements of human beings, the insecure status of objects and the insecure impact of the environment can all lead to the occurrence of accidents, so an accident can be described by three categories of indicators: subjective evaluation indicators (human factors), objective inherent indicators (physical factors) and environmental indicators (environment factors), as shown in Figure 2.

Figure 2. The three categories of accident indicators: P, D and E.


(1) Subjective evaluation indicator (human factors)

Subjective evaluation indicators are judged and scored regularly by the enterprise. Assuming that the number of subjective evaluation indicators is m, the subjective evaluation indicators can be represented as an m-dimensional vector P (People):

P = {P1, P2, P3, ..., Pm}

Subjective evaluation indicators mainly involve the acquisition of safety indicators that are impossible to quantify or cannot be extracted automatically, such as the questions "Are the training of special equipment operation and maintenance in place?" or "Is the security regulatory behavior in place?", etc. These indicators need to be evaluated and scored regularly by enterprise personnel.

(2) Objective inherent indicator (physical factors)

The objective inherent indicators refer to the enterprise's inherent risk level. Assuming that the number of objective inherent indicators is n, the objective inherent indicators can be expressed as an n-dimensional vector D (Device):

D = {D1, D2, D3, ..., Dn}

Objective inherent indicators can be obtained automatically, for instance "chemicals production", "number of major hazard installations", "fire and explosion indicator of hazardous substances", "chemical material toxicity indicator", etc.

(3) Environmental indicator (environment factors)

Local climate and weather, geographic and geological environment, frequency of natural disasters, government regulation level and social events are usually included in the environmental indicators. In a word, everything not classified into the former two kinds pertains to the environmental indicators, in order to meet the requirement of big data fault tolerance. These indicators correspond to a t-dimensional vector E:

E = {E1, E2, E3, ..., Et}

In conclusion, the accident state vector can be defined as follows:

accident state vector A = {P, D, E}

wherein: human vector P = {P1, P2, P3, ..., Pm}, physical state vector D = {D1, D2, D3, ..., Dn}, environmental vector E = {E1, E2, E3, ..., Et}.
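Concretely, the accident state vector is the concatenation of the three indicator blocks; a minimal sketch with hypothetical toy dimensions (the paper uses m = 185, n = 49, t = 31):

```python
import numpy as np

# Hypothetical indicator scores for illustration only
P = np.array([0.8, 0.6, 0.9])   # subjective evaluation indicators (human)
D = np.array([0.2, 0.5])        # objective inherent indicators (device)
E = np.array([0.1])             # environmental indicators

# Accident state vector A = {P, D, E}: one numeric vector of m + n + t entries
A = np.concatenate([P, D, E])
```

A real accident state vector in the paper's experiment would have 185 + 49 + 31 = 265 entries.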

A prediction hyperplane can be established through the learning of accident state vectors, and an unknown accident state vector can then be forecasted via the hyperplane, thus forming the accident prediction model [9,10]. As mentioned above, not all the vectors contribute to the establishment of the prediction hyperplane; only a small number of the training samples, called support vectors, function when training and learning via SVM, and these are distributed in the neighborhood of the hyperplane in terms of geometric position [11,12]. So we should, as far as possible, choose samples which may become support vectors for learning. Therefore, this paper presents a Support Vector Machine training algorithm based on the accident state vector (the ASV-SVM algorithm).

3. The ASV-SVM Incremental Learning Algorithm

We can describe the incremental learning algorithm with support vector machines as follows. Let M be the historical sample set and N the incremental sample set, and suppose that M and N satisfy M ∩ N = ∅. Let Ω be the initial SVM classifier and SV the corresponding support vector set of M. Obviously SV ⊆ M; the learning target is to find the classifier Ω* and the corresponding support vector set SV* of M ∪ N.

Based on the geometric character of support vectors, determining whether a sample can become a support vector should consider two quantities: one is the distance between the sample and the hyperplane; the other is the distance between the sample and the center of its class [13,14,17]. So we do our best to select the samples likely to become support vectors as the new training set. There may be samples which will become support vectors in both M and N. We select samples which are close to the separating hyperplane, lying between the class center and the hyperplane, as newly-increased samples: samples whose hyperplane distance is less than their center distance form the edge sample set T. We set SV ∪ T as the final training set of the incremental learning.
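The vector-distance pre-extraction step can be sketched as follows (an illustrative reading of the rule above: keep a sample when its distance to the hyperplane is smaller than its distance to its own class center; the function name and data are hypothetical):

```python
import numpy as np

def preextract(X, y, w, b):
    """Return indices of samples likely to become support vectors:
    those closer to the hyperplane <w, x> + b = 0 than to their class center."""
    centers = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    wn = np.linalg.norm(w)
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        d_plane = abs(xi @ w + b) / wn               # distance to separating hyperplane
        d_center = np.linalg.norm(xi - centers[yi])  # distance to own class center
        if d_plane < d_center:
            keep.append(i)
    return keep
```

Only the retained samples, together with the previous support vector set, are passed to the next round of SVM training, which is what shrinks the training set in the incremental rounds.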

4. Experimental Results

We apply this algorithm to the establishment of the model for the prediction of hazardous chemicals accidents, and compare the ASV-SVM algorithm with the traditional SVM learning algorithm and the KNN (k-Nearest Neighbor) algorithm. A simple description of the three algorithms is as follows:

Classical SVM algorithm: the traditional SVM algorithm. It combines the original samples and the newly-increased samples, and does the learning again over all of the training samples.

Classical KNN algorithm: KNN is a memory-based method. Prediction on a new instance is performed using the labels of similar instances of the training set.

ASV-SVM algorithm: the ASV-SVM algorithm, which selects support vector candidates based on vector distance for incremental learning.

In this experiment, the accident state vector is defined by multimodal data. The method is as follows:

(1) Collect and maintain the data of 619 typical hazardous chemicals accidents occurring within the last ten years, including accident reports, accident cause analyses, accident consequences and influence.

(2) Crawl related data of the mentioned accidents using a web crawler built with open-source tools. The web crawler is an Internet bot which systematically browses known hazardous chemical accident websites for the purpose of collecting multimodal accident data, such as the weather conditions, geographical situation, population density, etc. when the accident happened.

(3) For comparative testing, we collected two to three sets of non-accident status data at other times at each place where an accident occurred; 1288 non-accident state data were formed in this way.

(4) The data collected above is in multiple formats, such as authoritative data, accident reports, webpages, images, video, speech, etc. In order to define the accident state vector easily, we divide the multimodal data into three dimensions based on the principle of accident causes. We use open-source big data tools, with manual screening, to structure the data: we add as many attribute labels as possible to each data item, so that the non-structural data become structured data. Frankly, for unstructured data such as video and images, most of this is done by human recognition, since the accuracy and availability of automatic machine recognition is not yet satisfactory.

(5) All the multimodal data will have many attribute labels after the structuring process. We categorize these attribute labels into the three dimensions based on the principle of accident causes: human factors, physical state factors and environmental factors.

(6) Determine the dimension of the accident state vector as 265 dimensions, each attribute label representing one dimension: the human vector P (185 dimensions), divided into leadership and safety culture, safety, process safety information for process control, inspection and human performance; the state vector D (49 dimensions), divided into the fire index of hazardous substances, explosive index, toxicity index, process index, equipment index, safety facility index, etc.; and the environment vector E (31 dimensions), comprising the meteorological index (We) and the geographic information index (Gi).

(7) Transfer the accident state and the non-accident state into vector form as follows [15]:

<label> <index1>:<value1> <index2>:<value2> ... <indexn>:<valuen>

Label is the result of the accident state: 1 is the accident state, -1 is the non-accident state. Index is the attribute label. Value is the weight or description of the attribute label. And n is equal to 265.

(8) From these, 1000 vectors are selected as the test set, 1000 vectors are used as the initial training set, and the remaining 907 vectors are randomly divided into 3 sets used as incremental sets.
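The sparse format of step (7) is the standard LibSVM input format; a minimal writer (the helper name is ours, not from the paper):

```python
def to_libsvm_line(label, vector):
    """Render one state vector as '<label> <index>:<value> ...' with 1-based
    indices; zero-valued attributes are omitted, as the sparse format allows."""
    pairs = [f"{i + 1}:{v}" for i, v in enumerate(vector) if v != 0]
    return " ".join([str(label)] + pairs)

# e.g. an accident-state (label 1) toy vector
line = to_libsvm_line(1, [0.5, 0, 2])  # -> "1 1:0.5 3:2"
```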

After the pretreatment, the accident information is transferred to the form of vectors. Then we run the three algorithms for learning. All of the algorithms are carried out with the LibSVM-mat-5.20 package [16]. The platform of the experiment is an E7-4830 v2, and the operating system is Windows Server 2012. In the experiment, the kernel is the RBF function, with C = 1. The results of the experiment are shown in Tables 1-3.

Table 1. Classical SVM algorithm experiment results (columns: incremental set number, sample numbers, time/s, precision/%).

Table 2. Classical KNN algorithm experiment results (columns: test set number, sample numbers, time/s, precision/%).

Table 3. ASV-SVM algorithm experiment results (columns: incremental set number, sample numbers, time/s, precision/%).

We can see that only in the initial training is the ASV-SVM algorithm similar to the traditional SVM algorithm and the KNN algorithm; in the other incremental learning processes its performance is better than the other classical algorithms. The number of training samples is reduced and the training time is shortened while the accuracy rate is not lost, as shown in Figure 3 and Figure 4.

The ASV-SVM algorithm is superior to the traditional SVM learning algorithm and the KNN algorithm in the number of training samples. In the process of incremental learning, the ASV-SVM algorithm screens the newly-increased and original samples effectively, thus reducing the number of training samples on the premise of preserving the effective information of the samples. The training time of the ASV-SVM algorithm decreases greatly in contrast to the classical SVM and KNN learning algorithms; the decreasing number of training samples can well control the scale of the incremental learning, thus shortening the training time on the premise of not losing useful information. The precision of the ASV-SVM algorithm is a little better than that of the SVM learning algorithm, and much better than that of the KNN algorithm, as shown in Figure 5.


5. Conclusion

The hazardous chemicals industry is a high risk industry; explosion, fire, leakage and poisoning accidents occur frequently. This paper analyzes the influence on the occurrence of hazardous chemicals accidents from human factors, physical factors and environmental factors, and defines the accident state vector from these three dimensions. In view of the analysis and summary of previous methods, an improved Hazardous Chemicals Accident Prediction method based on the Accident State Vector (ASV-SVM) is proposed. The high-dimensional vector is used to define the accident state, so that as many causal factors as possible are considered. Using the improved support vector machine learning algorithm (the ASV-SVM algorithm), an accident prediction model is established from accident state vectors. A sample test on hazardous chemical accidents shows that the method proposed in this paper can differentiate accident states accurately and efficiently, and has positive significance for the accident prediction of hazardous chemicals.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant

No. 31201133).

References

[1] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, (2000), 332-350.

[2] N. Cristianini, J. Shawe-Talor. An Introduction to Support Vector Machines and Other Kernel-based

Learning Methods. Cambridge University Press, (2004), 543-566

[3] R. Xiao, J.C. Wang, Z.X. Sun, An Incremental SVM Learning Algorithm, Journal of Nanjing University (Natural Sciences), 38 (2002), 152-157.

[4] N. Ahmed, S. Huan, K. Liu, K. Sung, Incremental learning with support vector machines, The International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers, (1999), 352-356.

[5] P. Mitra, C. A. Murthy, S. K. Pal, Data Condensation in Large Databases by Incremental Learning with

Support Vector Machines. Proceeding of International Conference on Pattern Recognition, (2000), 2708-

2711.

240 K.-W. Liu et al. / Hazardous Chemicals Accident Prediction Based on Accident State Vector

[6] C. Domeniconi and D. Gunopulos Incremental Support Vector Machine Construction. Proceeding of

IEEE International Conference on Data Mining series (ICDM ), (2001),589-592.

[7] G. Cauwenberghs , T. Poggio, Incremental and Decremental Support Vector Machine Learning. Ad-

vances in Neural Information Processing Systems,(2000),122-127.

[8] S. Katagiri , S. Abe, Selecting Support Vector Candidates for Incremental Training. Proceeding of IEEE

International Conference on Systems, Man, and Cybernetics (SMC), (2005),1258-1263,.

[9] D. M. J. Tax, R. P. W. Duin, Outliers and Data Descriptions. Proceeding of Seventh Annual Conference

of the Advanced School for Computing and Imaging, (2001),234-241.

[10] L.M. Manevitz and M. Yousef, One-class SVMs for document classification. Journal of Machine Learn-

ing Research, 2 (2001), 139-154.

[11] R. Debnath, H. Takahashi, An improved working set selection method for SVM decomposition method.

Proceeding of IEEE International Conference Intelligence Systems, Varna, Bulgaria, 21-24(2004), 520-

523.

[12] R. Debnath, M. Muramatsu, H.Takahashi, An Efficient Support Vector Machine Learning Method with

Second-Order Cone Programming for Large-Scale Problems. Applied Intelligence, 23(2005), 219-239.

[13] W D.Zhou, L.Zhang, L.C.Jiao, An Analysis of SVMs Generalization Performance. Acta Electronica

Sinica. 29(2001),590-594

[14] J. Heaton, Net-Robot Java programme guide. Publishing House of Electronics Industry. 22(2002) 1-

141.

[15] C.W. Hsu, C.J. Lin, A simple decomposition method for support vector machines, Machine Learning, 46 (2002), 291-314.

[16] C.C. Chang , C. Lin, LIBSVM : a library for support vector machines, 2001. Software available at

http://www.csie.ntu.edu.tw/~cjlin/libsvm

[17] C.H. Li, K.W. Liu, H.X. Wang. The incremental learning algorithm with support vector machine based

on hyperplane distance, Applied Intelligence, 46(2009):145-152

Fuzzy Systems and Data Mining II 241

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-241

Regularized Level Set for Inhomogeneity Segmentation

Guo-Qi Liu a,1 , Hai-Feng Li b

a

School of Computer and Information Engineering, Henan Normal University,

Xinxiang 453007, China

b

College of Mathematics and Information Science, Henan Normal University,

Xinxiang 453007, China

Abstract. The level set method is a popular method for image segmentation. However, intensity inhomogeneity universally exists in images and greatly influences segmentation. The local binary fitting (LBF) model is an effective method to cope with inhomogeneous intensity, but the energy functional of LBF is non-convex and its computational cost is large; moreover, LBF cannot preserve weak edges, and the non-convexity causes the contour to suffer from local minima. In order to cope with these shortcomings, we introduce a regularized minimization for an improved LBF model, in which the edge information is integrated into the energy functional. The energy functional of the improved LBF model is convex, so local minima are avoided; furthermore, fast optimization methods can be utilized. In this paper, the regularized method is utilized to make the contour converge to the minimum. Experimental results confirm that the proposed method attains a segmentation effect similar to LBF but costs less computation time.

Keywords. intensity inhomogeneity, level set, global minimization, computation times

Introduction

Image segmentation is a fundamental problem in computer vision. The level set method [1-4] is a popular algorithm with competitive advantages in computational robustness and flexibility of topology changes. In general, there are two types of level set models: one is the level set method based on global information, and the other is based on local information. Among the models based on global information, Chan and Vese (C-V) [5] is one of the most popular methods; it assumes the foreground and background have obviously different intensity means.

1 Corresponding Author: Guoqi Liu; School of Computer and Information Engineering, Henan

242 G.-Q. Liu and H.-F. Li / Regularized Level Set for Inhomogeneity Segmentation

Intensity inhomogeneity is also called intensity non-uniformity. Intensity inhomogeneity [6-11] degrades partition effectiveness and leads to inaccurate target location, and it often appears in medical images. Therefore, intensity inhomogeneity segmentation methods based on level sets or active contours have sprung up in the past years. Among the models based on local information, Li derived the local binary fitting (LBF) [6] model. By incorporating local image information into the model, images with intensity inhomogeneity can be effectively segmented. However, LBF cannot keep weak edges and its computational cost is relatively large. Some researchers have also proposed similar methods to improve performance in dealing with inhomogeneous intensity, such as Zhang [10]. Generally, the models based on local information have better performance in dealing with inhomogeneous intensity because the local intensity inhomogeneity can be decreased by local filtering. In order to improve segmentation efficiency and keep the true target contour, we extend the LBF model. Our paper is organized as follows. In Section 1, we review the background. In Section 2, a method is proposed to enhance the former LBF version. Section 3 shows the experimental results and makes comparisons with LBF. Section 4 summarizes this paper.

1. Background

The C-V model [5] partitions an image into two regions $\Omega_1$ and $\Omega_2$, whose intensity means are $c_1$ and $c_2$, by minimizing the following energy:

E(C, c_1, c_2) = \mu \int_C ds + \int_{\Omega_1} |I - c_1|^2 \,dx + \int_{\Omega_2} |I - c_2|^2 \,dx \quad (1)

Here C represents the zero level curve and I is

image intensity. Moreover, the ﬁrst term is the curve length to regularize with a

weight μ and the last two terms are data ﬁtting terms. The Eq. (1) depends on

curve C and intensity means c1 and c2 , which can be solved by variation method

and gradient descent equation. Therefore, by representing contour with level set

φ, the above equation is computed as follows:

\frac{\partial \phi}{\partial t} = -\left((I - c_1)^2 - (I - c_2)^2 + \mu K\right)\delta_{\varepsilon}(\phi) \quad (2)

where K is the curvature and \delta_{\varepsilon} is the regularized Dirac function with parameter \varepsilon.
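As a rough illustration, one C-V evolution step of Eq. (2) can be sketched in NumPy. This is a simplified sketch, not the paper's implementation: the curvature K is approximated by the Laplacian of φ, δ is the standard regularized Dirac, and `mu`, `eps`, `dt` are illustrative parameters.

```python
import numpy as np

def cv_step(phi, img, mu=0.5, eps=1.0, dt=0.1):
    """One gradient-descent step of the C-V evolution, Eq. (2).

    phi : level set function; img : grayscale image of the same shape.
    """
    inside = phi > 0
    c1 = img[inside].mean() if inside.any() else 0.0     # mean inside the curve
    c2 = img[~inside].mean() if (~inside).any() else 0.0  # mean outside
    delta = (eps / np.pi) / (eps**2 + phi**2)             # regularized Dirac
    # curvature term approximated by the Laplacian of phi for brevity
    K = (np.gradient(np.gradient(phi, axis=0), axis=0)
         + np.gradient(np.gradient(phi, axis=1), axis=1))
    force = -((img - c1)**2 - (img - c2)**2 + mu * K) * delta
    return phi + dt * force
```

In practice this step is repeated until the zero level set of φ stabilizes.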

A data fitting energy is defined in LBF [6], which locally approximates the image intensities on the two sides of the contour. This energy is then incorporated into a variational level set formulation, and a curve evolution equation is derived to guide the motion of the contour, which thereby enables the LBF model to cope with intensity inhomogeneity. The local binary fitting energy is defined as follows:

e = \int \sum_{i=1}^{2} \lambda_i\, e_i(x)\,dx \quad (3)

e_1(x) = \int_{\Omega} K_\sigma(y-x)\,|I(y)-f_1(x)|^2\, H(\phi(y))\,dy, \qquad e_2(x) = \int_{\Omega} K_\sigma(y-x)\,|I(y)-f_2(x)|^2 \bigl(1-H(\phi(y))\bigr)\,dy \quad (4)

K_\sigma serves as a kernel function. f_1(x) and f_2(x) are computed as follows:

f_1(x) = \frac{K_\sigma(x) * [H(\phi)\,I(x)]}{K_\sigma(x) * H(\phi)}, \qquad f_2(x) = \frac{K_\sigma(x) * [(1-H(\phi))\,I(x)]}{K_\sigma(x) * (1-H(\phi))} \quad (5)
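The fitting functions of Eq. (5) are ratios of Gaussian-smoothed quantities. A sketch follows; `gauss_blur` is a minimal separable Gaussian filter standing in for convolution with K_σ, and `H` is any Heaviside-like indicator of the current region (both names are ours, not the paper's).

```python
import numpy as np

def gauss_blur(a, sigma):
    """Minimal separable Gaussian filter (stand-in for K_sigma * a)."""
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, a)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, out)

def lbf_fitting(img, H, sigma=2.0, eps=1e-8):
    """Local fitting functions f1, f2 of Eq. (5): smoothed region means."""
    f1 = gauss_blur(H * img, sigma) / (gauss_blur(H, sigma) + eps)
    f2 = gauss_blur((1.0 - H) * img, sigma) / (gauss_blur(1.0 - H, sigma) + eps)
    return f1, f2
```

The small `eps` guards against division by zero far from the region of interest.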

In order to keep the contour smooth, it is necessary to add a length regularization term to the above energy. The length regularization term is defined as follows:

L(\phi) = \int_C ds = \int |\nabla\phi|\,dx \quad (6)

Minimizing the resulting energy by gradient descent yields the evolution equation

\frac{\partial \phi}{\partial t} = -\frac{\partial E}{\partial \phi} = -\delta_{\varepsilon}(\phi)(e_1 - e_2) + \lambda\,\delta_{\varepsilon}(\phi)\,\mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) \quad (7)

e(\phi) = \int e_1 H(\phi)\,dx + \int e_2 \bigl(1-H(\phi)\bigr)\,dx \quad (9)

Similar to [12], the evolution equation of LBF is also computed by minimizing the following energy functional:

E = \lambda \int |\nabla\phi|\,dx + \int (e_2 - e_1)\,\phi\,dx \quad (10)


To preserve edges during evolution, we propose the following improved energy functional:

E = \lambda \int g\,|\nabla u|\,dx + \int (e_2 - e_1)\,u\,dx \quad (11)

where g = \frac{1}{1+|\nabla I|^2} is the edge stopping function and u is the characteristic function with 0 \le u \le 1. Since the above energy is convex but poses a constrained minimization problem, an unconstrained convex energy is obtained by introducing an exact penalty function:

E(u, f_1, f_2, \lambda, \alpha) = \lambda\,TV_g(u) + \int_{\Omega} (e_2 - e_1)\,u + \alpha\,p_f(u)\,dx \quad (12)

where p_f(u) is the penalty function.

In order to obtain the solution of the energy functional (12), the regularized method is utilized in this paper. By introducing an auxiliary variable v, the regularized energy functional is computed as follows:

E(u, v, f_1, f_2, \lambda, \alpha) = \lambda\,TV_g(u) + \frac{\mu}{2}\|u - v\|_F^2 + \int_{\Omega} (e_2 - e_1)\,v + \alpha\,p_f(v)\,dx \quad (13)

Two alternating steps are used to compute Eq. (13). The first step obtains the iterative solution of u by fixing v, f_1 and f_2. A fast numerical minimization based on the dual formulation of the TV energy is presented in [12-15]. According to [13], the solution of u is given by

u = v - \frac{1}{\mu}\,\mathrm{div}\,p \quad (14)

where the dual variable p satisfies

g(x)\,\nabla\!\left(\frac{1}{\mu}\,\mathrm{div}\,p - v\right) - \left|\nabla\!\left(\frac{1}{\mu}\,\mathrm{div}\,p - v\right)\right| p = 0. \quad (15)

The above equation can be solved by a ﬁxed point method, which is given in [13].

Similarly, v is obtained by minimizing the following energy:

v = \arg\min_v\; \frac{\mu}{2}\|u - v\|_F^2 + \int_{\Omega} (e_2 - e_1)\,v + \alpha\,p_f(v)\,dx \quad (16)

whose closed-form solution is the clamped gradient step

v = \min\!\left(\max\!\left(u - \frac{1}{\mu}(e_2 - e_1),\; 0\right),\; 1\right) \quad (17)
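Eq. (17) is a pointwise clamp, so the v-update is one line of array code. A sketch (the 1/μ step size follows the minimization of Eq. (16)):

```python
import numpy as np

def update_v(u, e1, e2, mu=1.0):
    """Closed-form minimizer of Eq. (16): gradient step, then clamp to [0, 1]."""
    return np.clip(u - (e2 - e1) / mu, 0.0, 1.0)
```

Where e2 > e1 (the pixel fits the outside region better) v is pushed down toward 0, and vice versa.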


Algorithm 1. The proposed method.

Input: initial values u^0, v^0, inner iteration number B, and dual variable p.
for k = 0 to maximum number of iterations do
    for i = 1 to B do
        u_i^k = v^k - (1/μ) div p
        obtain p from g(x)∇((1/μ) div p - v) - |∇((1/μ) div p - v)| p = 0
    end for
    u^k = conv(Gaussian, u_B^k)
    f_1(x) = [K_σ(x) * (I(x)u)] / [K_σ(x) * u],  f_2(x) = [K_σ(x) * (I(x)(1-u))] / [K_σ(x) * (1-u)]
    update v by Eq. (17)
    if ||u^{k+1} - u^k||_F < δ then
        return u^k
    end if
end for
Output: the segmented contour, taken as the level set u = 0.5.

Figure 1 demonstrates three test images and their initialization curves. The left image is 131 × 103 pixels, the middle image is 110 × 111, and the right image is 96 × 127.


Table 1. Computation times of the compared methods on the three test images.

Method            Image 1    Image 2    Image 3
LBF               1.928899   2.276515   4.688177
LIF               1.256878   1.753792   3.798538
Proposed method   0.898745   1.356784   2.897653

Segmentation results are provided in Figures 2 and 3, in which the first column is produced by Li's algorithm and the second column is computed by our method. In Figure 2, medical images are tested. The results of LBF and the proposed method are similar.

Furthermore, the proposed method utilizes the gradient information of image edges. Thus, it preserves weak edges better than the LBF model. As shown in Figure 3, a gray image is tested in which some of the object's edges are weak. LBF suffers from weak-edge leakage and extracts only the strong edges, while the proposed method converges to the weak edges, since the edge information g integrated into the energy functional preserves them.

On the other hand, the proposed method is more efficient than LBF and LIF. All the experiments are conducted using MATLAB R2010a on a PC with an Intel Core CPU (3.3 GHz, 4 cores) and 8 GB memory under Windows 7 Professional, without any particular code optimisation. In Algorithm 1, the proposed method iterates on an image decomposition to obtain u, which decreases the number of evolution steps. The computational times are shown in Table 1. From the table, the proposed method takes less time to converge to the objects than LBF, because it iterates several times to obtain u before updating v, and this process enhances the non-smooth component of the image, which reduces the total number of iterations.

4. Conclusions

In this paper, we first introduce the C-V model and the LBF model. Then, we propose our model to improve the efficiency of contour evolution. Our work makes two contributions: an energy functional with edge information is added to LBF, and a fast algorithm is introduced to obtain the solution. Experimental results confirm that the proposed method obtains similar segmentation results while keeping weak edges, and it achieves faster contour evolution.


Acknowledgements

(No. U1404603).

References

[1] V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours, International journal

of computer vision, 22(1997), 61-79.

[2] S. Kichenassamy, A. Kumar, P. Olver, A. Tannenbaum, and A. Yezzi: Gradient ﬂows and

geometric active contour models, Proc. 5th Int. Conf. Comput. Vis., 1995, 810-815.

[3] R. Kimmel, A. Amir, and A. Bruckstein. Finding shortest paths on surfaces using level set

propagation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(1995),

635-640.

[4] R. Malladi, J. A. Sethian, and B. C.Vemuri. Shape modeling with front propagation: A

level set approach, IEEE Transactions on Pattern Analysis and Machine Intelligence,

17(1995), 158-175.

[5] T. Chan and L. Vese. Active contours without edges, IEEE Transactions on Image Processing, 10(2) (2001), 266-277.

[6] C. Li, C. Kao, J. C. Gore, and Z. Ding. Minimization of region-scalable ﬁtting energy for

image segmentation, IEEE Transactions on Image Processing, 17(2008), 1940-1949.

[7] C. Li, Huang R., Ding Z., Gatenby C., Metaxas DN., Gore JC. A level set method for

image segmentation in the presence of intensity inhomogeneities with application to MRI,

IEEE Transactions on Image Processing, 20(7) (2011), 2007-2016.

[8] X.F. Wang, H. Min. A level set based segmentation method for images with intensity in-

homogeneity, Emerging Intelligent Computing Technology and Applications, with Aspects

of Artiﬁcial Intelligence, 2009, 670-679.

[9] F.F. Dong, Z.S. Chen and J.W. Wang, A new level set method for inhomogeneous image

segmentation, Image and Vision Computing, 31(2013), 809-822.

[10] K.H. Zhang, H.H. Song and L. Zhang, Active contours driven by local image fitting energy, Pattern Recognition, 43(2010), 1199-1206.

[11] C. Li, C. Xu, C. Gui, MD. Fox, Level set evolution without re-initialization: A new vari-

ational formulation, Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005,

430-436.

[12] A. Chambolle, An algorithm for total variation minimization and applications, Journal of

Mathematical Imaging and Vision, 20(2004), 89-97.

[13] X. Bresson, S. Esedoglu, P. Vandergheynst, et al. Fast global minimization of the active

contour /snake model, Journal of Mathematical Imaging and Vision, 28(2007), 151-167.

[14] E.S. Brown, T.F. Chan, X. Bresson. Completely convex formulation of the Chan-Vese

image segmentation model, International journal of computer vision, 98(2012), 103-121.

[15] C. Li, R. Huang, Z. Ding, C. Gatenby, D. Metaxas, J. Gore, A variational level set approach to segmentation and bias correction of images with intensity inhomogeneity, Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2008, Part II, LNCS 5242, 1083-1091.

248 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-248

Exploring the Non-Trivial Knowledge Implicit in Test Instance to Fully Represent Unrestricted Bayesian Classifier

Mei-Hui LI, Li-Min WANG 1

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of

Education, Jilin University, ChangChun City 130012, P. R. China

Abstract. Restricted Bayesian classifiers can achieve competitive classification performance for data mining. However, the restricted network structure makes it impossible to represent the Markov blanket of the class variable, which corresponds to the optimal classifier. Moreover, the test instances are not fully utilized, so the final decision may be biased. In this paper, a novel unrestricted k-dependence classifier

is proposed based on identifying the Markov blanket of the class variable. Further-

more, the algorithm also adopts local learning to build local structure, which can

represent the evidence introduced by test instance. 15 datasets are selected from the

UCI machine learning repository for zero-one loss comparison. The experimental

results indicate that the unrestricted Bayesian classiﬁer can achieve good trade-off

between structure complexity and prediction performance.

Keywords. Data mining, Unrestricted Bayesian classiﬁer, Local learning, Markov

blanket

Introduction

In the late 1980s, Judea Pearl introduced the Bayesian network [1], a kind of inference network that reasons under probabilistic uncertainty. A particularly restricted model, Naive

Bayes (NB), is a powerful classiﬁcation technique. Many restricted Bayesian classiﬁers

[2] have been set out to extend the dependence of NB, such as Tree-augmented Naive

Bayes (TAN) [3] and k-dependence Bayesian classiﬁer (KDB) [4].

Madden [2] ﬁnds that unrestricted Bayesian classiﬁers [5] learned using likelihood-

based scores are comparable to TAN. In this paper, a novel unrestricted k-dependence

Bayesian classiﬁer (UKDB) is proposed to build from the perspective of Markov blanket.

Local mutual information and conditional local mutual information are applied to build

the local graph structure UKDBL for each test instance. UKDBL can be considered a

complementary part of UKDBG , which is learned from training set.

1 CorrespondingAuthor: LiMin Wang, Key Laboratory of Symbolic Computation and Knowledge

Engineering of Ministry of Education, Jilin University, ChangChun City 130012, P. R. China; E-mail:

wanglim@jlu.edu.cn.

M.-H. Li and L.-M. Wang / Exploring the Non-Trivial Knowledge Implicit in Test Instance 249

The rest of the paper is organized as follows. Section 1 brieﬂy introduces information

theory and Markov blanket. Section 2 introduces related Bayesian classiﬁers. Section

3 presents the learning procedure of UKDB and basic idea of local learning. Section 4

provides the experimental results and comparisons. Section 5 concludes the ﬁndings.

In the 1940s, Claude E. Shannon introduced information theory, the theoretical basis of

modern digital communication. Many commonly used measures are based on the infor-

mation theory and used in a variety of classiﬁcation algorithms.

The mutual information (MI) [6] I(X;Y) measures the reduction of uncertainty about variable X when the value of variable Y is known. Conditional mutual information (CMI) [6] I(X;Y|Z) measures the mutual dependence between X and Y given Z. Local mutual information (LMI) I(X;y) measures the reduction of uncertainty about variable X after observing that Y = y. Conditional local mutual information (CLMI) I(x;y|Z) measures the mutual dependence between two attribute values x and y given Z.
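These quantities can be estimated from frequency counts. A minimal sketch in plain Python, using empirical distributions and log base 2 (so the result is in bits):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical I(X;Y) in bits from two paired sample lists."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def conditional_mutual_information(xs, ys, zs):
    """Empirical I(X;Y|Z) = sum_z P(z) * I(X;Y | Z=z)."""
    n = len(zs)
    cmi = 0.0
    for z, cz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        cmi += (cz / n) * mutual_information([xs[i] for i in idx],
                                             [ys[i] for i in idx])
    return cmi
```

LMI and CLMI restrict the same sums to a single observed value y, which follows the same counting pattern.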

Deﬁnition 1. [1] The Markov blanket (MB) for variable C is the set of nodes composed

of C’s parents Xpa , its children Xch , and its children’s parents Xcp . Suppose that X =

{Xpa , Xch , Xcp }, Markov blanket Bayesian classiﬁers approximate P (x, c) as follows,

P (c, x) = P (xpa )P (c|xpa )P (xcp |xpa , c)P (xch |xcp , xpa , c) (1)

Eq. (1) presents a more general case. The Markov blanket of C shields C from the effects of those attributes outside it and is the only knowledge needed to predict its behavior. If, as in NB, all attributes are assumed to be conditionally independent given the class, then

P(x, c) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c). \quad (2)
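Eq. (2) turns classification into a sum of log-probabilities. A minimal sketch with hypothetical lookup tables `prior[c]` and `cond[i][c][value]` (the table names are ours, for illustration):

```python
import math

def nb_log_joint(x, c, prior, cond):
    """log P(c) + sum_i log P(x_i | c), per Eq. (2)."""
    return math.log(prior[c]) + sum(math.log(cond[i][c][xi])
                                    for i, xi in enumerate(x))

def nb_predict(x, prior, cond):
    """Pick the class maximizing the NB joint probability."""
    return max(prior, key=lambda c: nb_log_joint(x, c, prior, cond))
```

In practice the conditional tables are estimated by counting over the training set, usually with smoothing.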


Table 1. Characteristics of the datasets.

No. Dataset Instances Attributes Classes

1 Mushrooms 8124 22 2

2 Thyroid 9169 29 20

3 Pendigits 10992 16 10

4 Sign 12546 8 3

5 Nursery 12960 8 5

6 Seer 18962 13 2

7 Magic 19020 10 2

8 Letter-recog 20000 16 26

9 Adult 48842 14 2

10 Shuttle 58000 9 7

11 Connect-4 67557 42 3

12 Waveform 100000 21 3

13 Localization 164860 5 11

14 Census-income 299285 41 2

15 Covtype 581012 54 7

The basic structure of TAN allows each attribute to have at most one parent attribute

apart from the class, then

P(x, c) \propto P(c)\,P(x_r \mid c) \prod_{i=1,\, i \neq r}^{n} P(x_i \mid c, x_{j(i)}), \quad (3)

where X_r denotes the root node and \{X_{j(i)}\} = Pa(X_i) \setminus \{C\} for any i \neq r. An

example of TAN is shown in Figure 1(b).

KDB further relaxes NB’s independence assumption by allowing every attribute to

be conditioned on the class and, at most, k other attributes [4]. Then

P(c \mid x) \propto P(c)\,P(x_1 \mid c) \prod_{i=2}^{n} P(x_i \mid c, x_{i1}, \cdots, x_{ip}) \quad (4)

where {Xi1 , · · · , Xip } are the parent attributes of Xi and p = min(i − 1, k). Figure

1(c) shows an example of KDB when k=2.

UKDB can output two kinds of sub-classiﬁers, i.e., UKDBG and UKDBL , which de-

scribe the causal relationships implicated in training set and test instance, respectively.

UKDB uses I(Xi ; C) and I(Xi ; Xj |C) simultaneously to measure the comprehensive

effect of class C and other attributes (e.g., Xj ) on Xi .

The learning procedures of UKDBG are described as follows:

———————————————————————————————————

Algorithm 1 UKDBG

———————————————————————————————————


Input: Pre-classiﬁed training set, DB, and the k value for the maximum allowable

degree of attribute dependence.

1. Let the global Bayesian classiﬁer being constructed, UKDBG , begin with a single

class node C. Let the used attribute list S be empty.

2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of

I(X1 , · · · , Xk ; C).

3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBG representing {X1 , · · · , Xk }

as the parents of C. Add k arcs from {X1 , · · · , Xk } to C in UKDBG .

4. Repeat until S includes all domain attributes

• Select attribute X_i that corresponds to the maximum value of I(X_i; C) + \sum_{j=1}^{q} I(X_i; X_j | C), where X_i \notin S, X_j \in S and q = min(|S|, k).

• Add Xi to S. Add a node that represents Xi to UKDBG . Add an arc from C

to Xi . Add q arcs from q distinct attributes Xj in S to Xi .

5. Compute the conditional probability tables inferred by the structure of UKDBG

by using counts from DB, and output UKDBG .

———————————————————————————————————
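The greedy ordering in step 4 can be sketched as follows, assuming precomputed tables `mi[i]` ≈ I(X_i;C) and `cmi[i][j]` ≈ I(X_i;X_j|C). Choosing the q parents as the highest-CMI attributes already in S is our reading of the pseudo-code, which leaves the tie-breaking detail implicit.

```python
def ukdb_order(mi, cmi, k, seed):
    """Greedy attribute ordering (step 4 of UKDB_G).

    mi[i]      : mutual information I(X_i; C)
    cmi[i][j]  : conditional mutual information I(X_i; X_j | C)
    seed       : the k attributes chosen in step 2
    Returns (order, parents), parents[i] = attribute parents of X_i.
    """
    n = len(mi)
    S = list(seed)
    parents = {i: [] for i in range(n)}
    while len(S) < n:
        rest = [i for i in range(n) if i not in S]
        q = min(len(S), k)

        def score(i):
            # mi[i] plus the q strongest conditional dependencies into S
            best = sorted(S, key=lambda j: cmi[i][j], reverse=True)[:q]
            return mi[i] + sum(cmi[i][j] for j in best)

        xi = max(rest, key=score)
        parents[xi] = sorted(S, key=lambda j: cmi[xi][j], reverse=True)[:q]
        S.append(xi)
    return S, parents
```

UKDB_L follows the same loop with the local quantities I(x_i;C) and I(x_i;x_j|C) substituted.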

The learning procedures of UKDBL are described as follows:

———————————————————————————————————

Algorithm 2 UKDBL

Input: Test instance (x1 , · · · , xn ), estimates of probability distributions on training

set and the k value for the maximum allowable degree of attribute dependence.

1. Let the local Bayesian classiﬁer being constructed, UKDBL , begin with a single

class node C. Let the used attribute list S be empty.

2. Select k attributes {X1 , · · · , Xk } as Xpa that correspond to the maximum of

I(x1 , · · · , xk ; C).

3. Add {X1 , · · · , Xk } to S. Add k nodes to UKDBL representing {X1 , · · · , Xk }

as the parents of C. Add k arcs from {X1 , · · · , Xk } to C.

4. Repeat until S includes all domain attributes

• Select attribute X_i that corresponds to the maximum value of I(x_i; C) + \sum_{j=1}^{q} I(x_i; x_j | C), where X_i \notin S, X_j \in S and q = min(|S|, k).

• Add Xi to S. Add a node that represents Xi to UKDBL . Add an arc from C

to Xi . Add q arcs from q distinct attributes Xj in S to Xi .

5. Compute the conditional probability tables inferred by the structure of UKDBL

by using counts from DB, and output UKDBL .

———————————————————————————————————

For UKDBG and UKDBL , estimate the conditional probabilities P̂G (cp |x) and

P̂L (cp |x) that instance x belongs to class cp (p = 1, 2, · · · , t), respectively. The class

label of x is determined by the average of both of the conditional probabilities.

c^* = \arg\max_{c_p \in C} \frac{\hat{P}_G(c_p \mid x) + \hat{P}_L(c_p \mid x)}{2}. \quad (5)
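Eq. (5) is a simple posterior average. As a sketch:

```python
def combine_posteriors(pg, pl):
    """Average the UKDB_G and UKDB_L posteriors and return the
    arg-max class, per Eq. (5). pg, pl map class -> probability."""
    avg = {c: (pg[c] + pl[c]) / 2.0 for c in pg}
    return max(avg, key=avg.get)
```

Equal weighting means neither the global nor the local structure dominates the decision.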


Table 2. Average zero-one loss.

Dataset NB TAN KDB UKDB_G UKDB_L UKDB

Mushrooms 0.020 0.000 0.000 0.000 0.001 0.000

Thyroid 0.111 0.072 0.071 0.075 0.093 0.074

Pendigits 0.118 0.032 0.029 0.028 0.028 0.019

Sign 0.359 0.276 0.254 0.243 0.302 0.247

Nursery 0.097 0.065 0.029 0.029 0.070 0.045

Seer 0.238 0.238 0.256 0.258 0.244 0.244

Magic 0.224 0.168 0.164 0.162 0.176 0.161

Letter-recog 0.253 0.130 0.099 0.088 0.130 0.080

Adult 0.158 0.138 0.138 0.135 0.132 0.130

Shuttle 0.004 0.002 0.001 0.001 0.001 0.001

Connect-4 0.278 0.235 0.228 0.219 0.248 0.228

Waveform 0.022 0.020 0.026 0.024 0.019 0.019

Localization 0.496 0.358 0.296 0.297 0.331 0.285

Census-income 0.237 0.064 0.051 0.050 0.061 0.050

Covtype 0.316 0.252 0.142 0.143 0.274 0.150

Table 3. W/D/L records of zero-one loss.

W/D/L NB TAN KDB UKDB_G UKDB_L

TAN 14/1/0

KDB 13/0/2 9/4/2

UKDBG 13/0/2 10/3/2 3/11/1

UKDBL 14/1/0 3/7/5 2/3/10 2/3/10

UKDB 14/1/0 11/4/0 5/8/2 5/9/1 12/3/0

In order to better verify the efﬁciency of the proposed UKDB, experiments have been

conducted on 15 datasets from the UCI machine learning repository [7]. Table 1 sum-

marizes the characteristics of each dataset. Table 2 presents for each dataset the average

zero-one loss. The following algorithms are compared:

• NB, standard Naive Bayes.

• TAN [3], Tree-augmented Naive Bayes applying incremental learning.

• KDB (k=2), standard k-dependence Bayesian classiﬁer.

• UKDBG (Global UKDB, k=2), a variant UKDB describes global dependencies.

• UKDBL (Local UKDB, k=2), a variant UKDB describes local dependencies.

• UKDB (k=2), a combination of global UKDB and local UKDB.

Statistically a win/draw/loss record (W/D/L) is computed for each pair of competi-

tors A and B with regard to a performance measure M . The record represents the number

of datasets in which A respectively beats, loses to, or ties with B on M . Finally, related

algorithms are compared via one-tailed binomial sign test with a 95% conﬁdence level.

Table 3 shows the W/D/L records respectively corresponding to average zero-one loss.
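The one-tailed binomial sign test on a W/D/L record reduces to a binomial tail probability, with drawn datasets excluded. A sketch:

```python
from math import comb

def sign_test_p(wins, losses):
    """One-tailed p-value: probability of >= `wins` wins out of
    wins + losses non-drawn datasets under H0 (win prob = 0.5)."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2.0**n
```

For instance, a 14/1/0 record corresponds to 14 wins and 0 losses on the non-drawn datasets, giving p = 2^(-14), far below the 0.05 threshold.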

Demšar [8] recommends the Friedman test [9] for comparisons of multiple algorithms. For any pre-determined level α, the null hypothesis will be rejected if F > χ²_α, the upper-tail critical value of the χ² distribution with t − 1 degrees of freedom. The critical value


of χ²_α for α = 0.05 is 9.49. The Friedman statistic for zero-one loss in our experiments is 16.64. By comparing these results, we can draw the following conclusions:

For the different classifiers, the average ranks of zero-one loss on all datasets are {NB(4.66), TAN(3.74), KDB(3.56), UKDB_G(3.45), UKDB_L(3.58), UKDB(2.01)}. UKDB and UKDB_G perform the best among all classifiers in terms of zero-one loss. From

Table 3, UKDB has lower zero-one loss more often than other classiﬁers and the differ-

ences are signiﬁcant. UKDBG also has relative advantages, however, the differences are

not signiﬁcant. The performance of UKDBL is similar to that of TAN. UKDB can make

full use of the information that is supplied by the training sets and test instances. Thus,

performance robustness can be achieved.
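The Friedman statistic can be recomputed directly from average ranks with the standard formula, sketched below. Plugging in the rounded ranks quoted above lands near, though not exactly at, the reported 16.64, since the printed ranks lose precision.

```python
def friedman_stat(avg_ranks, n_datasets):
    """Friedman chi-square for t algorithms ranked on N datasets:
    chi2_F = 12N / (t(t+1)) * (sum_j rbar_j^2 - t(t+1)^2 / 4)."""
    t = len(avg_ranks)
    return 12.0 * n_datasets / (t * (t + 1)) * (
        sum(r * r for r in avg_ranks) - t * (t + 1) ** 2 / 4.0)
```

When all algorithms tie, every average rank is (t+1)/2 and the statistic is zero.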

5. Conclusion

The working mechanisms of NB, TAN and KDB were analysed and summarised. The

proposed algorithm, i.e., UKDB, applies local learning and Markov blanket to improve

the classiﬁcation accuracy. Local learning makes the ﬁnal model more ﬂexible and

Markov blanket breaks the limitation of strict restriction for the parent variables.

15 datasets from the UCI machine learning repository are evaluated by 10-fold cross-validation for zero-one loss comparison. Overall, the findings reveal that the UKDB model clearly outperformed NB, TAN and KDB. To clarify the working mechanism of UKDB, its two components, global UKDB and local UKDB, are also implemented and compared.

Acknowledgements

This work was supported by the National Science Foundation of China (Grant No.

61272209, 61300145) and the Postdoctoral Science Foundation of China (Grant No.

2013M530980), Agreement of Science & Technology Development Project, Jilin

Province (No. 20150101014JC).

References

[1] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kauf-

mann, Palo Alto, CA, 1988.

[2] M.G. Madden, On the classiﬁcation performance of TAN and general Bayesian networks, Knowledge-

Based Systems, 22 (2009), 489–495.

[3] R.A. Josep, Incremental Learning of Tree Augmented Naive Bayes Classiﬁers, in Proceedings of the 8th

Ibero-American Conference on Artiﬁcial Intelligence, Seville, Spain, 2002, 32–41.

[4] M. Sahami, Learning limited dependence Bayesian classiﬁers, in Proceedings of the 2nd International

Conference on Knowledge Discovery and Data Mining, 1996, 335–338.

[5] F. Pernkopf, Bayesian network classiﬁers versus selective k-NN classiﬁer, Pattern Recognition,

38(2005), 1–10.

[6] C.E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 1948, 379–

423.

[7] UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/datasets.html.

[8] J. Demšar, Statistical comparisons of classiﬁers over multiple data sets, Journal of Machine Learning

Research, 7 (2006), 1–30.

[9] M. Friedman, The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of

Variance, Journal of the American Statistical Association, 32 (1937), 675–701.

254 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-254

The Factor Analysis's Applicability on Social Indicator Research

Ying XIEa, Yao-Hua CHENb and Ling-Xi PENG b,1

a

Guangzhou Social Work Research Center, Guangzhou University,

Guangzhou, P.R. China, 510006

b

School of Mechanical and Electrical Engineering, Guangzhou University,

Guangzhou, P.R. China, 510006

Abstract. Factor analysis is widely used in social indicator analysis. Most of the time, factor analysis results in textbooks only give some mathematical expressions without clear interpretation. Motivated by a case

study on a popular textbook, this paper attempts to illustrate the potential pitfall of

factor analysis in some real applications. The study demonstrates that without

careful examination of the original dataset, factor analysis can lead to misleading

conclusions. This issue has been largely ignored in the literature including popular

textbooks. The statistical analysis cannot completely rely on the automated

computer software. The Kaiser-Meyer-Olkin (KMO) test results can only be used

as a reference. We should carefully examine the applicability of the original data

and give a cautious explanation. Provided that some popular textbooks ignore this

point, we wish this article can draw the readers’ special attention to the raw data

when conducting factor analysis.

Introduction

In a multivariate statistics course, factor analysis is an essential part. Normally, before

conducting factor analysis, it is suggested to use Kaiser-Meyer-Olkin (KMO) test to

justify whether the dataset is suitable for factor analysis. However, KMO test is

designed to test the sampling adequacy and does not fully account for the applicability

of factor analysis to a specific dataset. Consequently, even with significant KMO test

results, the factor analysis still produces suspicious conclusions.

The purpose of factor analysis is to reduce the dimensionality of the dataset, and to

examine the underlying relationships among the variables. In general, factor analysis

attempts to find a few factors to capture most information about the original data,

where the factors are combinations of the related variables [1-8].
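In the principal-component flavour used below, "capturing most information" means that the leading eigenvalues of the correlation matrix account for most of the total variance. A NumPy sketch:

```python
import numpy as np

def explained_variance_shares(X):
    """Share of total variance carried by each principal component of the
    correlation matrix of X (rows = observations, columns = variables)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(R)[::-1]  # largest first
    return eigvals / eigvals.sum()
```

On strongly correlated variables the first share is large, which is what licenses replacing many indicators with one factor.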

Clearly, the factor analysis is based on analysis of the characteristics of the original

variables to summarize the information of the original variables [9-10]. Therefore, the

choice of the original variables is very important. If there is no correlation among the

original variables, the data is not suitable for factor analysis. Dimension reduction

1

Corresponding Author: Ling-xi PENG, School of Mechanical and Electrical Engineering, Guangzhou

University, Guangzhou, P.R. China. Email: xysoc@gzhu.edu.cn.

Y. Xie et al. / The Factor Analysis’s Applicability on Social Indicator Research 255

effect will be limited. On the contrary, with stronger correlation, factor analysis could

largely reduce the dimensionality, produce superior performance, and improve the

interpretability [11].

Nowadays, factor analysis is implemented in most statistical software. But the software is unable to understand the underlying meaning of each variable, and thus researchers need to name the extracted factors, make a practical interpretation of them, and check the applicability of factor analysis to the dataset. Quite often, the applicability is not tested at all, and the researchers assume applicability by default. This is one of the key reasons why absurd factor analysis results are not uncommon in many statistical textbooks and articles. Many authors do not examine the raw data before conducting the factor analysis.

Specifically, this article will use an example in Statistics (the fourth edition,

Renmin University of China Press) to illustrate the importance of checking

applicability of factor analysis. This textbook is widely used in China, recommended

by National Statistics Committee and Ministry of Education, with supporting

comprehensive database of teaching. In fact, the similar misuses could be found in

many other statistical textbooks, including another popular textbook Multivariate

Statistical Analysis [8].

1. A Case Study

The following example uses factor analysis to rank economic development of China

Provinces. "Based on the data of six major economic indicators for 31 provinces,

municipalities and autonomous regions in 2006, conduct factor analysis, explain the

factors, and calculate the factor score [9]." (Quoted (translated) from 256-269 pages of

the original book, Chapter 12, principal component analysis and factor analysis):

Region | Gross Regional Product Per Capita (yuan) | Government Revenue (10000 yuan) | Total Investment in Fixed Assets (100 million yuan) | Total Population (10000 persons) | Household Consumption Expenditure (yuan per capita) | Total Retail Sales of Consumer Goods (100 million yuan)
Hebei | 16962 | 6205340 | 5470.2356 | 6898 | 4945 | 3397.42296
Shanxi | 14123 | 5833752 | 2255.7351 | 3375 | 4843 | 1613.43996
Inner Mongolia | 20053 | 3433774 | 3363.2077 | 2397 | 5800 | 1595.26514


Rotated Component Matrix

                                        Component 1   Component 2
Gross Regional Product Per Capita          .112          .981
Government Revenue                         .755          .622
Total Investment in Fixed Assets           .931          .247
Total Population                           .941         -.213
Household Consumption Expenditure          .117          .980
Total Retail Sales of Consumer Goods       .922          .349

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.

[Table: Total Variance Explained, with initial eigenvalues and sums of squared loadings (Total, % of Variance, Cumulative %); the numerical entries are not preserved.]

According to the textbook, the first component is most highly correlated with Total

Investment in Fixed Assets; Government Revenue; Total Retail Sales. The author

defined it as "economical level factor", and defined the second factor as "consumption

level".

Table 4. Region Rank in the Textbook

Rank  Region     Factor 1   Factor 2   Total Score
1     Guangdong   2.42045    .89371     3.31416
2     Shanghai    -.54724    3.46909    2.92185
3     Jiangsu     1.96498    .57532     2.54030
4     Shandong    2.36315    .00275     2.36591
5     Zhejiang    .94065     1.11499    2.05565
6     Beijing     -.64278    2.63862    1.99584
7     Liaoning    .41769     .20721     .62490
8     Henan       1.29494    -.83424    .46070

Then, the author weighted each factor according to variance contribution rate, and

then summed. In this way, the textbook calculated the total scores of each region, and


used the total score to reflect regional economic development. The result of rank in the

textbook is in the following Table 4.

According to the textbook, the result of the KMO Test, shown in Table 5, is

statistically significant, which means that the result of factor analysis is meaningful.

However, the result is highly suspect. For example, Beijing is significantly under-ranked, and Henan is over-ranked. Guangdong being ranked first is inconsistent with the actual situation of economic development.

Table 5. KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy    .695
Bartlett's Test of Sphericity  Approx. Chi-Square  277.025
                               Sig.                .000

The problem of the above analysis lies in the raw data. The example selects a few

variables to reflect the economic development. However, these variables are not on the

same scale. The GDP per capita is on the "individual" scale, while the "total population

at the end of the year", "investment in fixed assets", "total retail sales of social

consumer goods" and "government revenue" are all on the "population" or "overall"

scale. Because of the mismatched scale, it is inappropriate to combine these variables

into meaningful factors. In fact, to compare the level of economic development, the

"total population" is not even a proper indicator, as it gives advantages to regions with

larger population in the ranking system. Obviously, large population does not

necessarily indicate a prosperous economy. For example, Beijing, the Capital of China,

has much less population than Henan province, but Beijing’s economy is much more

developed than Henan.

To overcome this problem, a more appropriate approach is to examine the raw data

before factor analysis. To evaluate the economic development, the per capita variables

are more reasonable. Using the data from the textbook, the author calculates per capita data for each variable except "total population", and then uses factor analysis to do the same kind of research. The results (Table 6) show that the first extracted factor can explain more than 80% of the variation, indicating a strong relationship among the per capita economic indicators. We use the component matrix (Table

7) to recalculate the score.
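The correction can be sketched in a few lines of linear algebra: divide the totals by population, standardize, and extract the first principal component as the factor score. The regional figures below are illustrative stand-ins, not the textbook's data:

```python
import numpy as np

# Illustrative raw data for 4 hypothetical regions (not the textbook values).
# Columns: population (10k), GDP, fixed-asset investment, retail sales, revenue.
raw = np.array([
    [1300., 1.9e4, 7.0e3, 9.0e3, 2.4e3],   # small population, high totals
    [9400., 3.7e4, 2.0e4, 1.5e4, 2.8e3],   # large population, modest per-capita level
    [2100., 1.1e4, 5.0e3, 4.0e3, 9.0e2],
    [5500., 4.0e4, 2.5e4, 1.7e4, 4.0e3],
])

population = raw[:, 0]
per_capita = raw[:, 1:] / population[:, None]   # put everything on the per-capita scale

# Standardize, then take the first principal component as "Fac1".
z = (per_capita - per_capita.mean(0)) / per_capita.std(0)
cov = np.cov(z, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)            # eigenvalues in ascending order
v = eigvec[:, -1]
if v.sum() < 0:                                 # fix the arbitrary eigenvector sign
    v = -v
fac1 = z @ v                                    # scores on the leading component
explained = eigval[-1] / eigval.sum()           # share of variance explained
ranking = np.argsort(-fac1)                     # most developed region first
```

With all per-capita indicators positively correlated, the leading component loads positively on every variable, so sorting by `fac1` ranks the region with high per-capita values first regardless of its population.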

Table 6 New Total Variance Explained

                 Initial Eigenvalues               Extraction Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %   Total   % of Variance   Cumulative %
1           4.210       84.210         84.210      4.210       84.210         84.210
2            .592       11.833         96.042
3            .139        2.776         98.818
4            .039         .770         99.588
5            .021         .412        100.000

Extraction Method: Principal Component Analysis.

258 Y. Xie et al. / The Factor Analysis’s Applicability on Social Indicator Research

Table 7 Component Matrix a

                                             Component 1
Government Revenue Per Capita                    .966
Investment in Fixed Assets Per Capita            .948
Household Consumption Expenditure                .698
Retail Sales of Consumer Goods Per Capita        .968
Gross Regional Product Per Capita                .978

Extraction Method: Principal Component Analysis.
a. 1 component extracted.

The final regional ranking of the economic level (Fac1) is shown in Table 8 below.

Table 8 Final Regional Rank of the Economic Level (Fac1)

Rank   Region      Fac1
1      Shanghai    2.78325
2      Beijing     2.6151
3      Zhejiang    1.30243
4      Tianjin     1.13873
5      Jiangsu     1.02431
6      Guangdong   0.82984
7      Liaoning    0.67854
8      Shangdong   0.65539
…      …           …
18     Henan      -0.31168

Clearly, the ranking in Table 8 agrees with the actual economic situation in China: the more developed regions are at the top.
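Reading Table 8 back, the rank column is simply a descending sort of the Fac1 scores (values copied from the table; the "Shangdong" spelling is as printed):

```python
# Fac1 scores from Table 8 (selected regions).
scores = {
    "Shanghai": 2.78325, "Beijing": 2.6151, "Zhejiang": 1.30243,
    "Tianjin": 1.13873, "Jiangsu": 1.02431, "Guangdong": 0.82984,
    "Liaoning": 0.67854, "Shangdong": 0.65539, "Henan": -0.31168,
}
# Sorting the regions by score, highest first, reproduces the rank column.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under the per-capita analysis Guangdong drops to sixth and Henan to the bottom of the listed regions, in line with the authors' critique of the original ranking.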

2. Conclusions

Factor analysis is a widely taught and used statistical method, especially in the field of social indicator research. Professional statistical software packages (such as SPSS and SAS) include factor analysis modules that automate the process. But without careful examination of the raw data, erroneous conclusions are unavoidable. The quality of a factor analysis result depends heavily on the original variables, the data sources, and the analysis method.

The KMO test is often employed to check whether the data are suitable for factor analysis, but it cannot tell whether the data themselves are reasonable for the analysis. Most of the time, factor analysis results in textbooks only give mathematical expressions without clear interpretation. When researchers and teachers use factor analysis, they should first examine the


applicability of factor analysis to the original data and analyze whether the variables are on the same scale. Only if the original data meet scientific requirements can reliable conclusions be reached.

In short, statistical analysis cannot rely entirely on automated computer software. The KMO test result can only be used as a reference; we should carefully examine the applicability of the original data and give a cautious interpretation. Given that some popular textbooks ignore this point, we hope this article draws readers' special attention to the raw data when conducting factor analysis.

Acknowledgements

This work was supported by the National Social Science Fund 15AZD077.

References

[1] H. H. Harman, Modern Factor Analysis, 3rd ed. Chicago: University of Chicago Press, 1976.

[2] N. Cressie, Statistics for Spatial Data. John Wiley & Sons, 2015.

[3] J. L. Devore, Probability and Statistics for Engineering and the Sciences. Cengage Learning, 2015.

[4] D. R. Anderson, D. J. Sweeney, T. A. Williams, et al., Statistics for Business & Economics. Nelson Education, 2016.

[5] J. Pearl, M. Glymour, N. P. Jewell, Causal Inference in Statistics: A Primer. John Wiley & Sons, 2016.

[6] J. R. Schott, Matrix Analysis for Statistics. John Wiley & Sons, 2016.

[7] D. C. Howell, Fundamental Statistics for the Behavioral Sciences. Nelson Education, 2016.

[8] X. Q. He, Multivariate Statistical Analysis. Renmin University of China Press, 2011, 143-173.

[9] J. P. Jia, Statistics. Renmin University of China Press, 2011, 254-270.

[10] J. Kim and C. W. Mueller, Factor Analysis: What It Is and How to Do It. Beverly Hills and London: Sage Publications, 1978.

[11] P. Kline, An Easy Guide to Factor Analysis. London: Routledge, 1994.

260 Fuzzy Systems and Data Mining II

S.-L. Sun et al. (Eds.)

IOS Press, 2016

© 2016 The authors and IOS Press. All rights reserved.

doi:10.3233/978-1-61499-722-1-260

Research on Weapon-Target Allocation Based on Genetic Algorithm

Yan-Sheng ZHANG1, Zhong-Tao QIAO and Jian-Hui JING

The Fourth Department, Ordnance Engineering College, Shijiazhuang City, Hebei

Province, China

Abstract. Weapon-target allocation (WTA) is a constrained combinatorial optimization problem and an important part of command and decision-making in air defense operations. WTA is known to be an NP-complete problem, and intelligent optimization methods are widely employed to solve it. A popular coding length is n×m, corresponding to assigning n weapons to m targets. However, the coding length grows greatly with the problem scale, and the computation becomes too heavy to meet real-time requirements. This paper focuses on designing a new gene coding to improve computational efficiency. In our study, a sequence of weapons serves as the gene coding, to which two other codes, a target code and a capacity code, are attached. This coding length is n and accommodates the constraints of WTA effectively. The operators of gene selection, crossover and mutation are then designed. In addition, the maximum operational effectiveness, together with the minimum consumption of ammunition, is defined as the objective function. This model is based on multi-objective optimization and is more realistic. An example shows that the method is feasible and can greatly reduce computing time.

Introduction

The weapon-target allocation (WTA) problem is to optimize the distribution of our forces and weapons according to the characteristics and quantity of incoming targets, so as to achieve the best operational effectiveness. WTA is a typical constrained combinatorial optimization problem and is NP-hard. The WTA model based on multi-objective optimization is more realistic and is a hot topic. At present, intelligent optimization methods [1-3], such as the genetic algorithm (GA), the particle swarm algorithm (PSA), the ant colony algorithm (CA), and simulated annealing (SA), are widely employed to solve WTA.

These intelligent algorithms have been shown to yield better solutions than the classic ones. However, this is not enough to satisfy the real-time requirements of air defense. In this paper, we focus on designing a new gene coding to improve computational efficiency. A popular genetic coding length is n×m, corresponding to assigning n weapons to m targets. In our study, a sequence of weapons serves as the gene coding, to which two other codes, a target code and a capacity code, are attached. This coding length is n and accommodates the constraints of WTA effectively. On the other hand, the maximum

1 Corresponding Author: Yan-Sheng ZHANG, Lecturer, Ordnance Engineering College, No.97 Heping West Road, Shijiazhuang City, Hebei Province, China; E-mail: zhang_sheng_74@163.com.

Y.-S. Zhang et al. / Research on Weapon-Target Allocation Based on Genetic Algorithm 261

operational effectiveness is defined as the objective function, together with the minimum consumption of ammunition.

1. Mathematical Model

Our anti-aircraft equipment is represented by A=[a1, a2,…, an], in which ai denotes the ith (1≤i≤n) weapon. R=[r1, r2,…, rn] represents the ammunition capacities corresponding to A=[a1, a2,…, an], and ri is the quantity of ammunition of ai. The target set is given by T=[t1, t2,…, tm], where tj (1≤j≤m) is the jth incoming target. D=[d1, d2,…, dm] gives the threat levels corresponding to T=[t1, t2,…, tm], and dj represents the threat degree of tj. P=[pij]n×m is the matrix of intercept probabilities, and pij gives the intercept probability of ai against tj. The decision matrix is X=[xij]n×m, where xij is the number of missiles that ai fires at tj.

Operational effectiveness f1(X) is given in Eq. (1) [4]. The total number of missiles consumed, f2(X), is given in Eq. (2). The optimization of WTA is to make f1(X) as large as possible and f2(X) as small as possible. This multi-objective optimization can be transformed into a single-objective one, shown in Eq. (3), where f(X) is the objective function and L1 and L2 are weights. We want f(X) to be as large as possible, as shown in Eq. (4).

$$f_1(X) = \sum_{j=1}^{m} d_j \left( 1 - \prod_{i=1}^{n} (1 - p_{ij})^{x_{ij}} \right) \qquad (1)$$

$$f_2(X) = \sum_{i=1}^{n} \sum_{j=1}^{m} x_{ij} \qquad (2)$$

$$f(X) = L_1 f_1(X) - L_2 f_2(X) \qquad (3)$$

$$\max f(X) \qquad (4)$$

$$\text{s.t.} \quad \sum_{j=1,\, j \neq k}^{m} x_{ij} = 0 \qquad (5)$$

$$\sum_{i=1}^{n} x_{ij} \geq 1 \qquad (6)$$

$$1 \leq x_{ij} \leq r_i \qquad (7)$$

Usually, there are some constraints on f(X). The number of weapons is no less than the number of targets, namely n ≥ m. A weapon ai is allowed to be allocated to only one target tk, as indicated in Eq. (5); consequently, there is only one nonzero element in each row of X. Every target is assigned at least one weapon, so at least one element of each column of X is nonzero, as shown in Eq. (6). The number of missiles xij that ai fires at tj must not exceed the ammunition capacity ri, as given in Eq. (7).
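Eqs. (1)-(7) can be coded directly as a check on the model. The numbers below are illustrative, and the weights L1 and L2 are assumed values, not taken from the paper:

```python
import numpy as np

# Illustrative data for n = 3 weapons and m = 2 targets (not the paper's example).
p = np.array([[0.7, 0.5],          # p[i, j]: intercept probability of a_i against t_j
              [0.6, 0.8],
              [0.4, 0.6]])
d = np.array([0.9, 0.6])           # threat degrees d_j
r = np.array([4, 3, 2])            # ammunition capacities r_i
L1, L2 = 1.0, 0.05                 # weights L1, L2 (assumed values)

def f1(X):
    """Eq. (1): sum_j d_j * (1 - prod_i (1 - p_ij)^x_ij)."""
    return float(np.sum(d * (1.0 - np.prod((1.0 - p) ** X, axis=0))))

def f2(X):
    """Eq. (2): total number of missiles consumed."""
    return int(X.sum())

def f(X):
    """Eq. (3): combined objective L1*f1 - L2*f2, to be maximized (Eq. (4))."""
    return L1 * f1(X) - L2 * f2(X)

def feasible(X):
    """Constraints (5)-(7): one target per weapon, full coverage, capacity limits."""
    return ((np.count_nonzero(X, axis=1) == 1).all()        # Eq. (5)
            and (np.count_nonzero(X, axis=0) >= 1).all()    # Eq. (6)
            and (X >= 0).all() and (X <= r[:, None]).all()) # Eq. (7)

X = np.array([[2, 0],   # a_1 fires 2 missiles at t_1
              [0, 2],   # a_2 fires 2 missiles at t_2
              [1, 0]])  # a_3 fires 1 missile at t_1
```

Evaluating `f(X)` over feasible decision matrices is exactly the fitness computation a GA would perform on each decoded chromosome.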

The decision matrix X is the solution of the objective function. It is very complicated to perform gene crossover and mutation if X is directly encoded as the gene. Instead, a sequence of weapons, A=[a1, a2,…, an], serves as the gene coding, with 1, 2,…, n representing a1, a2,…, an respectively. Additionally, each gene particle carries two other codes corresponding to A=[a1, a2,…, an]: one is the target code T=[t1, t2,…, tn], and the other is the ammunition-quantity code C=[c1, c2,…, cn]. Each of t1, t2,…, tn takes a value in 1, 2,…, m, and c1, c2,…, cn meet the conditions c1 < r1, c2 < r2,…, cn < rn. For example, the gene coding and its additional codes are shown in Figure 1.
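A sketch of the length-n coding and its decoding back into X, with hypothetical sizes and values (the paper's Figure 1 is not reproduced here; indices are 0-based for convenience):

```python
import numpy as np

n, m = 5, 3                         # 5 weapons, 3 targets (hypothetical sizes)
r = np.array([3, 2, 4, 2, 3])       # ammunition capacities r_i

# One chromosome: the weapon sequence is the gene coding; the two attached
# codes say which target each weapon engages and how many missiles it fires.
T = np.array([0, 2, 1, 0, 2])       # target code: t_i in {0, ..., m-1}
C = np.array([2, 1, 3, 1, 2])       # capacity code: c_i within capacity r_i

def decode(T, C):
    """Expand the length-n coding into the n x m decision matrix X."""
    X = np.zeros((n, m), dtype=int)
    X[np.arange(n), T] = C          # weapon i fires C[i] missiles at target T[i]
    return X

X = decode(T, C)
```

Because every weapon carries exactly one target entry, any chromosome of this form satisfies Eq. (5) by construction, which is why the length-n coding adapts to the WTA constraints more easily than a raw n×m encoding.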