
Principles for constructing three-way approximations of fuzzy sets: A comparative evaluation based on unsupervised learning
Jie Zhou a,b,c , Witold Pedrycz d,e,f , Can Gao a,b,c,∗ , Zhihui Lai a,b,c , Xiaodong Yue g
a College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
b SZU Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, Guangdong 518060, China
c Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, Guangdong 518060, China
d Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada
e Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia
f Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
g School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China

Received 17 October 2019; received in revised form 12 May 2020; accepted 23 June 2020

Abstract
Three-way approximations of fuzzy sets are an important scheme of granular computing: a fuzzy set is abstracted into three discrete decision alternatives that adhere to human cognitive behaviors and reduce the computational burden. The key point of such three-way approximations of fuzzy sets is how to choose a suitable design leading to their realization; undesired three-way approximations may be produced if the selected mechanism does not suit the data distribution. In this study, the principles for constructing three-way approximations of fuzzy sets are summarized and the following taxonomy is provided: (i) the uncertainty balance-based principle, (ii) the prototype-based principle, and (iii) the principle based on the tradeoff between classification error and the number of data that have to be classified. Several detailed optimization models under these principles are discussed. To evaluate the performance of the different construction principles, a general unsupervised learning framework based on three-way approximations of fuzzy sets is presented. Several synthetic data sets along with sixteen data sets from the UCI repository are used in the experiments. Friedman testing followed by Holm-Bonferroni testing is employed to assess the significance of the performance differences among the proposed criteria, which provides insights and guidance when choosing a principle for constructing three-way approximations of fuzzy sets in real-world scenarios. The research methods in this paper can also be extended to supervised and semi-supervised learning.
© 2020 Elsevier B.V. All rights reserved.

Keywords: Shadowed sets; Three-way approximations; Fuzzy sets; Statistical significance testing

* Corresponding author at: Room 519, College of Computer Science and Software Engineering, Shenzhen University, Nanshan District, Shenzhen
City, Guangdong Province, China.
E-mail addresses: jie_jpu@163.com (J. Zhou), wpedrycz@ualberta.ca (W. Pedrycz), 2005gaocan@163.com (C. Gao), lai_zhi_hui@163.com
(Z. Lai), yswantfly@shu.edu.cn (X. Yue).

https://doi.org/10.1016/j.fss.2020.06.019
0165-0114/© 2020 Elsevier B.V. All rights reserved.

1. Introduction

In the last two decades, granular computing [1–3] has emerged as a growing information processing paradigm for dealing with human-centric real-world problems, and many successful applications have been reported in the fields of artificial intelligence, reasoning, decision analysis, data mining, etc. [4,5]. The key components of granular computing, i.e., information granules [6,7] established at different levels of information granularity, provide a required level of abstraction for problems that engage aspects of human cognitive behaviors.
Fuzzy sets [8], rough sets [9], three-way decisions [10] and shadowed sets [11] are special formalisms of granular computing. They not only exhibit unique advantages but also show some close connections. Three-way decisions extend two-valued decisions (true or false, accept or reject) to three options (true, false and noncommitment), which provide the necessary flexibility and universality of thinking in threes. Noncommitment means that no decision can be made based on the information currently available. The approximations in rough set theory correspond to the three options presented in three-way decisions, i.e., the positive region is related to true (acceptance), the negative region is related to false (rejection) and the boundary region is related to noncommitment. The approximation partitions of the universe based on rough set theory can thus be considered as one facet of three-way decisions; many other facets have also been widely studied, such as three-valued logics [12,13], three-way clustering [14], pattern discovery with three-way decisions [15], three-way concept learning [16], three-way group decisions [17], sequential data analysis [18] and others. In this sense, it may be emphasized that the theory of three-way decisions goes beyond rough set theory to some extent.
A shadowed set is often derived from a fuzzy set. The three truth values 1, 0 and [0, 1] of a shadowed set are related to the three options in three-way decisions, and the corresponding fuzzy set is divided into three pairwise disjoint regions, i.e., full acceptance, full rejection and the shadows (no decisions) [19]. In this way, a fuzzy set can be represented by three abstracted approximations, which enhances the interpretation of the data structure and the hierarchy of concepts. Specifically, shadowed sets are a special case of three-way approximations of fuzzy sets, which provide an abstraction mechanism for fuzzy sets. The key point of three-way approximations of fuzzy sets is how to choose and evaluate a proper construction principle (granulation principle). Undesired three-way approximations may be produced if the data structure cannot be detected well by the selected principle.
Pedrycz’s pioneering shadowed sets [11] produced an effect of vagueness reallocation, and an optimization objec-
tive function is developed to determine the optimal thresholds partitioning fuzzy sets into three approximation regions.
Deng and Yao [20] proposed 0.5-three-way approximations of fuzzy sets based on minimizing the average distances
between original membership grades and three-values {0, 0.5, 1}. Yao et al. [21] further presented a general optimiza-
tion model to construct three-way approximations of fuzzy sets based on the principle of minimal distance. According
to the notion of gradual grade of fuzziness, Tahayori et al. [22] put forward an analytic solution to construct three-way
approximations. Nguyen et al. [23] analyzed several distance-based approximations for transforming fuzzy recom-
mendations to crisp ones. Grzegorzewski [24] studied a kind of approximations simplifying fuzzy numbers based on
shadowed sets. Zhang et al. [25] introduced the notion of game-theoretic shadowed sets. To reduce the construction
uncertainties, Ibrahim et al. [26] presented type-II shadowed sets to deal with the shadowiness in the doubtful zone.
Zhou et al. [27,28] proposed constrained three-way approximations to avoid inconsistent solutions. The previous re-
searches often focus on only one kind of construction principle and its related semantic interpretation. However, the
comparative studies on different principles are rarely reported. No matter which principles are adopted, the intrinsic
data structure needs to be detected well and then structured information processing in computational perspective can
be conducted.
In this research, the principles for constructing three-way approximations of fuzzy sets are systematically studied. There are four objectives: (1) The alternative construction principles are summarized as three types, i.e., the uncertainty balance-based principle, the prototype-based principle and the principle based on the tradeoff between classification error and the number of data that have to be classified; (2) Seven specific optimization models under the three types of principles are exhibited in detail, including Pedrycz's model, Yao's model, and some newly introduced models, such as entropy-based models, a self-organizing map (SOM)-based model, etc.; (3) A general unsupervised learning framework based on three-way approximations of fuzzy sets is established to evaluate the performances produced by the different optimization models; and (4) Friedman testing followed by Holm-Bonferroni testing is exploited to test the significance among the seven specific methods, which can provide insights and deliver guidance when using three-way approximations of fuzzy sets in real applications. How to choose a suitable principle for constructing three-way approximations is a pivotal problem for three-way approximation studies. Though only an unsupervised learning framework is involved for comparing the different construction criteria, the research methods and results in this paper also provide a reference for supervised and semi-supervised learning. It is worth stressing that the study exhibits a significant level of originality by bringing, for the first time, a sound and comprehensive taxonomy to the area, elaborating on the ensuing design guidelines and offering thorough comparative studies.

Fig. 1. A view of three-way approximations of a fuzzy set: (a) Three-way approximation partition; (b) The borderline of the core and the boundary region (the region with oblique lines) of Cluster 1 according to the membership degrees of items with respect to Cluster 1.
The paper is organized as follows. Some fundamental notions of three-way approximations of fuzzy sets are briefly
reviewed in Section 2. Principles for constructing three-way approximations of fuzzy sets are discussed in Section 3. A
general framework of three-way approximations-based unsupervised learning is introduced in Section 4. Comparative
studies along with thorough statistical testing are presented in Section 5. Some conclusions are given in Section 6.

2. Preliminaries

In this section, some concepts of three-way approximations of fuzzy sets are reviewed briefly. More detailed information can be found in [3,11,29,30].
 
A fuzzy set A = {(x_j, μ_A(x_j))} (j = 1, 2, · · · , N), x_j ∈ R^d (d ≥ 1 stands for the dimensionality of a real universe of discourse), μ_A(x_j) ∈ [0, 1], is fully described by a membership function specifying for each x_j a degree of its membership (belongingness) to a given concept. The three-way approximations of fuzzy set A induced by a pair of thresholds (α, β) with values located in [0, 1] can be formally described in terms of three mutually disjoint categories of elements of the universe of discourse:

Positive region: Pos_(α,β)(A) = {x_j | β ≤ μ_A(x_j) ≤ 1};
Boundary region: Bnd_(α,β)(A) = {x_j | α < μ_A(x_j) < β};
Negative region: Neg_(α,β)(A) = {x_j | 0 ≤ μ_A(x_j) ≤ α},

where 0 ≤ α < β ≤ 1. The positive region is considered as the β-core of fuzzy set A. It is worth stressing that the β-core of fuzzy set A is different from the kernel of fuzzy set A, which includes the elements in {x_j | μ_A(x_j) = 1}. In the following, the β-core of fuzzy set A is called the core of fuzzy set A for short if there is no special explanation. According to three-way decision theory, items located in the positive region are those definitely belonging to the target concept, i.e., they can be fully accepted; items in the negative region are certainly considered as not belonging to the target concept, i.e., they can be fully rejected; and the items positioned in the boundary region are those whose belongingness is uncertain or unknown, so that no decisions can be made with respect to them at present. Refer also to Fig. 1(a).

The generated three-way approximations of a fuzzy set can reveal the data structure well if the thresholds (α, β) are determined properly. As shown in Fig. 1(b), the three-way approximations obtained based on the fuzzy membership degrees with respect to Cluster 1 visualize the data structure well. Most items belonging to Cluster 1 are partitioned into the positive region of Cluster 1, and most items belonging to Cluster 2 and Cluster 3 are positioned in the negative region of Cluster 1. Items located in the overlapping areas are partitioned into the boundary region of Cluster 1. In this way, all data are divided into three parts with respect to a fixed cluster, which conforms to the data distribution. However, a key problem arises: how to determine the partition thresholds (α, β)? Unreasonable values of α and β will result in undesired three-way approximations; subsequently, the intrinsic data structure cannot be detected and the corresponding semantic interpretations are farfetched.

3. Principles for constructing three-way approximations of fuzzy sets

Some available and new methods for constructing three-way approximations of fuzzy sets are discussed in detail in this section. According to their common properties, they can be summarized into three types, i.e., the uncertainty balance-based principle, the prototype-based principle and the principle based on the tradeoff between the classification error and the number of classified items. These principles have different semantic interpretations, and the corresponding optimization objective functions can be formed to determine the partition threshold values.

3.1. Uncertainty balance-based principle

3.1.1. Pedrycz’s optimization model


Pedrycz [11] first introduced the notion of shadowed sets, which are an outstanding example of three-way approximations of fuzzy sets. The main merits of shadowed sets include an optimization mechanism for determining the thresholds and a conceptual framework for interpreting the obtained partition thresholds. Pedrycz used a symmetric pair of threshold values, i.e., β = 1 − α, to form an optimization objective function for vagueness relocation. The optimal separation threshold α* is obtained from the following optimization objective:

$$\alpha^* = \arg\min_\alpha V_P(\alpha) = \arg\min_\alpha \left| \sum_{j:\mu_A(x_j) \le \alpha} \mu_A(x_j) + \sum_{j:\mu_A(x_j) \ge 1-\alpha} \left(1 - \mu_A(x_j)\right) - \mathrm{card}\left\{x_j \mid \alpha < \mu_A(x_j) < 1-\alpha\right\} \right|, \quad (1)$$

where α ∈ [0, 0.5) and card(X) denotes the cardinality of a set X. ψ_1 = Σ_{j:μ_A(x_j)≤α} μ_A(x_j) measures the reduction of membership grades, ψ_2 = Σ_{j:μ_A(x_j)≥1−α} (1 − μ_A(x_j)) indicates the elevation of membership grades, and ψ_3 = card{x_j | α < μ_A(x_j) < 1−α} stands for the shadows. The sum of ψ_1 and ψ_2 quantifies the uncertainty reduction caused by some fuzzy items becoming crisp ones. To preserve the total uncertainty embedded in the original fuzzy set, this reduction is compensated by the uncertainty produced in the formed shadows ψ_3, i.e., the new membership grade of each item partitioned into the shadows is uncertain over the entire interval (0, 1).
After obtaining the value of α*, the fuzzy set A can be partitioned into three disjoint regions:

Positive region: Pos_{α*}(A) = {x_j | 1 − α* ≤ μ_A(x_j) ≤ 1};
Boundary region: Bnd_{α*}(A) = {x_j | α* < μ_A(x_j) < 1 − α*};
Negative region: Neg_{α*}(A) = {x_j | 0 ≤ μ_A(x_j) ≤ α*}.
Pedrycz’s model makes the vagueness caused by the membership relocated. The membership values low enough
(in the negative region) and high enough (in the core) will be reassigned as 0 and 1, respectively. These two parts
reduce the uncertainty in the original fuzzy set. To balance the uncertainty reduction, the membership values in the
middle will fall into shadowed area (boundary region) which elevate the uncertainty. The threshold values obtained
based on the optimization model (1) for partitioning three approximate regions are symmetric, and then a linear and
fast algorithm [27] can be exploited to solve this optimization problem. More generally, two asymmetric parameters
can also be involved in the optimization model (1), i.e., β = 1 - α. In this case, the lengths of membership intervals for
JID:FSS AID:7897 /FLA [m3SC+; v1.332; Prn:13/07/2020; 14:31] P.5 (1-25)
J. Zhou et al. / Fuzzy Sets and Systems ••• (••••) •••–••• 5

positive region and negative region may be different. It implies that the decision maker’s attitude towards acceptance
and rejection are different. Some decision makers tend to accept more, and the others tend to reject more. However,
it is time-consuming to solve the optimization model with two asymmetric threshold parameters, which limits the
practicability of the Pedrycz’s model for big data applications.

3.1.2. Generalized Pedrycz’s optimization model


In a real scenario, the three approximation regions make different contributions to decision making; one of them may play a more important role than the others. For example, in a fault diagnosis application, we should remove as many irrelevant causes (belonging to the negative region) as possible, because the real cause often cannot be detected directly. In medical diagnosis, however, we should obtain as many supporting causes (belonging to the positive region) as possible, after which blood testing or CT scanning is needed. Based on Pedrycz's optimization model, a generalized model for constructing three-way approximations of fuzzy sets can be formed as follows:

$$\alpha^* = \arg\min_\alpha V_{GP}(\alpha) = \arg\min_\alpha \left| P_N \sum_{j:\mu_A(x_j) \le \alpha} \varphi_1(x_j) + P_P \sum_{j:\mu_A(x_j) \ge 1-\alpha} \varphi_2(x_j) - P_B \sum_{j:\alpha < \mu_A(x_j) < 1-\alpha} \varphi_3(x_j) \right|, \quad (2)$$

where α ∈ [0, 0.5). φ_1(x_j) and φ_2(x_j) measure the uncertainty reduction caused by an item x_j whose membership value is lower than the threshold α or higher than 1 − α, respectively. Ψ_1 = P_N Σ_{j:μ_A(x_j)≤α} φ_1(x_j) and Ψ_2 = P_P Σ_{j:μ_A(x_j)≥1−α} φ_2(x_j) evaluate the weighted uncertainty reduction, where P_N and P_P are weights measuring the importance of the negative region and the positive region when relocating the vagueness. They can be assigned the probabilities of the items partitioned into the negative and positive regions, respectively, namely P_N = card({x_j | μ_A(x_j) ≤ α})/N and P_P = card({x_j | μ_A(x_j) ≥ 1 − α})/N. φ_3(x_j) measures the uncertainty elevation caused by an item x_j whose membership value is located in the shadows. Ψ_3 = P_B Σ_{j:α<μ_A(x_j)<1−α} φ_3(x_j) stands for the sum of the weighted uncertainty elevation, where P_B is a weight that can be assigned the probability of the items partitioned into the boundary region, i.e., P_B = card({x_j | α < μ_A(x_j) < 1 − α})/N.

Obviously, Pedrycz's optimization objective function V_P(α) is a special case of V_GP(α) when φ_1(x_j) = μ_A(x_j), φ_2(x_j) = 1 − μ_A(x_j), φ_3(x_j) = 1 and P_N = P_P = P_B = 1.
In Pedrycz’s optimization model, the simple value 1 is used to measure the elevated uncertainties caused by the
items in shadows. Its original intention is that no specific membership values are assigned to the items in the shadowed
region. These items may be assigned with the membership degrees in the entire interval (0,1), therefore the elevated
uncertainty of each item partitioned into shadowed region is equal to 1 (the length of interval (0,1)). The uncertainty
or fuzziness
   of fuzzy items
 can also be evaluated by some other measures, such as entropy-based measures. Therefore,
ϕ1 xj , ϕ2 xj and ϕ3 xj can be formed based on entropy functions.
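As one concrete instance, assuming φ_1(x_j) = μ_A(x_j), φ_2(x_j) = 1 − μ_A(x_j), φ_3(x_j) = 1 and the empirical region probabilities as weights (presumably the PrWP setting evaluated later in the experiments), the weighted objective (2) can be evaluated and minimized over a grid of candidate thresholds as in the sketch below; the function names and the grid step are ours.

```python
import numpy as np

def weighted_pedrycz_objective(mu, a):
    """V_GP(alpha) of Eq. (2) with phi1 = mu, phi2 = 1 - mu, phi3 = 1 and
    P_N, P_P, P_B taken as the empirical probabilities of the three regions."""
    mu = np.asarray(mu, dtype=float)
    n = len(mu)
    neg = mu[mu <= a]
    pos = mu[mu >= 1.0 - a]
    k_bnd = np.count_nonzero((mu > a) & (mu < 1.0 - a))
    p_n, p_p, p_b = len(neg) / n, len(pos) / n, k_bnd / n
    return abs(p_n * neg.sum() + p_p * (1.0 - pos).sum() - p_b * k_bnd)

def weighted_pedrycz_alpha(mu, grid=np.arange(0.0, 0.5, 0.005)):
    """Grid search for the threshold minimizing the weighted objective."""
    return min(grid, key=lambda a: weighted_pedrycz_objective(mu, a))
```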

3.1.3. Entropy-based models


Entropy is often used as a measure to evaluate the uncertainty of a system. According to the generalized Pedrycz’s
optimization model, an entropy-based model for constructing three-way approximations of fuzzy sets can be formed
as follows.
For ∀μA (x) ∈ [0, 1], De Luca and Termini’s entropy measure [31] of fuzziness is defined as:

ϕ (μA (x)) = {(μA (x) , f (μA (x)))} , (3)


where:

f (μA (x)) = −μA (x) log (μA (x)) − (1 − μA (x)) log (1 − μA (x)) . (4)
f (μA (x)) is a Shannon’s information entropy function which measures the uncertainty caused by the primary fuzzy
membership value μA (x). Its values versus μA (x) are described in Fig. 2.

Fig. 2. Mapping membership values to their fuzziness measures.

The fuzziness measure f(μ_A(x)) is symmetric with respect to μ_A(x) = 0.5 and achieves its maximal value when μ_A(x) = 0.5. When μ_A(x) = 0 or μ_A(x) = 1, the fuzziness of the corresponding fuzzy item equals zero, which means there is no vagueness. In general, we consider a mapping f: [0, 1] → [0, 1] such that f(μ_A(x)) is monotonically increasing on [0, 0.5] and monotonically decreasing on [0.5, 1], with the boundary conditions f(0) = f(1) = 0 and f(0.5) = 1. By exploiting the fuzziness measure, the optimization objective function for determining the partition threshold is formed as follows:

$$\alpha^* = \arg\min_\alpha V_E(\alpha) = \arg\min_\alpha \left| \sum_{j:\mu_A(x_j) \le \alpha} f\left(\mu_A(x_j)\right) + \sum_{j:\mu_A(x_j) \ge 1-\alpha} f\left(\mu_A(x_j)\right) - \sum_{j:\alpha < \mu_A(x_j) < 1-\alpha} \left(1 - f\left(\mu_A(x_j)\right)\right) \right|. \quad (5)$$

According to the properties of the entropy measure of fuzziness [22,32], V_E(α) can be simplified as:

$$V_E(\alpha) = \left| \sum_{j:f(\mu_A(x_j)) \le \lambda} f\left(\mu_A(x_j)\right) - \sum_{j:f(\mu_A(x_j)) > \lambda} \left(1 - f\left(\mu_A(x_j)\right)\right) \right|, \quad (6)$$

where α = f^{-1}(λ), α ∈ [0, 0.5], and f^{-1} stands for the inverse function of f. In this way, the problem of finding the optimal α with respect to the primary membership degrees μ_A(x_j) is transformed into finding the optimal λ with respect to the fuzziness measure f(μ_A(x_j)). Similarly, the generalized format with weighted ratios is formulated as follows:


        

$$V_{GE}(\alpha) = \left| P_N \sum_{j:\mu_A(x_j) \le \alpha} f\left(\mu_A(x_j)\right) + P_P \sum_{j:\mu_A(x_j) \ge 1-\alpha} f\left(\mu_A(x_j)\right) - P_B \sum_{j:\alpha < \mu_A(x_j) < 1-\alpha} \left(1 - f\left(\mu_A(x_j)\right)\right) \right|. \quad (7)$$

De Luca and Termini’s entropy measure of fuzziness is only one of possible realizations of the measure the uncer-
tainty. Some other fuzziness measures can also be used here, such as Kaufmann’s fuzziness measure [33], Ebanks’s
fuzziness measure [34] and some other entropy measures [35,36].
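Under the same enumeration strategy as before, the entropy-based objective (5) can be evaluated as sketched below (function names ours). A base-2 logarithm is assumed so that f(0.5) = 1, as required by the boundary conditions above, and a small ε guards the logarithm at membership values of 0 and 1.

```python
import numpy as np

def fuzziness(mu, eps=1e-12):
    """De Luca and Termini's fuzziness measure f of Eq. (4), normalized so that f(0.5) = 1."""
    mu = np.clip(np.asarray(mu, dtype=float), eps, 1.0 - eps)
    return -(mu * np.log2(mu) + (1.0 - mu) * np.log2(1.0 - mu))

def entropy_objective(mu, a):
    """V_E(alpha) of Eq. (5): fuzziness reduced outside the shadows vs. fuzziness elevated inside."""
    mu = np.asarray(mu, dtype=float)
    f = fuzziness(mu)
    crisp = (mu <= a) | (mu >= 1.0 - a)
    reduced = f[crisp].sum()             # fuzziness removed by forcing grades to 0 or 1
    elevated = (1.0 - f[~crisp]).sum()   # fuzziness raised to its maximum inside the shadows
    return abs(reduced - elevated)

def entropy_alpha(mu, grid=np.arange(0.0, 0.5, 0.005)):
    """Grid search for alpha*; the weights P_N, P_P, P_B of Eq. (7) can be added analogously."""
    return min(grid, key=lambda a: entropy_objective(mu, a))
```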

3.2. Prototype-based principle

When abstracting a fuzzy set into three-way approximations, the reassigned new membership in each approximation can be considered as a prototype that should suitably represent the items in that approximation. In this way, the problem of computing the optimal partition thresholds is converted into finding three prototypes with the greatest representative capability for all items in the fuzzy set.

Fig. 3. Mapping membership grades to prototypes.

3.2.1. Minimal distance-based model (Yao’s model)


As shown in Fig. 3, a three-valued set {η_N, η_B, η_P}, i.e., the new membership degrees, is formed to represent all fuzzy items. η_N, η_B and η_P represent the new membership grades of the items partitioned into the negative region, the boundary region and the positive region, respectively. Therefore, the three-way approximations of a fuzzy set can be defined as:

$$T_{\mu_A}(x_j) = \begin{cases} \eta_P & \beta \le \mu_A(x_j) \le 1 \\ \eta_B & \alpha < \mu_A(x_j) < \beta \\ \eta_N & 0 \le \mu_A(x_j) \le \alpha \end{cases} \quad (8)$$
To measure the representative capability of the obtained prototypes, an optimization objective function based on the representative errors is formed as follows:

$$(\alpha^*, \beta^*) = \arg\min_{\alpha,\beta} V_Y(\alpha, \beta) = \arg\min_{\alpha,\beta} \left( \sum_{j:\mu_A(x_j) \le \alpha} \left| \mu_A(x_j) - \eta_N \right| + \sum_{j:\mu_A(x_j) \ge \beta} \left| \mu_A(x_j) - \eta_P \right| + \sum_{j:\alpha < \mu_A(x_j) < \beta} \left| \mu_A(x_j) - \eta_B \right| \right). \quad (9)$$
To optimize the threshold values α and β when minimizing VY (α, β), Yao [21,37] introduced a decision theoretical
method. The resulting detailed decision rules can be formed as follows:
                 
(1) if μ_A(x_j) ≤ α, then |μ_A(x_j) − η_N| ≤ |μ_A(x_j) − η_P| and |μ_A(x_j) − η_N| ≤ |μ_A(x_j) − η_B|;
(2) if μ_A(x_j) ≥ β, then |μ_A(x_j) − η_P| ≤ |μ_A(x_j) − η_N| and |μ_A(x_j) − η_P| ≤ |μ_A(x_j) − η_B|;
(3) if α < μ_A(x_j) < β, then |μ_A(x_j) − η_B| ≤ |μ_A(x_j) − η_N| and |μ_A(x_j) − η_B| ≤ |μ_A(x_j) − η_P|.
      

Generally, let η_N = 0, which means that membership grades low enough are reduced to 0, and η_P = 1, which states that membership grades high enough are elevated to 1. Subsequently, only the parameter η_B needs to be optimized. With this decision procedure, the optimization results can be obtained as follows:

$$T_{\mu_A}(x_j) = \begin{cases} 1 & \frac{1+\eta_B}{2} \le \mu_A(x_j) \le 1 \\ \eta_B & \frac{\eta_B}{2} < \mu_A(x_j) < \frac{1+\eta_B}{2} \\ 0 & 0 \le \mu_A(x_j) \le \frac{\eta_B}{2} \end{cases}, \quad (10)$$

where η_B ∈ (0, 1). V_Y(α, β) then becomes:

$$V_Y(\eta_B) = \sum_{j:\mu_A(x_j) \le \eta_B/2} \mu_A(x_j) + \sum_{j:\mu_A(x_j) \ge (1+\eta_B)/2} \left(1 - \mu_A(x_j)\right) + \sum_{j:\eta_B/2 < \mu_A(x_j) < (1+\eta_B)/2} \left| \mu_A(x_j) - \eta_B \right|. \quad (11)$$

V_Y(η_B) measures the distance between the original membership grades and the obtained prototypes. If the optimal value η_B* is obtained, then α* = η_B*/2 and β* = (η_B* + 1)/2. Obviously, the length of the boundary region is always (1 + η_B)/2 − η_B/2 = 0.5 no matter what the value of η_B is. This implies that the ratio of unclassified items is always 50% if the membership values are uniformly distributed in [0, 1]. In particular, when fixing η_B = 0.5, T_{μ_A}(x_j) becomes the 0.5-three-way approximation of fuzzy sets [20], and the partition pair is α = 0.25 and β = 0.75 no matter what the data distribution is; refer to Fig. 4.
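With the usual choice η_N = 0 and η_P = 1, the remaining prototype η_B of Eq. (11) can also be found by a simple grid search, as in the sketch below (function names and grid ours); for the fuzzy set D14 considered later in Table 1 this returns approximately (α*, β*) = (0.25, 0.75), consistent with Table 2.

```python
import numpy as np

def yao_objective(mu, eta_b):
    """V_Y(eta_B) of Eq. (11): total distance of the grades to the prototypes {0, eta_B, 1}."""
    mu = np.asarray(mu, dtype=float)
    lo, hi = eta_b / 2.0, (1.0 + eta_b) / 2.0
    return (mu[mu <= lo].sum()                                   # distance to prototype 0
            + (1.0 - mu[mu >= hi]).sum()                         # distance to prototype 1
            + np.abs(mu[(mu > lo) & (mu < hi)] - eta_b).sum())   # distance to prototype eta_B

def yao_thresholds(mu, grid=np.linspace(0.01, 0.99, 99)):
    """Grid search for eta_B*; returns (alpha*, beta*) = (eta_B*/2, (eta_B* + 1)/2)."""
    eta = min(grid, key=lambda e: yao_objective(mu, e))
    return eta / 2.0, (eta + 1.0) / 2.0

d14 = [0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.6, 0.65, 0.7, 0.78, 0.8, 0.9, 0.95, 1.0]
print(yao_thresholds(d14))   # -> approximately (0.25, 0.75)
```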

Fig. 4. Boundary region generated based on ηB .

Fig. 5. Granulation-degranulation mechanisms.

Fig. 6. The schema of SOM based on membership grades.

 

3.2.2. Self-organizing map (SOM)-based model


Information granules are the basic components of granular computing; they abstract concrete items into groups, new concepts, or new descriptions. With regard to a prototype-based model for constructing three-way approximations of fuzzy sets, the items can be granulated into three groups according to the obtained prototypes. After the granulation process, the obtained prototypes can be used in a degranulation process to reconstruct the original items, as shown in Fig. 5. The mechanism of "granulation-degranulation" also comes under the name of "coding-decoding" or "fuzzification-defuzzification". The results obtained in the granulation process should represent the original data properties and then serve as the inputs of the degranulation process to reconstruct or estimate the original data. The model error produced in the granulation and degranulation processes, i.e., from the original items to their estimations, is expected to be as small as possible.
In the granulation process, the x_j (j = 1, 2, · · · , N) are mapped to the three-valued set {0, η_B, 1}, in which 0 and 1 are fixed and only the value of η_B needs to be optimized. A self-organizing map (SOM) [38] can thus be employed to optimize the value of η_B; it implements an orderly mapping of a high-dimensional distribution onto a representative low-dimensional grid, so that the most important topological relationships in the original fuzzy set can be well preserved, as shown in Fig. 6.
In Fig. 6, there are only three neurons, with outputs 0, η_B and 1. According to the SOM [39] for the one-dimensional case, the regression of the neuron vector v_i ∈ R¹ into the space of the observation vector μ_A(x_j) ∈ R¹ can be made by the following process:

$$v_i^{(t+1)} = v_i^{(t)} + h_{c(x_j),i} \left( \mu_A(x_j) - v_i^{(t)} \right), \quad (12)$$

where t stands for the iteration step and v_i denotes the output of neuron i. For all t, v_1^{(t)} = 0 and v_3^{(t)} = 1. c(x_j) is the index of the best matching unit of μ_A(x_j):

$$\left| \mu_A(x_j) - v_{c(x_j)}^{(t)} \right| \le \left| \mu_A(x_j) - v_i^{(t)} \right|, \quad \forall i. \quad (13)$$
h_{c(x_j),i} is a neighborhood function and is often taken to be the Gaussian function:

$$h_{c(x_j),i} = \alpha(t)\, e^{-\frac{\|r_c - r_i\|^2}{2\sigma^2(t)}}, \quad (14)$$
 
where r_c and r_i are the positions of neurons c(x_j) (corresponding to v_c) and i (corresponding to v_i) on the SOM grid. The error function of the SOM is shown to be:

$$V_{SOM} = \sum_{j=1}^{N} \sum_{i=1}^{M} h_{c(x_j),i} \left\| \mu_A(x_j) - v_i \right\|^2. \quad (15)$$

Here M = 3. To minimize the optimization objective (15), a batch computation process [38,39] can be involved, which is described as follows. For each i, it is expected that Σ_j h_{c(x_j),i} (μ_A(x_j) − v_i*) = 0, i.e., lim_{t→∞} v_i^{(t+1)} = v_i^{(t)} if h_{c(x_j),i} is nonzero. Approximately, one has

$$v_i^* = \frac{\sum_j h_{c(x_j),i}\, \mu_A(x_j)}{\sum_j h_{c(x_j),i}}. \quad (16)$$

Here only v_2* (i.e., η_B) needs to be updated, because v_1* = 0 and v_3* = 1 are fixed. Therefore, the neighborhood function h_{c(x_j),2} is simply updated according to the following rules.
For ∀x_j (j = 1, 2, · · · , N):

if c(x_j) = 1, then h_{c(x_j),2} = α(t) e^{−η_B²/(2σ²(t))};
if c(x_j) = 2, then h_{c(x_j),2} = α(t);
if c(x_j) = 3, then h_{c(x_j),2} = α(t) e^{−(1−η_B)²/(2σ²(t))}.

In accordance with the above discussion on SOM for mapping a fuzzy set to three prototypes, an optimization
algorithm with iteration strategy can be constructed as follows:

Algorithm I Constructing three-way approximations of fuzzy sets based on SOM.

Input: Fuzzy set A = {(x_j, μ_A(x_j))} (j = 1, 2, · · · , N);
Output: The value of η_B and the approximation region partitions.
Step 1: Initialize the value of η_B ∈ (0, 1) randomly;
Step 2: Compute c(x_j) for each item x_j in A with formula (13);
Step 3: Update η_B with formula (16);
Step 4: Repeat Steps 2 and 3 until convergence has been reached;
Step 5: According to the obtained η_B and formula (13), partition A into three approximation regions.

The convergence in Step 4 can be assessed using the following criterion:

$$\left| \eta_B^{(t+1)} - \eta_B^{(t)} \right| \le \varepsilon. \quad (17)$$
ε is a predefined small threshold. With (16), the impact caused by the learning rate factor α(t) is removed. The only parameter is σ(t), which controls the neighborhood influence zone. It is a monotonically decreasing function of the iteration t; quite commonly one considers its form to be:

$$\sigma(t) = \sigma_0\, e^{-\frac{t}{T}}. \quad (18)$$

σ_0 can be assigned the radius of the SOM, and T is a constant, often T = 1000/log(σ_0). Being different from Yao's optimization model based on minimal distance, the SOM-based model maps the original fuzzy membership degrees onto a grid with only three neurons. No matter which prototype-based method is exploited, three representative prototypes will be generated. Though an iteration strategy is involved to solve the SOM-based model, convergence is achieved quickly, which will be illustrated in the experiments.
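A compact batch sketch of Algorithm I is given below; the neighborhood weights follow the three rules above, while σ_0 = 1 and T = 100 are illustrative defaults of ours rather than the paper's settings, and the function name is likewise ours.

```python
import numpy as np

def som_eta_b(mu, sigma0=1.0, T=100.0, eps=1e-4, max_iter=200, seed=0):
    """Batch SOM estimate of the middle prototype eta_B (Algorithm I).

    The three neurons output {0, eta_B, 1}; only eta_B is adapted.  sigma0 and T
    define the shrinking neighborhood sigma(t) = sigma0 * exp(-t / T) of Eq. (18).
    """
    mu = np.asarray(mu, dtype=float)
    rng = np.random.default_rng(seed)
    eta = rng.uniform(0.05, 0.95)                      # Step 1: random initialization
    for t in range(max_iter):
        sigma2 = (sigma0 * np.exp(-t / T)) ** 2
        protos = np.array([0.0, eta, 1.0])
        # Step 2: best matching unit of each grade, Eq. (13)
        bmu = np.argmin(np.abs(mu[:, None] - protos[None, :]), axis=1)
        # neighborhood weights toward the middle neuron; the learning rate cancels in Eq. (16)
        h = np.where(bmu == 1, 1.0,
            np.where(bmu == 0, np.exp(-eta ** 2 / (2.0 * sigma2)),
                               np.exp(-(1.0 - eta) ** 2 / (2.0 * sigma2))))
        new_eta = (h * mu).sum() / h.sum()             # Step 3: batch update, Eq. (16)
        if abs(new_eta - eta) <= eps:                  # Step 4: convergence check, Eq. (17)
            return new_eta
        eta = new_eta
    return eta
```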

Fig. 7. Coverage degree and hesitation degree versus values of α.

3.3. The principle based on the tradeoff between classification error and the number of classified items

A larger number of data located in the uncertain region helps increase accuracy (as fewer data are potentially misclassified); however, we are penalized by the larger number of data that are not classified. On the other hand, a smaller number of data located in the uncertain region increases the number of classified data, but the accuracy may decrease. The quality measure for the approximation region partitions can take into account the number of misclassified data along with the number of data located in the uncertain region.
To measure the classification error and the number of classified data, two uncertainty measures are introduced in this section: the coverage degree and the hesitation degree. The coverage degree measures the number of classified data, and the hesitation degree measures the property of the data located in the uncertain region. They are formulated as follows:

Coverage degree:
$$cov(\alpha) = \sum_{j:\mu_A(x_j) \ge 1-\alpha} \mu_A(x_j). \quad (19)$$

Hesitation degree:
$$H(\alpha) = \sum_{j:\alpha < \mu_A(x_j) < 1-\alpha} \left[ -\mu_A(x_j) \log\left(\mu_A(x_j)\right) - \left(1 - \mu_A(x_j)\right) \log\left(1 - \mu_A(x_j)\right) \right], \quad (20)$$

where α ∈ [0, 0.5). The hesitation degree involves De Luca and Termini's entropy of the items in the boundary region. The optimal partition threshold maximizes the following objective function:

$$\alpha^* = \arg\max_\alpha V_{CH}(\alpha) = \arg\max_\alpha \left( cov(\alpha) \cdot H(\alpha) \right). \quad (21)$$
When the value of α increases, the coverage degree increases and the hesitation degree decreases, as shown in Fig. 7. In this case, more data are classified, which also means that more data are potentially misclassified and farfetched results may be produced. In contrast, when the value of α decreases, the coverage becomes small and the hesitation degree increases. Therefore, more data are left unclassified, which results in potentially higher classification accuracy. However, if too many data are unclassified, i.e., partitioned into the boundary region because of their location in the uncertain region, the formed model is not helpful for decision makers because the uncertain region is too large. A tradeoff between the coverage degree and the hesitation degree therefore needs to be struck.
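The coverage-hesitation tradeoff (21) can be scanned over a grid of candidate thresholds as in the sketch below; the function names, the base-2 logarithm and the grid step are assumptions of this illustration.

```python
import numpy as np

def coverage(mu, a):
    """Coverage degree of Eq. (19): summed membership of the items accepted at level 1 - alpha."""
    mu = np.asarray(mu, dtype=float)
    return mu[mu >= 1.0 - a].sum()

def hesitation(mu, a, eps=1e-12):
    """Hesitation degree of Eq. (20): De Luca-Termini entropy of the boundary items."""
    mu = np.asarray(mu, dtype=float)
    b = np.clip(mu[(mu > a) & (mu < 1.0 - a)], eps, 1.0 - eps)
    return -(b * np.log2(b) + (1.0 - b) * np.log2(1.0 - b)).sum()

def ch_alpha(mu, grid=np.arange(0.005, 0.5, 0.005)):
    """Maximize cov(alpha) * H(alpha), Eq. (21), over candidate thresholds."""
    return max(grid, key=lambda a: coverage(mu, a) * hesitation(mu, a))
```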

4. A general framework of unsupervised learning based on three-way approximations of fuzzy sets

Three-way approximations of fuzzy sets provide good interpretations of the data distribution from the macro-level rather than the micro-level of each individual item. Because the data structure can be well detected by the produced three-way approximations, they are beneficial for unsupervised learning. Some three-way approximations-based unsupervised learning methods have been reported, including rough C-means (RCM) [40], shadowed C-means (SCM) [41], rough-fuzzy C-means (RFCM) [42] and its revised versions [43,44], and shadowed set-based rough-fuzzy C-means (SRFCM) [45,46]. It has been demonstrated that the performance of these three-way approximations-based methods is better than that of the original fuzzy C-means (FCM) [47].
Regarding RCM and RFCM, the partition thresholds are assigned subjectively. Though SRFCM exploits Pedrycz's model to construct three-way approximations from fuzzy sets, Pedrycz's model belongs
to only one kind of principles when forming three-way approximations of each cluster. Different principles will result
in different three-way approximations for a fixed fuzzy set and then affect the unsupervised performance. To eval-
uate the effectiveness of different principles when constructing three-way approximations, a general framework of
unsupervised learning based on three-way approximations of fuzzy sets is introduced in Algorithm II.

Algorithm II A general framework of unsupervised learning based on three-way approximations of fuzzy sets.

Input: Data set {x_j}, the number of clusters C and the fuzzification coefficient m;
Output: Prototypes {v_i} and membership matrix [u_ij].
Step 1: Initialize the prototypes {v_i};
Step 2: Compute the membership value u_ij of each item x_j to each prototype v_i; u_ij is computed as in FCM;
Step 3: For each cluster G_i, obtain the optimal threshold α_i* according to the three-way approximation construction principles discussed in Section 3;
Step 4: According to α_i*, determine the three-way approximation regions of cluster G_i;
Step 5: Update the prototypes:

$$v_i = \begin{cases} w_l \times l + w_b \times b & \text{if } Pos_{\alpha_i}(G_i) \ne \emptyset \wedge Bnd_{\alpha_i}(G_i) \ne \emptyset \\ l & \text{if } Pos_{\alpha_i}(G_i) \ne \emptyset \wedge Bnd_{\alpha_i}(G_i) = \emptyset \\ b & \text{if } Pos_{\alpha_i}(G_i) = \emptyset \wedge Bnd_{\alpha_i}(G_i) \ne \emptyset \end{cases} \quad (22)$$

where $w_l + w_b = 1$, $l = \dfrac{\sum_{x_j \in Pos_{\alpha_i}(G_i)} u_{ij}^m x_j}{\sum_{x_j \in Pos_{\alpha_i}(G_i)} u_{ij}^m}$ and $b = \dfrac{\sum_{x_j \in Bnd_{\alpha_i}(G_i)} u_{ij}^m x_j}{\sum_{x_j \in Bnd_{\alpha_i}(G_i)} u_{ij}^m}$;

Step 6: Repeat Steps 2 to 5 until convergence is reached.

In Step 5, the approximation regions obtained based on the optimal partition threshold α_i* are used to update the prototypes. Generally, the parts l and b in formula (22) stand for the contributions from the positive region and the boundary region, respectively. Items in the negative region make no contribution when updating this cluster prototype. Three-way approximations-based clustering methods can be considered as a bridge between hard C-means and fuzzy C-means, in which only the items in the positive and boundary regions are selected for renewing the prototypes. In the hard C-means method, the information in the boundary region is totally ignored; in the fuzzy C-means method, all data are involved, which imposes useless information and deviates the prototype updating.

In Step 3, the membership degrees of the items with respect to a fixed cluster can be considered as an independent fuzzy set, and the corresponding three-way approximations of each cluster can then be formed based on the different optimization models presented in Section 3. Obviously, different models used for constructing three-way approximations will result in different approximation region partitions, and therefore the prototype calculations will be affected. If the generated three-way approximations fit the data distribution well, the obtained prototypes will tend to their natural positions.
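A sketch of Algorithm II for the symmetric-threshold case (β_i = 1 − α_i*) is given below; `find_alpha` stands in for any of the Section 3 models (for example the Pedrycz enumeration sketched in Section 3.1.1), and all function names and default settings are ours.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0, eps=1e-12):
    """FCM membership update for data X (N x d) and prototypes V (C x d)."""
    d2 = np.maximum(((X[:, None, :] - V[None, :, :]) ** 2).sum(-1), eps)  # N x C squared distances
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def three_way_fcm(X, C, find_alpha, m=2.0, w_l=0.95, n_iter=100, tol=1e-5, seed=0):
    """Three-way approximations-based clustering (Algorithm II, symmetric thresholds)."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), C, replace=False)].astype(float)   # Step 1: initialization
    for _ in range(n_iter):
        U = fcm_memberships(X, V, m)                            # Step 2: memberships as in FCM
        V_new = V.copy()
        for i in range(C):
            mu = U[:, i]
            a = find_alpha(mu)                                  # Step 3: threshold for cluster i
            pos = mu >= 1.0 - a                                 # Step 4: region partition
            bnd = (mu > a) & (mu < 1.0 - a)
            w = mu ** m
            if pos.any() and bnd.any():                         # Step 5: prototype update, Eq. (22)
                l = (w[pos][:, None] * X[pos]).sum(0) / w[pos].sum()
                b = (w[bnd][:, None] * X[bnd]).sum(0) / w[bnd].sum()
                V_new[i] = w_l * l + (1.0 - w_l) * b
            elif pos.any():
                V_new[i] = (w[pos][:, None] * X[pos]).sum(0) / w[pos].sum()
            elif bnd.any():
                V_new[i] = (w[bnd][:, None] * X[bnd]).sum(0) / w[bnd].sum()
        if np.abs(V_new - V).max() < tol:                       # Step 6: convergence
            V = V_new
            break
        V = V_new
    return V, fcm_memberships(X, V, m)
```

For instance, `three_way_fcm(X, 3, pedrycz_alpha)` would correspond to the PrP variant evaluated in Section 5 (up to initialization), while passing a different `find_alpha` switches the construction principle.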

5. Experimental studies

In this section, some synthetic fuzzy sets and data sets from the UCI repository are investigated to compare the performance of the different optimization models under the three types of principles for constructing three-way approximations of fuzzy sets.

5.1. Synthetic data sets

5.1.1. Synthetic fuzzy set D14


Suppose a fuzzy set D14 is given as in Table 1, defined over the individual elements x1, x2, · · · , x14. The thresholds obtained based on the different optimization models are presented in Table 2.
According to Table 2, Pedrycz's model and the weighted Pedrycz's model tend to form a narrow boundary region and wide positive and negative regions. The other models tend to produce a wide boundary region and narrow positive and negative regions. In addition, the length of the boundary region obtained based on Yao's model and the SOM-based model is always 0.5, whereas the three-way approximations obtained by the other models are symmetrical. The values of the optimization objective functions when increasing the value of α are displayed in Fig. 8.

Table 1
A synthetic fuzzy set D14 .
Items Membership grades Items Membership grades Items Membership grades
x1 0.1 x6 0.5 x11 0.8
x2 0.2 x7 0.6 x12 0.9
x3 0.3 x8 0.65 x13 0.95
x4 0.4 x9 0.7 x14 1
x5 0.45 x10 0.78

Table 2
Obtained thresholds for fuzzy set D14 .
Models Threshold α Negative regions Boundary regions Positive regions
Regions Items Regions Items Regions Items
PrP 0.4 [0, 0.4] {x1 , x2 , ..., x4 } (0.4, 0.6) {x5 , x6 } [0.6,1] {x7 , x8 , ..., x14 }
PrW P 0.35 [0, 0.35] {x1 , x2 , x3 } (0.35, 0.65) {x4 , x5 , ..., x7 } [0.65, 1] {x8 , x9 , ..., x14 }
PrE 0.22 [0, 0.22] {x1 , x2 } (0.22, 0.78) {x3 , x4 , ..., x9 } [0.78, 1] {x10 , x11 , ..., x14 }
PrW E 0.22 [0, 0.22] {x1 , x2 } (0.22, 0.78) {x3 , x4 , ..., x9 } [0.78, 1] {x10 , x11 , ..., x14 }
PrY 0.25 [0, 0.25] {x1 , x2 } (0.25, 0.75) {x3 , x4 , ..., x9 } [0.75, 1] {x10 , x11 , ..., x14 }
PrSOM 0.2976 [0, 0.2976] {x1 , x2 } (0.2976, 0.7976) {x3 , x4 , ..., x10 } [0.7976, 1] {x11 , x12 , ..., x14 }
PrCH 0.22 [0, 0.22] {x1 , x2 } (0.22, 0.78) {x3 , x4 , ..., x9 } [0.78, 1] {x10 , x11 , ..., x14 }
Notes: PrP, PrWP, PrE, PrWE, PrY, PrSOM and PrCH stand for constructing three-way approximations based on Pedrycz's model, the weighted Pedrycz's model, the entropy-based model, the weighted entropy-based model, Yao's model, the SOM-based model and the model based on the tradeoff between coverage degree and hesitation degree (the CH-tradeoff-based model), respectively.

the objective function in Pedrycz’s model, weighted Pedrycz’s model, entropy-based model, weighted entropy-based
model, Yao’s model or maximizing the objective function in CH-tradeoff-based model are exist obviously. It can
be obtained by directly using enumeration methods or some fast optimization methods reported in [27,28]. When it
comes to SOM-based model, though the iteration process is exploited, the iteration times are very small. Only three
iterations are needed to achieve the convergence.

5.1.2. Gaussian fuzzy membership function


Consider the one-dimensional Gaussian fuzzy membership function

$$\mu_A(x_j) = e^{-\left(\frac{x_j - \bar{x}}{\delta}\right)^2}, \quad (23)$$

where x_j ∈ R¹, x̄ = 5 and δ = 2. x_j ranges from 0 to 10 with a step of 0.01, as shown in Fig. 9(a). The thresholds obtained based on the different models are presented in Table 3. As in D14, the boundary regions obtained by Pedrycz's model and the weighted Pedrycz's model are relatively narrow, and the corresponding positive and negative regions are relatively wide compared with those obtained by the other principles.
The values of the optimization objective functions when increasing the value of α are displayed in Fig. 9(b)-(f). Obviously, only one global optimal value exists for each optimization construction model. With regard to the SOM-based model, the iteration process converges fast; only three iterations are needed, which is also illustrated in Fig. 8(d).
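For reference, the Gaussian fuzzy set of Eq. (23) used here can be generated as follows and fed to any of the models sketched in Section 3 (variable names ours).

```python
import numpy as np

x = np.arange(0.0, 10.0 + 1e-9, 0.01)        # universe of discourse sampled with step 0.01
mu = np.exp(-((x - 5.0) / 2.0) ** 2)         # Eq. (23) with x_bar = 5 and delta = 2

# e.g. alpha = pedrycz_alpha(mu) or yao_thresholds(mu), as sketched in Section 3
```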

5.1.3. Synthetic two-dimensional data


This synthetic data set D220 with a mixture of Gaussian distributions is depicted in Fig. 10(a). It has three clusters
with 50, 100 and 70 data, respectively. The means of three clusters are υ1 = [5, 4], υ2 = [7, 9] and υ3 = [8, 3], and the
standard deviations of the three clusters are 1, 1.5 and 0.4, respectively. As observed, Cluster 1 overlaps with Cluster 2, and Cluster 1 overlaps with Cluster 3. According to Algorithm II, Step 3 is the key point for constructing the three-way approximations of each cluster. Different construction models are used in Step 3 while the remaining settings are kept the same, including the initialization, the convergence condition and the fuzzification coefficient. The membership grades of each item with respect to each cluster are computed as in FCM throughout the whole iteration process. To reduce the uncertainty caused by random initializations, the clustering method is executed 10 times for each data set using each construction model, and the average values are taken as the results. Because the items in the positive region play the most important role when updating the prototypes, the weight parameter is set as wl = 0.95.

Fig. 8. Analysis for D14 . (a) Pedrycz’s model and weighted Pedrycz’s model; (b) Entropy-based model and weighted entropy-based model; (c) Yao’s model; (d) The
value of threshold in each iteration using SOM; (e) CH-tradeoff-based model.

Fig. 9. Analysis for Gaussian fuzzy membership grades. (a) Gaussian fuzzy membership grades; (b) Pedrycz’s model and weighted Pedrycz’s model; (c) Entropy-based
model and weighted entropy-based model; (d) Yao’s model; (e) The value of threshold in each iteration using SOM; (f) CH-tradeoff-based model.

Table 3
Obtained thresholds for Gaussian membership grades.
Models Threshold α Negative regions Boundary regions Positive regions
PrP 0.4133 [0, 0.4133] (0.4133, 0.5867) [0.5867,1]
PrW P 0.3788 [0, 0.3788] (0.3788, 0.6212) [0.6212, 1]
PrE 0.2652 [0, 0.2652] (0.2652, 0.7348) [0.7348, 1]
PrW E 0.2821 [0, 0.2821] (0.2821, 0.7179) [0.7179, 1]
PrY 0.2244 [0, 0.2244] (0.2244, 0.7244) [0.7244, 1]
PrSOM 0.3272 [0, 0.3272] (0.3272, 0.8272) [0.8272, 1]
PrCH 0.2077 [0, 0.2077] (0.2077, 0.7923) [0.7923, 1]

Table 4
Obtained thresholds for D220 .
Models Threshold Obtained prototypes AvgDis
Cluster1 Cluster2 Cluster3 Cluster1 Cluster2 Cluster3
PrP 0.2984 0.3384 0.2621 [4.6946, 3.5591] [7.0847, 9.1481] [8.0113, 3.0360] 0.2482
PrW P 0.1682 0.2504 0.1745 [4.7046, 3.3732] [6.9772, 9.1676] [8.0753, 3.0219] 0.3135
PrE 0.1151 0.1450 0.1323 [4.7493, 3.2014] [7.1299, 9.4728] [8.0738, 3.0095] 0.4673
PrW E 0.0928 0.1208 0.1221 [4.7370, 3.1885] [7.1831, 9.4533] [8.0721, 3.0100] 0.4715
PrY 0.0678 0.3568 0.0712 [4.7743, 3.6989] [7.1993, 9.5112] [7.9547, 3.1719] 0.3676
PrSOM 0.1189 0.1930 0.1859 [4.7014, 3.5775] [7.0405, 9.1330] [8.0028, 3.0924] 0.2496
PrCH 0.0649 0.1600 0.0583 [4.6646, 3.1685] [6.9744, 9.3270] [8.1588, 2.9510] 0.4636

The values of the thresholds obtained while considering the different construction models, as well as the obtained prototypes and the average distance (AvgDis) between the obtained prototypes and their true positions, are given in Table 4.
From Table 4, the average distances obtained by Pedrycz's model and the SOM-based model are smaller than those obtained by the other methods. This means that the prototypes obtained by Pedrycz's model and the SOM-based model tend to their natural positions. The approximation region partitions produced by using the different principles are displayed in Fig. 10(b-h). Intuitively, for a fixed cluster, the items belonging to this cluster should be partitioned into its positive region, while the items in the other clusters need to be partitioned into its negative region as much as possible; only some overlapping items should be partitioned into its boundary region. In this way, the data structure of the cluster can be detected well. However, only the approximation region partitions obtained by Pedrycz's model (Fig. 10(b)) and the SOM-based model (Fig. 10(g)) follow this requirement well. As for the principles PrWP, PrE, PrWE and PrCH, the obtained positive region of a fixed cluster is relatively narrow, so many items belonging to this cluster are partitioned into its boundary region. As for PrY, though the positive region of a fixed cluster is suitable (Fig. 10(f)), the boundary region of Cluster 1 is too broad; many items belonging to Cluster 2 and Cluster 3 are partitioned into its boundary region, and its prototype calculation is then inevitably deviated. Cluster 3 has the same situation.
To evaluate different optimization models comprehensively, some clustering validity indices are involved, including
Xie-Beni index (XB) [48], Davies-Bouldin index (DB) [49], PBM-Index (PBM) [50], normalized mutual information
(NMI) [51], rand index (RI) [52] and recognition accuracy (ACC) [51]. The smaller the XB and DB index values and
the greater the PBM, NMI, RI and ACC index values, the better the clustering methods. The values of the validity
indices for D220 using different construction models are presented in Table 5.
From Table 5, PrP has the best performance in terms of all validity indices. According to the ACC index, the improvement of PrP over the other methods is significant. PrSOM ranks second and is also better than the remaining methods in terms of the ACC index. This implies that the three-way approximations constructed based on Pedrycz's model and the SOM-based model adhere to the natural data structure better than those of the other methods when coping with spherical data sets.
To analyze the three-way approximations-based clustering methods under noise environments, three hundred white noise points are further added to the data set D220, as displayed in Fig. 11(a); the resulting data set is denoted as D520. The obtained thresholds and prototypes are shown in Table 6.

Fig. 10. The approximation partitions for each fixed cluster under D220 without noises. (a) D220 ; (b) Pedrycz’s model; (c) Weighted Pedrycz’s model; (d) Entropy-based
model; (e) Weighted entropy-based model; (f) Yao’s model; (g) SOM-based model; (h) CH-tradeoff-based model.

Table 5
Validity indices for D220 .
Models PBM XB DB ACC NMI RI
PrP 13.985 0.157 0.586 0.964 0.850 0.955
PrW P 12.198 0.245 0.760 0.898 0.764 0.901
PrE 13.345 0.225 0.728 0.901 0.779 0.909
PrW E 13.180 0.256 0.769 0.902 0.776 0.907
PrY 11.579 0.320 0.880 0.870 0.731 0.881
PrSOM 12.641 0.229 0.707 0.932 0.806 0.928
PrCH 13.757 0.185 0.654 0.930 0.809 0.929

The final approximation region partitions are presented in Fig. 11(b)-(h). Though many noise data are added, the data structure can still be detected well by PrP, PrWP and PrSOM. This implies that the three-way approximations-based clustering methods are robust in complex environments if the models for constructing the three-way approximations are suitably selected. According to the values of AvgDis, the model PrWP performs best, namely, the prototypes obtained based on the weighted Pedrycz's model are closest to their true positions in the noisy environment.
The clustering validity indices for D520 are presented in Table 7. Obviously, PrP achieves the best performance in terms of most validity indices. With regard to the ACC index, the weighted Pedrycz's model and the SOM-based model also achieve good performance because the approximation regions are partitioned well.

5.2. UCI data sets

Sixteen data sets are selected from the UCI repository to illustrate the performances of the different principles. The information about these sixteen data sets is given in Table 8, and the clustering validity indices are presented in Tables A1–A8 in Appendix A.
According to Tables A1–A8, different models achieve the best performance in terms of different validity indices; no model achieves the best performance over all data sets with respect to all validity indices. To compare the different models statistically, the number of times each model obtains the best validity index value is reported in Table 9. It can be found that Pedrycz's model obtains the best validity index values the most times over all data sets, and the SOM-based model obtains the second most. The entropy-based model and the weighted entropy-based model obtain the smallest total counts; in particular, PrWE never achieves the best ACC index over all data sets, which implies that entropy-based principles are not suitable for three-way approximations-based unsupervised learning tasks.

5.3. Statistical analysis

In the following, we use the Friedman-Holm-Bonferroni testing framework [53], in which a ranking technique is involved, to test the significance of the differences among the construction models. Firstly, the rank value of each method on each data set is obtained, and then the total average rank of each method over all data sets is generated, as shown in Table 10.
According to the Friedman test, the statistical variable τ_F is computed as follows:

$$\tau_F = \frac{(N-1)\,\tau_{\chi^2}}{N(k-1) - \tau_{\chi^2}}, \quad (24)$$

$$\tau_{\chi^2} = \frac{12N}{k(k+1)} \left( \sum_{i=1}^{k} Ar_i^2 - \frac{k(k+1)^2}{4} \right). \quad (25)$$
τ_F obeys the F distribution with k − 1 and (k − 1)(N − 1) degrees of freedom, where k is the number of methods and N is the number of data sets. Ar_i stands for the total average rank of the ith method over all data sets. Based on the total average ranks, the value of the statistic is τ_F = 2.3071. If the confidence level is set to 0.95, the critical value is F_0.05(6, 6 × 15) = 2.2011, which is smaller than τ_F = 2.3071. Therefore, the null hypothesis is rejected, and the result of the Friedman test shows that the performance differences among the methods are statistically significant.
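The computation of Eqs. (24)-(25) from the average ranks in Table 10 can be checked with the short sketch below (function name ours); it returns τ_F ≈ 2.3071, in agreement with the value reported above.

```python
import numpy as np

def friedman_statistic(avg_ranks, n_datasets):
    """Friedman statistic tau_F of Eqs. (24)-(25) from the per-method average ranks."""
    ar = np.asarray(avg_ranks, dtype=float)
    k, n = len(ar), n_datasets
    chi2 = 12.0 * n / (k * (k + 1)) * ((ar ** 2).sum() - k * (k + 1) ** 2 / 4.0)
    return (n - 1) * chi2 / (n * (k - 1) - chi2)

# total average ranks from Table 10: PrP, PrWP, PrE, PrWE, PrY, PrSOM, PrCH
ranks = [2.831, 3.769, 4.978, 4.925, 4.031, 3.341, 4.131]
print(round(friedman_statistic(ranks, 16), 4))   # -> approximately 2.3071
```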

Fig. 11. The approximation partitions for each fixed cluster under D520 . (a) D520 ; (b) Pedrycz’s model; (c) Weighted Pedrycz’s model; (d) Entropy-based model; (e)
Weighted entropy-based model; (f) Yao’s model; (g) SOM-based model; (h) CH-tradeoff-based model.

Table 6
Obtained thresholds for D520 .
Models Threshold α Obtained prototypes AvgDis
Cluster1 Cluster2 Cluster3 Cluster1 Cluster2 Cluster3
PrP 0.3488 0.3594 0.3767 [3.8520, 4.3746] [7.2511, 9.3179] [8.4746, 3.0268] 0.6960
PrW P 0.2579 0.2907 0.2797 [4.1852, 3.8290] [7.1954, 9.3078] [8.3300, 2.9472] 0.5104
PrE 0.1596 0.1812 0.1685 [3.8659, 4.3164] [7.2931, 9.4736] [8.2200, 3.0166] 0.6517
PrW E 0.1475 0.1847 0.1678 [3.9073, 4.3961] [7.2396, 9.5302] [8.2212, 3.0172] 0.6553
PrY 0.0822 0.3031 0.2238 [3.9144, 4.4474] [7.3304, 9.4298] [8.2826, 2.9483] 0.6679
PrSOM 0.1313 0.1880 0.1790 [3.8400, 4.4098] [7.2870, 9.2537] [8.3222, 2.9788] 0.6454
PrCH 0.1156 0.1558 0.1105 [3.7496, 4.2772] [7.1815, 9.5450] [8.1978, 3.0398] 0.6856

Table 7
Validity indices for D520 .
Models PBM XB DB ACC NMI RI
PrP 3.358 0.178 0.871 0.918 0.808 0.925
PrW P 3.359 0.202 0.905 0.911 0.798 0.921
PrE 3.395 0.208 0.913 0.882 0.773 0.906
PrW E 3.305 0.203 0.914 0.860 0.758 0.894
PrY 3.258 0.180 0.890 0.877 0.745 0.894
PrSOM 3.315 0.179 0.876 0.905 0.786 0.918
PrCH 3.408 0.226 0.939 0.878 0.776 0.901

Table 8
Information of the selected UCI data sets.
Data sets  Number of data  Number of features  Clusters  Data sets  Number of data  Number of features  Clusters
Wine 178 13 3 Ionosphere 351 33 2
Glass 214 9 2 Liver 345 6 2
Breast cancer 683 9 2 Teaching 151 5 3
Diabetic 1151 19 2 Vertebral 310 6 3
Heart 270 13 2 Flowmeter A 87 36 2
ILPD 583 10 2 Flowmeter B 92 51 3
Yeast 1484 8 5 Flowmeter C 181 43 4
Seeds 210 7 3 Flowmeter D 180 43 4

Table 9
Times of the best validity indices obtained by each principle for UCI data sets.
Models PBM XB DB ACC NMI RI Total
PrP 3 5 4 5 6 5 28
PrW P 2 1 1 3 4 3 14
PrE 1 2 3 1 1 1 9
PrW E 2 2 1 0 1 1 7
PrY 0 0 0 4 5 4 13
PrSOM 5 3 1 6 2 5 22
PrCH 3 3 6 1 0 1 14

The Holm-Bonferroni test is then used as a post-hoc test for the Friedman test. We first compute the values z_ij as follows:

$$z_{ij} = \frac{\left| Ar_i - Ar_j \right|}{\sqrt{\dfrac{k(k+1)}{6N}}}. \quad (26)$$

The values of z_ij are presented in Table 11. They obey the standard normal distribution and are used to find the corresponding probabilities when computing the significance p-values, as displayed in Table 12.

Table 10
Rank values and total average rank of each method over sixteen data sets.
Data sets PrP PrW P PrE PrW E PrY PrSOM PrCH
Wine 3.7 1.6 5.4 6.0 4.8 3.4 3.2
Glass 2.2 2.2 3.8 5 5.4 5 4.4
Breast 1.5 3.5 5.3 6 6.7 1.5 3.5
Diabetic 2 4.6 6 3.7 1 3.7 7
Heart 2.5 5.3 4.1 5.3 1.9 1.9 7
ILPD 2.6 2.4 5.3 4.6 6.8 5.3 1
Yeast 3.7 3.4 4.8 4.8 3.3 2.7 5.3
Seeds 1.5 5.5 5.3 6 5.2 1.5 3
Ionosphere 2 3 7 6 1 5 4
Liver 3.4 1.7 5.3 3.4 6.3 5.1 2.8
Teaching 4.2 4.6 3.6 5.5 3.2 2.5 4.4
Vertebral 2.3 5.5 4.4 5.7 1.2 3.9 5
Flowmeter A 1 4 4 4 7 4 4.1
Flowmeter B 3.6 5.8 3.5 3.4 3.8 3.6 4.3
Flowmeter C 5.4 3.1 5.6 5.6 3.5 1.9 2.9
Flowmeter D 3.7 4.1 6.3 3.8 3.4 2.5 4.2

Total Average Rank 2.831 3.769 4.978 4.925 4.031 3.341 4.131

Table 11
z values between different methods.
zij PrP PrW P PrE PrW E PrY PrSOM PrCH
PrP 0 1.2281 2.8111 2.7417 1.5712 0.6677 1.7021
PrW P 1.2281 0 1.5830 1.5136 0.3430 0.5604 0.4740
PrE 2.8111 1.5830 0 0.0694 1.2399 2.1433 1.1090
PrW E 2.7417 1.5136 0.0694 0 1.1705 2.0739 1.0396
PrY 1.5712 0.3430 1.2399 1.1705 0 0.9034 0.1309
PrSOM 0.6677 0.5604 2.1433 2.0739 0.9034 0 1.0344
PrCH 1.7021 0.4740 1.1090 1.0396 0.1309 1.0344 0

Table 12
p-values between different methods.
p PrP PrW P PrE PrW E PrY PrSOM PrCH
PrP 0 0.2194 0.0049 0.0061 0.1161 0.5043 0.0887
PrW P 0.2194 0 0.1134 0.1301 0.7316 0.5752 0.6355
PrE 0.0049 0.1134 0 0.9447 0.2150 0.0321 0.2674
PrW E 0.0061 0.1301 0.9447 0 0.2418 0.0381 0.2985
PrY 0.1161 0.7316 0.2150 0.2418 0 0.3663 0.8958
PrSOM 0.5043 0.5752 0.0321 0.0381 0.3663 0 0.3010
PrCH 0.0887 0.6355 0.2674 0.2985 0.8958 0.3010 0

Table 13
Ratio between significance level and the sorting indices of methods.
i 1 2 3 4 5 6 7
α/(n − i + 1) 0.0071 0.0083 0.0100 0.0125 0.0167 0.0250 0.0500
Notes: α = 0.05, n stands for the number of methods.

If a p-value is less than the statistical significance level, then the null hypothesis is rejected; in this case, the difference between the two compared methods is statistically significant. Otherwise, the null hypothesis cannot be rejected. Because the testing is conducted with n independent hypotheses over multiple methods, the Bonferroni correction needs to be involved. Table 13 shows the Bonferroni correction results, i.e., the ratios α/(n − i + 1) determined by the significance level and the sorting indices of the methods (ascending sort of the p-values with respect to a fixed method, i.e., each row or each column in Table 12).
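The z statistics of Eq. (26) and the corresponding two-sided normal p-values can be reproduced from the average ranks as sketched below (function name ours); comparing each sorted p-value with the ratios in Table 13 then yields the Holm-Bonferroni decisions.

```python
import numpy as np
from scipy.stats import norm

def pairwise_z_p(avg_ranks, n_datasets):
    """Pairwise z values of Eq. (26) and two-sided standard-normal p-values."""
    ar = np.asarray(avg_ranks, dtype=float)
    k = len(ar)
    se = np.sqrt(k * (k + 1) / (6.0 * n_datasets))
    z = np.abs(ar[:, None] - ar[None, :]) / se
    return z, 2.0 * (1.0 - norm.cdf(z))

ranks = [2.831, 3.769, 4.978, 4.925, 4.031, 3.341, 4.131]   # Table 10
z, p = pairwise_z_p(ranks, 16)
print(round(z[0, 2], 4), round(p[0, 2], 4))   # PrP vs PrE: about 2.8111 and 0.0049, as in Tables 11-12
```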

With regard to PrP (the second row or second column in Table 12), the null hypotheses for PrE and PrWE are rejected because the corresponding p-values 0.0049 and 0.0061 are smaller than 0.0071 and 0.0083, respectively. This indicates that PrP performs significantly better than PrE and PrWE. However, no significant differences between PrP and PrWP, PrY, PrSOM or PrCH are detected. In addition, no significant differences are detected for PrWP, PrY, PrSOM or PrCH when compared with the other methods. Consequently, the method for constructing three-way approximations of fuzzy sets based on Pedrycz's model, i.e., PrP, performs better than the other methods in terms of the total count of best validity indices (Table 9), and it performs significantly better than the entropy-based models. This shows that a narrow boundary region together with wide positive and negative regions has beneficial effects on the clustering results, because the data structure can be detected well, more information from the positive region is kept, and the information from the negative region is totally removed. A narrow boundary region does not mean that an empty boundary region is best; an empty boundary region may not reflect the true geometric structure of two overlapping clusters. On the other hand, the case with a wide boundary region and a narrow positive region is not good for fuzzy unsupervised learning, since most items that naturally belong to a fixed cluster are partitioned into its boundary region, and most items that naturally belong to the other clusters are also partitioned into the boundary region of this cluster. In this way, some important information with respect to this fixed cluster is lost and a lot of meaningless information is involved simultaneously; therefore the computed prototypes are unavoidably deviated.

From another perspective, a narrow boundary region and a wide positive region indicate that the number of correctly classified data relative to the number of data located in the boundary region is relatively large. However, the narrowest boundary region is not necessarily the best, due to the penalty of misclassified items. A wide boundary region and a narrow positive region mean that the number of correctly classified data relative to the number of data to be inspected in the boundary region is relatively small. In this case, many items are not classified, which cannot provide enough information for prototype updating, and the prototype of a fixed cluster tends to stay within a small area in each iteration. Effective movement of the candidate prototype according to the data distribution cannot be performed during the iteration process.

6. Conclusions

Several types of methods for constructing three-way approximations of fuzzy sets are analyzed in this paper, including Pedrycz's model and its weighted version, the entropy-based model and its weighted version, Yao's model, the SOM-based model and the CH-tradeoff-based model. They can be further summarized into three categories, i.e., the uncertainty balance-based principle, the prototype-based principle and the principle based on the tradeoff between classification error and the number of data that have to be classified. Different construction models have different semantic interpretations and result in different approximation regions. Their performance is evaluated comparatively under an unsupervised learning framework with some synthetic and benchmark data sets. Statistical significance testing based on the Friedman test followed by the Holm-Bonferroni post-hoc test is also conducted; it shows that three-way approximations with a narrow boundary region and wide positive and negative regions are beneficial for updating prototypes when fuzzy unsupervised learning techniques are used. According to this observation, Pedrycz's model and the SOM-based model are preferentially recommended when no knowledge about the data structure is given in advance, whereas the entropy-based models could be the last alternatives. The discussed construction models and their comparative evaluations, including the visualization results, provide guidance for choosing a suitable principle for producing three-way approximations of fuzzy sets. In the future, we will further incorporate the notion of three-way approximations of fuzzy sets into supervised and semi-supervised learning, as well as evaluate the performance of different construction methods in these two areas.

Declaration of competing interest

The authors confirm that there are no known conflicts of interest associated with this publication and there has been
no significant financial support for this work that could have influenced its outcome.

Acknowledgements

The authors are grateful to the anonymous referees for their valuable comments and suggestions. This work is
supported by the Guangdong Natural Science Foundation (No. 2018A030310450, 2018A030310451), and partially
supported by the National Natural Science Foundation of China (No. 61976134, 61976145, 61806127, 61773328).

Appendix A

Table A1
Validity indices for Wine and Glass.
Models Wine Glass
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 4.614 0.987 1.030 0.972 0.892 0.961 1.705 0.210 0.886 0.897 0.465 0.815
PrW P 4.568 1.058 1.023 0.978 0.911 0.969 1.845 0.193 0.842 0.897 0.465 0.815
PrE 3.978 1.399 1.150 0.943 0.797 0.923 2.020 0.169 0.864 0.857 0.323 0.758
PrW E 3.969 1.539 1.190 0.941 0.789 0.920 1.559 0.331 1.105 0.808 0.276 0.713
PrY 3.918 1.001 1.065 0.949 0.837 0.934 1.062 0.421 1.448 0.737 0.076 0.611
PrSOM 4.636 1.006 1.050 0.965 0.872 0.953 2.459 0.123 0.746 0.762 0.091 0.644
PrCH 4.601 1.038 1.012 0.971 0.880 0.959 1.815 0.191 0.822 0.893 0.448 0.807

Table A2
Validity indices for Breast cancer and Diabetic.
Models Breast cancer Diabetic
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 6.484 0.102 0.722 0.952 0.712 0.908 1.620 0.262 0.976 0.545 0.007 0.504
PrW P 6.988 0.093 0.695 0.944 0.685 0.895 1.691 0.249 0.948 0.543 0.006 0.503
PrE 7.291 0.084 0.671 0.936 0.660 0.879 1.792 0.226 0.896 0.542 0.006 0.503
PrW E 7.240 0.082 0.666 0.934 0.659 0.877 1.791 0.228 0.903 0.544 0.007 0.503
PrY 7.072 0.085 0.681 0.934 0.659 0.877 1.574 0.261 0.963 0.552 0.009 0.505
PrSOM 6.616 0.099 0.710 0.952 0.712 0.908 1.752 0.236 0.919 0.544 0.007 0.503
PrCH 7.480 0.084 0.661 0.944 0.685 0.895 1.825 0.220 0.884 0.540 0.006 0.503

Table A3
Validity indices for Heart and ILPD.
Models Heart ILPD
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 0.242 1.388 2.643 0.525 0.004 0.4994 0.011 12.687 4.815 0.583 0.019 0.513
PrW P 0.258 1.296 2.550 0.519 0.003 0.4989 0.007 20.657 6.048 0.583 0.019 0.513
PrE 0.361 0.889 2.092 0.522 0.004 0.4991 0.004 31.991 7.356 0.576 0.016 0.511
PrW E 0.328 0.993 2.216 0.519 0.003 0.4989 0.087 21.023 5.582 0.568 0.023 0.509
PrY 0.000 2.68E+06 3.93E+03 0.526 0.003 0.4995 0.003 43.286 8.630 0.563 0.014 0.507
PrSOM 0.000 2.68E+06 3.93E+03 0.526 0.003 0.4995 0.005 30.755 7.230 0.576 0.017 0.511
PrCH 0.366 0.888 2.081 0.515 0.002 0.4985 0.005 26.063 6.567 0.592 0.016 0.516

Table A4
Validity indices for Yeast and Seeds.
Models Yeast Seeds
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 2.336 0.603 1.733 0.443 0.213 0.679 16.877 0.147 0.725 0.919 0.727 0.899
PrW P 1.951 0.875 2.011 0.438 0.177 0.676 17.666 0.136 0.709 0.901 0.690 0.877
PrE 1.813 0.982 2.218 0.413 0.172 0.688 15.881 0.147 0.743 0.875 0.676 0.856
PrW E 1.635 1.313 2.435 0.429 0.175 0.692 14.466 0.177 0.819 0.851 0.644 0.836
PrY 1.796 5.65E+05 4.71E+02 0.441 0.179 0.673 15.423 0.167 0.802 0.866 0.664 0.848
PrSOM 2.303 0.683 1.814 0.455 0.205 0.670 16.887 0.145 0.724 0.919 0.727 0.899
PrCH 1.779 0.748 1.989 0.408 0.155 0.657 17.368 0.134 0.702 0.910 0.722 0.886

Table A5
Validity indices for Ionosphere and Liver.
Models Ionosphere Liver
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 1.121 0.382 1.380 0.684 0.090 0.566 2.738 0.112 0.881 0.548 0.002 0.503
PrW P 1.174 0.361 1.351 0.672 0.074 0.558 2.468 0.125 0.933 0.551 0.000 0.504
PrE 1.394 0.272 1.197 0.550 0.003 0.504 0.642 1.093 1.991 0.533 0.001 0.501
PrW E 1.383 0.276 1.208 0.556 0.001 0.505 1.054 0.407 1.409 0.546 0.000 0.503
PrY 0.938 0.449 1.445 0.695 0.120 0.575 2.546 0.118 0.891 0.536 0.003 0.501
PrSOM 1.410 0.273 1.204 0.627 0.024 0.531 2.458 0.123 0.909 0.542 0.001 0.502
PrCH 1.309 0.312 1.264 0.656 0.054 0.547 2.070 0.190 1.042 0.548 0.000 0.503

Table A6
Validity indices for Teaching and Vertebral.
Models Teaching Vertebral
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 1.469 0.801 1.649 0.424 0.048 0.537 3.848 0.243 1.074 0.517 0.322 0.6400
PrW P 1.424 0.339 1.315 0.415 0.033 0.545 3.593 0.265 1.152 0.490 0.331 0.6428
PrE 1.620 0.605 1.697 0.438 0.045 0.568 2.780 0.456 1.487 0.510 0.296 0.6402
PrW E 1.754 0.434 1.355 0.403 0.029 0.549 2.993 0.393 1.350 0.486 0.280 0.6354
PrY 1.381 0.916 1.944 0.465 0.055 0.570 2.711 0.498 1.539 0.529 0.306 0.6409
PrSOM 1.318 1.023 2.054 0.469 0.053 0.566 3.188 0.322 1.282 0.504 0.323 0.6431
PrCH 1.740 0.766 1.729 0.430 0.050 0.544 2.660 0.417 1.412 0.491 0.303 0.6417

Table A7
Validity indices for Flowmeter A and Flowmeter B.
Models Flowmeter A Flowmeter B
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 8.528 0.087 0.4662 0.540 0.003 0.497 10.309 9.683 2.460 0.578 0.345 0.653
PrW P 7.788 0.095 0.4680 0.529 0.001 0.496 10.773 0.220 0.812 0.541 0.344 0.615
PrE 6.882 0.104 0.4547 0.529 0.001 0.496 9.711 0.254 0.858 0.598 0.409 0.694
PrW E 6.807 0.105 0.4552 0.529 0.001 0.496 9.593 0.216 0.859 0.593 0.373 0.673
PrY 4.580 0.145 0.5152 0.517 0.000 0.495 10.740 0.203 0.844 0.543 0.341 0.634
PrSOM 8.710 0.084 0.4724 0.529 0.001 0.496 11.314 0.190 0.816 0.546 0.344 0.636
PrCH 6.673 0.107 0.4587 0.529 0.001 0.496 9.366 0.213 0.783 0.565 0.355 0.664

Table A8
Validity indices for Flowmeter C and Flowmeter D.
Models Flowmeter C Flowmeter D
PBM XB DB ACC NMI RI PBM XB DB ACC NMI RI
PrP 2.329 38.574 6.192 0.382 0.141 0.587 12.302 20.241 4.151 0.483 0.425 0.689
PrW P 6.071 8.218 2.249 0.441 0.229 0.618 15.005 7.409 2.349 0.442 0.342 0.635
PrE 7.694 3.547 1.462 0.392 0.196 0.590 16.471 5.423 2.033 0.420 0.330 0.628
PrW E 6.682 3.458 1.554 0.392 0.196 0.590 11.575 5.960 2.030 0.448 0.368 0.665
PrY 6.533 6.883 2.305 0.424 0.215 0.607 12.819 16.588 3.056 0.461 0.426 0.678
PrSOM 3.752 14.968 4.440 0.461 0.221 0.624 10.393 9.420 2.414 0.469 0.378 0.673
PrCH 3.947 18.542 3.142 0.433 0.194 0.623 13.395 10.634 2.768 0.447 0.411 0.675

