Rough Sets and Rough-Fuzzy Clustering Algorithm

Pradipta Maji
Machine Intelligence Unit
Indian Statistical Institute, Kolkata, INDIA
E-mail: pmaji@isical.ac.in
Web: http://www.isical.ac.in/~pmaji

Introduction

• Main goal of rough sets: induction of approximations of concepts.

• It constitutes a sound basis for knowledge discovery in databases.

• Offers mathematical tools to discover patterns hidden in data.

• It can be used for feature selection, feature extraction, data reduction, decision rule generation, and pattern extraction (templates, association rules), clustering, etc.

• Identifies partial or total dependencies in data, eliminates redundant data, and gives an approach to null values, missing data, dynamic data, and others.

Basic Concepts of Rough Sets

• Information/Decision Systems (Tables)
• Indiscernibility
• Set Approximation
• Reducts and Core
• Rough Membership
• Dependency of Attributes



Information Systems/Tables

U A1 A2 ‰ IS is a pair (U, A)

x1 16-30 50 ‰ U is a non-empty finite set of


x2 16-30 0 objects.
x3 31-45 1-25
x4 31-45 1-25
‰ A is a non-empty finite set of
x5 46-60 26-49 attributes such that a : U → Va
x6 16-30 26-49
for every a ∈ A.
x7 46-60 26-49

‰ Va is called the value set of a.


Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
Decision Systems/Tables: Issues

  U    A1     A2     Walk
  x1   16-30  50     yes
  x2   16-30  0      no
  x3   31-45  1-25   no
  x4   31-45  1-25   yes
  x5   46-60  26-49  no
  x6   16-30  26-49  yes
  x7   46-60  26-49  no

• DS: T = (U, A ∪ {d}).
• Elements of A = {A1, A2} are called the condition attributes.
• d ∉ A is the decision attribute (instead of one we can consider more decision attributes).
• The same or indiscernible objects may be represented several times.
• Some of the attributes may be superfluous.

An Example of Indiscernibility

(Refer to the decision table above.)

• Non-empty subsets of condition attributes are {A1}, {A2}, and {A1, A2}.

• IND({A1}) = {{x1,x2,x6}, {x3,x4}, {x5,x7}}

• IND({A2}) = {{x1}, {x2}, {x3,x4}, {x5,x6,x7}}

• IND({A1, A2}) = {{x1}, {x2}, {x3,x4}, {x5,x7}, {x6}}.

Indiscernibility

• Equivalence relation: a binary relation R ⊆ X × X which is
  • reflexive (xRx for any object x),
  • symmetric (if xRy then yRx), and
  • transitive (if xRy and yRz then xRz).

• The equivalence class [x]_R of an element x ∈ X consists of all objects y ∈ X such that xRy.

• Let IS = (U, A) be an information system; then with any B ⊆ A there is an associated equivalence relation

      IND_IS(B) = {(x, x′) ∈ U² | ∀a ∈ B, a(x) = a(x′)}

  where IND_IS(B) is called the B-indiscernibility relation.
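As a concrete illustration, the following minimal Python sketch (the names and data layout are ours, not from the slides) computes the B-indiscernibility classes for the example decision table shown earlier:

```python
from collections import defaultdict

# The example decision table from the earlier slides (condition attributes only).
table = {
    "x1": {"A1": "16-30", "A2": "50"},
    "x2": {"A1": "16-30", "A2": "0"},
    "x3": {"A1": "31-45", "A2": "1-25"},
    "x4": {"A1": "31-45", "A2": "1-25"},
    "x5": {"A1": "46-60", "A2": "26-49"},
    "x6": {"A1": "16-30", "A2": "26-49"},
    "x7": {"A1": "46-60", "A2": "26-49"},
}

def indiscernibility_classes(table, B):
    """Partition U into the equivalence classes of IND(B): two objects fall
    into the same class iff they agree on every attribute in B."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in B)].add(obj)
    return list(classes.values())

print(indiscernibility_classes(table, ["A1"]))
# [{'x1', 'x2', 'x6'}, {'x3', 'x4'}, {'x5', 'x7'}]
```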
Observations

• An equivalence relation induces a partitioning of the universe.

• The partitions can be used to build new subsets of the universe.

• Subsets that are most often of interest have the same value of the decision attribute.

• It may happen, however, that a concept such as "Walk" cannot be defined in a crisp manner.



An Example of Set Approximation

(Refer to the decision table above.)

• Let W = {x | Walk(x) = yes}. Then

      A̲W = {x1, x6},
      ĀW = {x1, x3, x4, x6},
      BN_A(W) = {x3, x4},
      U − ĀW = {x2, x5, x7}.

• The decision class Walk is rough since the boundary region is not empty.


An Example of Set Approximation

[Diagram: the universe partitioned into the lower approximation A̲W = {{x1}, {x6}} ("yes"), the boundary {{x3,x4}} ("yes/no"), and the outside region {{x2}, {x5,x7}} ("no").]


Set Approximation

• Let T = (U, A), and let B ⊆ A and X ⊆ U.

• We can approximate X using only the information contained in B by constructing the B-lower and B-upper approximations of X, denoted B̲X and B̄X respectively, where

      B̲X = {x | [x]_B ⊆ X},
      B̄X = {x | [x]_B ∩ X ≠ ∅}.


Set Approximation

• The B-boundary region of X, BN_B(X) = B̄X − B̲X, consists of those objects that we cannot decisively classify into X on the basis of B.

• The B-outside region of X, U − B̄X, consists of those objects that can with certainty be classified as not belonging to X.

• A set is said to be rough if its boundary region is non-empty; otherwise the set is crisp.
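Continuing the sketch from the indiscernibility slide, the lower and upper approximations follow directly from these definitions (illustrative code, not from the slides):

```python
def approximations(table, B, X):
    """Return the (B-lower, B-upper) approximations of a set X of objects."""
    lower, upper = set(), set()
    for eq_class in indiscernibility_classes(table, B):
        if eq_class <= X:      # [x]_B lies wholly inside X
            lower |= eq_class
        if eq_class & X:       # [x]_B intersects X
            upper |= eq_class
    return lower, upper

W = {"x1", "x4", "x6"}         # Walk = yes in the example decision table
low, up = approximations(table, ["A1", "A2"], W)
print(low)               # {'x1', 'x6'}          (definitely Walk)
print(up - low)          # {'x3', 'x4'}          (boundary region)
print(set(table) - up)   # {'x2', 'x5', 'x7'}    (outside region)
```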


Rough Sets

[Diagram: a set X drawn over the granules U/R; the lower approximation R̲X lies inside X, the upper approximation R̄X covers it, and R̄X − X is the part of the upper approximation outside X. R: a subset of attributes.]


Properties of Approximations

      B̲(X) ⊆ X ⊆ B̄(X)
      B̲(∅) = B̄(∅) = ∅,   B̲(U) = B̄(U) = U
      B̄(X ∪ Y) = B̄(X) ∪ B̄(Y)
      B̲(X ∩ Y) = B̲(X) ∩ B̲(Y)
      X ⊆ Y implies B̲(X) ⊆ B̲(Y) and B̄(X) ⊆ B̄(Y)


Properties of Approximations

      B̲(X ∪ Y) ⊇ B̲(X) ∪ B̲(Y)
      B̄(X ∩ Y) ⊆ B̄(X) ∩ B̄(Y)
      B̲(−X) = −B̄(X)
      B̄(−X) = −B̲(X)
      B̲(B̲(X)) = B̄(B̲(X)) = B̲(X)
      B̄(B̄(X)) = B̲(B̄(X)) = B̄(X)

where −X denotes U − X.
Four Basic Classes of Rough Sets

• X is roughly B-definable iff B̲(X) ≠ ∅ and B̄(X) ≠ U,

• X is internally B-undefinable iff B̲(X) = ∅ and B̄(X) ≠ U,

• X is externally B-undefinable iff B̲(X) ≠ ∅ and B̄(X) = U,

• X is totally B-undefinable iff B̲(X) = ∅ and B̄(X) = U.


Accuracy of Approximation

      α_B(X) = |B̲(X)| / |B̄(X)|

where |X| denotes the cardinality of X ≠ ∅. Obviously 0 ≤ α_B(X) ≤ 1.

• If α_B(X) = 1, X is crisp with respect to B.

• If α_B(X) < 1, X is rough with respect to B.


Issues in the Decision Table / Reducts

• The same or indiscernible objects may be represented several times.

• Some of the attributes may be superfluous (redundant); that is, their removal cannot worsen the classification.

• Keep only those attributes that preserve the indiscernibility relation and, consequently, the set approximation.

• There are usually several such subsets of attributes, and those which are minimal are called reducts.


Dispensable & Indispensable Attributes

• C-positive region of D:

      POS_C(D) = ∪_{X ∈ U/D} C̲X

• Let c ∈ C. Attribute c is dispensable in T if POS_C(D) = POS_(C−{c})(D); otherwise attribute c is indispensable in T.

• T = (U, C, D) is independent if all c ∈ C are indispensable in T.


Reduct and Core

• The set of attributes R ⊆ C is called a reduct of C if T′ = (U, R, D) is independent and POS_R(D) = POS_C(D).

• The set of all the condition attributes indispensable in T is denoted by CORE(C):

      CORE(C) = ∩ RED(C)

  where RED(C) is the set of all reducts of C.
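For small tables, reducts and the core can be found by brute force. A sketch building on the earlier helpers (exponential in the number of attributes, so purely illustrative; the flu table anticipates the example on the next slide):

```python
from itertools import combinations

def positive_region(table, C, D):
    """POS_C(D): union of the C-lower approximations of the D-decision classes."""
    pos = set()
    for dec_class in indiscernibility_classes(table, D):
        pos |= approximations(table, C, dec_class)[0]
    return pos

def reducts(table, C, D):
    """All minimal R subsets of C with POS_R(D) = POS_C(D), by exhaustive search."""
    full, found = positive_region(table, C, D), []
    for r in range(1, len(C) + 1):
        for R in combinations(C, r):
            if not any(set(S) <= set(R) for S in found) \
               and positive_region(table, list(R), D) == full:
                found.append(R)
    return found

flu = {  # the flu table used on the next slide
    "U1": {"Headache": "Y", "MusclePain": "Y", "Temp": "Normal",    "Flu": "N"},
    "U2": {"Headache": "Y", "MusclePain": "Y", "Temp": "High",      "Flu": "Y"},
    "U3": {"Headache": "Y", "MusclePain": "Y", "Temp": "Very-high", "Flu": "Y"},
    "U4": {"Headache": "N", "MusclePain": "Y", "Temp": "Normal",    "Flu": "N"},
    "U5": {"Headache": "N", "MusclePain": "N", "Temp": "High",      "Flu": "N"},
    "U6": {"Headache": "N", "MusclePain": "Y", "Temp": "Very-high", "Flu": "Y"},
}
print(reducts(flu, ["Headache", "MusclePain", "Temp"], ["Flu"]))
# [('Headache', 'Temp'), ('MusclePain', 'Temp')]  ->  CORE = {'Temp'}
```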


An Example of Reducts and Core
Reduct1 = {Muscle-pain,Temp.}
U Headache Muscle Temp. Flu U Muscle Temp. Flu
pain pain
U1 Yes Yes Normal No U4 Yes
U 11,U4 Y Normall
N No
N
U2 Yes Yes High Yes U2 Yes High Yes
U3 Yes Yes Very-high Yes U3,U6 Yes Very-high Yes
U4 No Yes Normal No U5 No High No
U5 N
No N
No Hi h
High N
No
U6 No Yes Very-high Yes Reduct2 = {Headache, Temp.}
U Headache Temp. Flu

U1 Yes Norlmal No
CORE = {Headache,Temp} I U2 Yes High Yes
{{MusclePain, Temp}
p} U3 Yes Very-high Yes
U4 N
No N
Normall N
No
= {Temp} U5 No High No
U6 No Very-high Yes
Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
Dependency of Attributes

• Discovering dependencies between attributes is an important issue in KDD.

• A set of attributes D depends totally on a set of attributes C, denoted C ⇒ D, if all values of attributes from D are uniquely determined by values of attributes from C.


Dependency of Attributes

• Let D and C be subsets of A. We say that D depends on C in a degree k (0 ≤ k ≤ 1), denoted by C ⇒_k D, if

      k = γ(C, D) = |POS_C(D)| / |U|

  where POS_C(D) = ∪_{X ∈ U/D} C̲(X) is called the C-positive region of D.


Dependency of Attributes

• Obviously

      k = γ(C, D) = Σ_{X ∈ U/D} |C̲(X)| / |U|.

• If k = 1, we say that D depends totally on C.

• If k < 1, we say that D depends partially (in a degree k) on C.
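With the positive-region helper from the earlier sketch, the dependency degree is one line (illustrative only):

```python
def dependency_degree(table, C, D):
    """k = gamma(C, D) = |POS_C(D)| / |U|."""
    return len(positive_region(table, C, D)) / len(table)

print(dependency_degree(flu, ["Temp"], ["Flu"]))               # 0.666... (partial)
print(dependency_degree(flu, ["Headache", "Temp"], ["Flu"]))   # 1.0 (total)
```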


Rough Membership

• The rough membership function quantifies the degree of relative overlap between the set X and the equivalence class [x]_B to which x belongs:

      μ_X^B : U → [0, 1],   μ_X^B(x) = |[x]_B ∩ X| / |[x]_B|

• The rough membership function can be interpreted as a frequency-based estimate of P(x ∈ X | u), where u is the equivalence class of IND(B).


Rough Membership

• The formulae for the lower and upper approximations can be generalized to some arbitrary level of precision π ∈ (0.5, 1] by means of the rough membership function:

      B̲_π X = {x | μ_X^B(x) ≥ π},
      B̄_π X = {x | μ_X^B(x) > 1 − π}.

• Note: the lower and upper approximations as originally formulated are obtained as a special case with π = 1.
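A sketch of the rough membership function and the π-precision approximations it induces (again reusing the earlier helpers; π = 1 recovers the classical case):

```python
def rough_membership(table, B, X, x):
    """mu_X^B(x) = |[x]_B ∩ X| / |[x]_B|."""
    eq = next(c for c in indiscernibility_classes(table, B) if x in c)
    return len(eq & X) / len(eq)

def precision_approximations(table, B, X, pi):
    """Lower/upper approximations at precision level pi in (0.5, 1]."""
    mu = {x: rough_membership(table, B, X, x) for x in table}
    return ({x for x, m in mu.items() if m >= pi},
            {x for x, m in mu.items() if m > 1 - pi})

W = {"x1", "x4", "x6"}
print(rough_membership(table, ["A1", "A2"], W, "x3"))   # 0.5 (boundary object)
print(precision_approximations(table, ["A1", "A2"], W, 1.0))
# ({'x1', 'x6'}, {'x1', 'x3', 'x4', 'x6'}) -- the classical approximations
```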
Outline

• Rough-Fuzzy Computing

• Clustering: Hard and Fuzzy

• Rough-Fuzzy Clustering
  • Design of Algorithm
  • Objective Function; Cluster Prototypes; Membership Functions
  • Convergence of Algorithm; Generalization of Existing Algorithms

• Performance Analysis
  • Numerical Data Sets (Clustering Benchmark Data; Segmentation of Brain MR Images; Text-Graphics Segmentation; Clustering Functionally Similar Genes)
  • Non-numerical Data Sets (Amino Acid Sequence Analysis)

• Conclusion
Rough Sets

• It approximates an arbitrary set X by a pair of lower and upper approximations:
  • Lower approximation R(X): collection of those elements of U (the universe of discourse) that definitely belong to X;
  • Upper approximation Ṝ(X): collection of those elements of U that possibly belong to X.

• The interval [R(X), Ṝ(X)] is the representation of an ordinary set X in the approximation space <U, R>, or simply the rough set of X.

• Key notions of rough sets:
  • Information granule (clump of similar objects): formalizes the concept of finite-precision representation of objects in real-life situations; and
  • Reducts: represent the core of an information system, both in terms of objects and features, in a granular universe.


Granular Computing

• Granular computing refers to the domain where computation and operations are performed on information granules.

• Granular computing is applicable to situations where
  1) a problem involves incomplete, uncertain, and vague information; it may be difficult to differentiate distinct elements, and one may find it convenient to consider granules for its handling; or
  2) though detailed information is available, it may be sufficient to use granules in order to have an efficient and practical solution.

• The computation time is greatly reduced as computations are performed on granules rather than on individual data points.
  • Useful for designing scalable pattern recognition algorithms.


Rough-Fuzzy Computing

• There are usually real-valued data and fuzzy information in real-world applications.

• Fuzzy set theory hinges on the notion of a membership function on the domain of discourse, assigning to each object a grade of belongingness in order to represent an imprecise or overlapping concept.

• Rough-fuzzy computing: combining fuzzy sets and rough sets provides a mathematical framework to capture uncertainties associated with the data. They are complementary in some aspects:
  • while the membership functions of fuzzy sets enable efficient handling of overlapping classes,
  • the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in class definitions.
Clustering

• Cluster analysis: a technique for finding natural groups present in data.
  • Divides a data set into a set of clusters in such a way that two objects from the same cluster are as similar as possible and objects from different clusters are as dissimilar as possible.

• Hard clustering: k-means or hard c-means (HCM) algorithm.

• Basic steps of HCM (see the sketch after this list):
  • Assign initial means or centroids vi, i = 1, 2, ..., c.
  • For each object xj, calculate the distance dij between itself and the centroid vi of the ith cluster.
  • If dij is minimum for 1 ≤ i ≤ c, then assign xj to the ith cluster.
  • Compute new centroids.
  • Repeat the above three steps until no more new assignments can be made.
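A bare-bones NumPy sketch of these HCM steps (initialization by random sampling and the stopping rule are our choices, not from the slides):

```python
import numpy as np

def hard_c_means(X, c, max_iter=100, seed=0):
    """Hard c-means. X: (n, d) data array. Returns (centroids, labels)."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    v = X[rng.choice(len(X), size=c, replace=False)].copy()  # initial centroids
    labels = None
    for _ in range(max_iter):
        # d_ij: distance between each object x_j and each centroid v_i
        d = np.linalg.norm(X[:, None, :] - v[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)           # assign x_j to nearest cluster
        if labels is not None and np.array_equal(new_labels, labels):
            break                               # no more new assignments
        labels = new_labels
        for i in range(c):                      # compute new centroids
            if np.any(labels == i):
                v[i] = X[labels == i].mean(axis=0)
    return v, labels
```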
Hard Clustering

• Hard clustering: each object is assigned to exactly one cluster.

• In real data analysis, uncertainties arise due to overlapping cluster boundaries and incompleteness and vagueness in cluster definition.

• Also, noise and outliers are unavoidable.


Fuzzy Clustering

• Fuzzy c-means (FCM): allows gradual memberships; assigns to an object memberships which are inversely related to the relative distance of the object to the cluster prototypes.

• Offers the opportunity to deal with data that belong to more than one cluster at the same time.
  • Can deal with uncertainties arising from overlapping cluster boundaries.

• Resulting membership values of FCM:
  • do not always correspond well to degrees of belonging of the data;
  • may be inaccurate in a noisy environment.

J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
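A minimal sketch of the standard FCM alternating updates (fuzzifier m > 1; this is the textbook formulation, not code from the slides):

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Standard FCM. Returns (centroids, memberships mu of shape (c, n))."""
    X = np.asarray(X, float)
    rng = np.random.default_rng(seed)
    v = X[rng.choice(len(X), size=c, replace=False)].copy()
    for _ in range(max_iter):
        d = np.linalg.norm(X[None, :, :] - v[:, None, :], axis=2) + 1e-12
        # mu_ij inversely related to relative distance:
        # mu_ij = d_ij^(-2/(m-1)) / sum_k d_kj^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        mu = inv / inv.sum(axis=0)
        w = mu ** m                              # fuzzified weights
        v_new = (w @ X) / w.sum(axis=1, keepdims=True)
        if np.linalg.norm(v_new - v) < eps:
            return v_new, mu
        v = v_new
    return v, mu
```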
Possibilistic Clustering

• Possibilistic c-means (PCM): uses a possibilistic type of membership function.

• Produces memberships that have a good explanation of degrees of belonging for the data.
  • Reduces the weakness of FCM in a noisy environment.

• Resulting membership values of PCM:
  • may generate coincident clusters.

R. Krishnapuram and J. M. Keller, "A Possibilistic Approach to Clustering," IEEE Trans. Fuzzy Systems, vol. 1, no. 2, pp. 98–110, 1993.
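For contrast with FCM, a sketch of the PCM typicality update from Krishnapuram and Keller, where each cluster is evaluated independently of the others (η_i is a per-cluster scale parameter; illustrative only):

```python
import numpy as np

def pcm_memberships(X, v, eta, m=2.0):
    """nu_ij = 1 / (1 + (d_ij^2 / eta_i)^(1/(m-1))).
    Unlike FCM, the memberships of an object need not sum to 1 over clusters,
    which is what can let two prototypes drift onto coincident clusters."""
    X, v = np.asarray(X, float), np.asarray(v, float)
    d2 = ((X[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)  # squared distances
    return 1.0 / (1.0 + (d2 / np.asarray(eta, float)[:, None]) ** (1.0 / (m - 1.0)))
```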


Fuzzy-Possibilistic Clustering

• Fuzzy-possibilistic c-means (FPCM): uses both fuzzy and possibilistic memberships.

• Weighted average of FCM and PCM:
  • reduces the weakness of FCM in a noisy environment;
  • reduces the coincident-cluster generation problem of PCM.

N. R. Pal, K. Pal, J. M. Keller, and J. C. Bezdek, "A Possibilistic Fuzzy C-Means Clustering Algorithm," IEEE Trans. Fuzzy Systems, vol. 13, no. 4, pp. 517–530, 2005.

F. Masulli and S. Rovetta, "Soft Transition from Probabilistic to Possibilistic Fuzzy Clustering," IEEE Trans. Fuzzy Systems, vol. 14, no. 4, pp. 516–527, 2006.


Rough-Fuzzy Clustering

• Rough-fuzzy-possibilistic clustering: adds memberships of fuzzy sets, and lower and upper approximations of rough sets, into hard clustering.

• While the membership function enables handling of overlapping partitions, rough sets deal with vagueness and incompleteness in class definition.

• Each cluster is represented by a cluster prototype, a crisp lower approximation, and a fuzzy (probabilistic and possibilistic) boundary.

P. Maji and S. K. Pal, "Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, 37(6), pp. 1529–1540, 2007.
Rough-Fuzzy-Possibilistic Clustering

• According to rough sets, if xj belongs to the lower approximation of the ith cluster, it does not belong to the lower approximation of any other cluster; that is, xj is contained in the ith cluster definitely.

• Hence, the memberships of objects in the lower approximation of a cluster
  • should be independent of other centroids and clusters, and
  • should not be coupled with their similarity with respect to other cluster prototypes.
  • Also, objects in the lower approximation of a cluster should have similar influence on the corresponding cluster prototype and cluster.

• If xj belongs to the boundary region of the ith cluster, it possibly belongs to the ith cluster and potentially belongs to other clusters; hence, objects in boundary regions should have different influence on the cluster prototypes and clusters.

P. Maji and S. K. Pal, "Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, 37(6), pp. 1529–1540, 2007.
Rough-Fuzzy-Possibilistic Clustering

• So, membership values of objects in the lower approximation are 1, while those in the boundary region are the same as in fuzzy and possibilistic clustering.

• It partitions each cluster into two classes: lower approximation (core) and boundary (overlapping). Only objects in the boundary are fuzzified.

• Each cluster is represented by a cluster prototype, a crisp lower approximation, and a fuzzy (probabilistic and possibilistic) boundary.

P. Maji and S. K. Pal, "Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, 37(6), pp. 1529–1540, 2007.
Objective Function

[The objective-function equation appeared here as an image; see the reconstruction below.]

• xj and vi: object and cluster prototype or centroid
• w and ŵ (= 1 − w): relative importance of lower and boundary region
• a and b: relative importance of probabilistic and possibilistic memberships
• μij and νij: probabilistic and possibilistic memberships
• m1 and m2: probabilistic and possibilistic fuzzifiers
• ηi: scale parameter, represents the size of the ith cluster
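The equation itself did not survive extraction. A plausible reconstruction in the notation above, following the general form of the RFPCM objective in Maji and Pal (2007); the exact grouping of terms and the handling of empty lower/boundary regions should be checked against the cited paper:

```latex
J_{\mathrm{RFP}} = w\,\mathcal{A} + \hat{w}\,\mathcal{B}, \qquad
\mathcal{A} = \sum_{i=1}^{c}\ \sum_{x_j \in \underline{A}(\beta_i)}
      \lVert x_j - v_i \rVert^{2},
% objects in the lower approximation carry membership 1
\mathcal{B} = \sum_{i=1}^{c}\ \sum_{x_j \in B(\beta_i)}
      \bigl\{ a\,\mu_{ij}^{m_1} + b\,\nu_{ij}^{m_2} \bigr\}
      \lVert x_j - v_i \rVert^{2}
   + \sum_{i=1}^{c} \eta_i \sum_{x_j \in B(\beta_i)} (1-\nu_{ij})^{m_2}
```

Here A̲(β_i) and B(β_i) denote the crisp lower approximation and the fuzzy boundary region of cluster β_i.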
Cluster Prototype and Memberships

[The prototype and membership update equations appeared here as images, annotated with the cardinality of the lower approximation and the cardinality of the boundary region.]
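The update images are lost, but their structure is stated on the next slide (v_i = w·v̲_i + ŵ·ύ_i). A hedged Python sketch of that weighted prototype update for one cluster; the FPCM-style weighting in the boundary term is our assumption:

```python
import numpy as np

def rfp_prototype(X_lower, X_boundary, mu_b, nu_b, w=0.95, a=0.5, b=0.5,
                  m1=2.0, m2=2.0):
    """v_i = w * (mean over the crisp lower approximation)
           + (1 - w) * (fuzzy/possibilistic weighted mean over the boundary)."""
    v_low = X_lower.mean(axis=0)                 # objects with membership 1
    wt = a * mu_b ** m1 + b * nu_b ** m2         # boundary weights
    v_bnd = (wt[:, None] * X_boundary).sum(axis=0) / wt.sum()
    return w * v_low + (1.0 - w) * v_bnd
```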


Convergence of Proposed Algorithm

• Cluster prototype: v_i^RFP = w · v̲_i^RFP + ŵ · ύ_i^RFP.
• Convergence of v_i^RFP depends on the convergence of v̲_i^RFP and ύ_i^RFP.

• A set of linear equations in terms of v̲_i^RFP and ύ_i^RFP is obtained if both μij and νij are kept constant.

• Gauss–Seidel algorithm: for both matrices to be diagonally dominant, we must have a sufficient condition (not a necessary one).

• Bezdek's theorem: if the cluster prototypes and memberships are computed alternately in each iteration, then the proposed algorithm converges, at least along a subsequence, to a local optimum solution.
Core and Boundary Region

• It starts by choosing c objects as centroids of the c clusters. The fuzzy and possibilistic memberships of all objects are then calculated.

• After computing the memberships for c clusters and n objects, the difference of the two highest memberships of each object is compared with a threshold value δ.

• Let uij and ukj be the highest and second-highest memberships of object xj, where uij = a μij + b νij.
  • If (uij − ukj) < δ, then xj belongs to the boundary regions of the ith and kth clusters, and xj does not belong to the lower approximation of any cluster.
  • Otherwise, xj belongs to the lower approximation (core) of the ith cluster.

• After defining the lower approximations and boundary regions of the different clusters based on δ, the memberships uij of objects in lower approximations are set to 1, while those in boundary regions remain unchanged (see the sketch below).
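A sketch of this δ-threshold assignment (illustrative names; mu and nu would come from FCM/PCM-style updates such as the earlier sketches):

```python
import numpy as np

def split_core_boundary(mu, nu, a=0.5, b=0.5, delta=0.1):
    """Assign objects to crisp cores or to fuzzy boundary regions.
    mu, nu: (c, n) probabilistic and possibilistic membership matrices."""
    u = a * mu + b * nu
    order = np.argsort(u, axis=0)
    best, second = order[-1], order[-2]       # clusters with two highest u_.j
    cols = np.arange(u.shape[1])
    gap = u[best, cols] - u[second, cols]
    core, boundary = {}, {}
    for j in cols:
        if gap[j] < delta:                    # x_j lies between two clusters
            boundary[j] = (best[j], second[j])
        else:                                 # x_j definitely in one cluster
            core[j] = best[j]
            u[best[j], j] = 1.0               # core memberships are set to 1
    return core, boundary, u
```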
Generalization of Existing Algorithms

• If δ = 0, RFPCM reduces to HCM (hard c-means).

• If δ = 1, RFPCM reduces to FPCM (fuzzy-possibilistic c-means):
  • a = 1: RFPCM reduces to FCM (fuzzy c-means);
  • a = 0: RFPCM reduces to PCM (possibilistic c-means).

• If δ ≠ 0, 1 and a = 1, RFPCM reduces to RFCM (rough-fuzzy c-means).

• If δ ≠ 0, 1 and a = 0, RFPCM reduces to RPCM (rough-possibilistic c-means).

• The value of the threshold,

      δ = (1/n) Σ_{j=1}^{n} (uij − ukj),

  represents the average difference of the two highest memberships of all objects.
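With the conventions of the previous sketch, this data-driven δ is a one-liner:

```python
import numpy as np

def threshold_delta(mu, nu, a=0.5, b=0.5):
    """delta = average gap between the two highest combined memberships."""
    u = np.sort(a * mu + b * nu, axis=0)   # ascending per object (column)
    return float(np.mean(u[-1] - u[-2]))
```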


Different Rough-Fuzzy Clustering

[Figure: membership surfaces produced by RFCM, RPCM, and RFPCM.]

For objects near to the centers and far from the separating line, a different shape of memberships is apparent.


Quantitative Indices
Davies-Bouldin
Davies Bouldin (DB) index is a function of the ratio of sum of within-cluster
within cluster
distance to between-cluster separation:

A good clustering procedure should make DB index value as low as possible.


possible

Dunn’s (D) index is designed to identify sets of clusters that are compact and
well separated. It maximizes

A good clustering procedure should make D


D-- index value as high as possible.
β-Index:
β Index: It is defined as ratio of total variation and within
within-cluster
cluster variation:

For a given image and c value, higher the homogeneity within the segmented
regions, higher would be β value
value. The value also increases with c.
Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
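As one example of these indices, a sketch of the DB index in its standard form (the slide's exact variant is not shown, so this is the common textbook definition):

```python
import numpy as np

def davies_bouldin(X, labels, centroids):
    """DB = (1/c) * sum_i max_{k != i} (S_i + S_k) / d(v_i, v_k),
    with S_i the mean within-cluster distance to centroid v_i. Lower is better."""
    X, centroids = np.asarray(X, float), np.asarray(centroids, float)
    c = len(centroids)
    S = np.array([np.linalg.norm(X[labels == i] - centroids[i], axis=1).mean()
                  for i in range(c)])
    return np.mean([max((S[i] + S[k]) / np.linalg.norm(centroids[i] - centroids[k])
                        for k in range(c) if k != i)
                    for i in range(c)])
```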
Clustering on Benchmark Data

  Data    Objects  Features  Clusters
  Iris    150      4         3
  Glass   214      10        6
  Ionosp  351      34        2
  Wine    178      13        3
  Wiscon  569      30        2

[Bar charts: DB index, Dunn index, and execution time (milliseconds) of HCM, FCM, FPCM-MR, KHCM, KFCM, RCM, RFCM-MBP, RFCM, RPCM, and RFPCM on IRIS, GLASS, IONOSP, WINE, and WISCON.]

RFPCM provides higher Dunn, and lower DB and execution time.

Brain MRI Segmentation (AMRI, Kolkata)

[Figure: example brain MR images and their segmented outputs.]

Each image is of size 256 × 180 with 16-bit gray levels, segmented into four categories, namely CSF, gray matter, white matter, and background.
Brain MRI Segmentation (AMRI, Kolkata)

[Bar charts: DB index, Dunn index, β index, and execution time (milliseconds) of HCM, FCM, FPCM-MR, KHCM, KFCM, RCM, RFCM-MBP, RFCM, RPCM, and RFPCM on Image-1 through Image-4.]

RFPCM provides lower DB, higher Dunn and β index, and lower execution time.


Brain MRI Segmentation (BrainWeb: SBD)

[Bar charts: DB index, Dunn index, and β index of the same ten c-means algorithms at noise levels 0%, 1%, 3%, 5%, 7%, and 9%.]

The noise is calculated relative to the brightest tissue. The performance of RFPCM is better than that of any other c-means algorithm.
Text-Graphics Segmentation

[Figures, shown on successive slides: text-graphics segmentation results of a document image using HCM, FCM, PCM, FPCM, RFCM, RPCM, and RFPCM.]



Clustering Functionally Similar Genes

  Data Set   Genes  Conditions  Clusters
  Yeast      474    7           4
  Cho et al  6457   17          33
  Auble      6225   4           10

[Bar chart: Z-scores of FCM, PCM, FPCM, RFCM, RPCM, and RFPCM on the Yeast, Cho et al, and Auble data.]

P-values of biological functions (yeast sporulation data):

  FCM       RFPCM     Biological Function
  2.81e-41  2.39e-42  Sporulation resulting in formation of cellular spore
  7.24e-24  1.02e-24  Meiosis I
  1.84e-16  5.21e-21  Ribosome biogenesis
  7.19e-10  1.66e-11  Glycolysis

• Highest Z-scores of 36.3 for Yeast, 1.3 for Auble, and 7.34 for Cho et al are obtained using RFPCM.
• The lowest P-value is achieved for RFPCM: RFPCM provides more biologically significant clusters in the case of the yeast sporulation data.
Rough-Fuzzy C-Medoids Algorithm

• Cluster prototypes (medoids): v_i = x_q, where q is selected by the criterion below. [The selection equation appeared here as an image.]

• Average normalized homology alignment score of input subsequences with respect to their corresponding prototypes: increases with increase in homology alignment scores within a cluster.

• Maximum normalized homology alignment score between prototypes: the homology alignment score between all prototypes should be as low as possible.

P. Maji and S. K. Pal, "Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis," IEEE Trans. Knowledge and Data Engineering, 19(6), pp. 859–872, 2007.
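A hedged sketch of the medoid-update step for a relational (pairwise-similarity) setting: each prototype becomes the object that maximizes the membership-weighted similarity to the cluster's members. The weighting below is the generic fuzzy c-medoids rule, not necessarily the exact RFCMdd criterion:

```python
import numpy as np

def update_medoids(S, mu, m=2.0):
    """S: (n, n) pairwise similarity matrix (e.g., normalized homology
    alignment scores); mu: (c, n) memberships.
    Returns the index q of the new medoid for each of the c clusters."""
    w = mu ** m                 # membership weights
    # score of candidate q for cluster i: sum_j w_ij * S[q, j]
    scores = w @ S.T            # shape (c, n)
    return scores.argmax(axis=1)
```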
Amino Acid Sequence Analysis (NCBI)

• Five HIV proteins (AAC82593, AAG42635, AAO40777, NP057849, NP057850), Caspase Cleavage, and Cai-Chou HIV data.

• The Dayhoff amino acid mutation matrix is used to compute the homology score.

[Bar charts: β index and γ index of RFCMdd, FCMdd, RCMdd, HCMdd, MI, and GAFR on the seven data sets.]

• RFCMdd provides higher β index and lower γ index values.


Conclusion and Future Direction

• A new clustering algorithm, integrating rough sets and fuzzy clustering:
  • a combination of restrictive (hard) and descriptive (fuzzy) clustering.

• The concept of a crisp lower bound and a fuzzy boundary deals with uncertainty, vagueness, and incompleteness in class definition. The membership function efficiently handles overlapping partitions.

• Use of rough sets and fuzzy memberships adds a small computational load to the hard clustering algorithm; however, the corresponding integrated method shows a definite increase in performance.

• Rough-fuzzy c-medoids (relational or pair-wise clustering) can be applied for clustering relational data sets of bioinformatics, data mining, and web mining.


References

• J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
• R. Krishnapuram and J. M. Keller, "A Possibilistic Approach to Clustering," IEEE Trans. Fuzzy Systems, vol. 1, no. 2, pp. 98–110, 1993.
• N. R. Pal, K. Pal, J. M. Keller, and J. C. Bezdek, "A Possibilistic Fuzzy C-Means Clustering Algorithm," IEEE Trans. Fuzzy Systems, vol. 13, no. 4, pp. 517–530, 2005.
• F. Masulli and S. Rovetta, "Soft Transition From Probabilistic to Possibilistic Fuzzy Clustering," IEEE Trans. Fuzzy Systems, vol. 14, no. 4, pp. 516–527, 2006.
• P. Lingras and C. West, "Interval Set Clustering of Web Users with Rough K-Means," Journal of Intelligent Information Systems, vol. 23, no. 1, pp. 5–16, 2004.
• S. Mitra, H. Banka, and W. Pedrycz, "Rough-Fuzzy Collaborative Clustering," IEEE Trans. Systems, Man, and Cybernetics, Part B, vol. 36, pp. 795–805, 2006.
• J. C. Bezdek and N. R. Pal, "Some New Indexes of Cluster Validity," IEEE Trans. Systems, Man, and Cybernetics, Part B, vol. 28, pp. 301–315, 1998.
• S. K. Pal, A. Ghosh, and B. U. Sankar, "Segmentation of Remotely Sensed Images with Fuzzy Thresholding, and Quantitative Evaluation," International Journal of Remote Sensing, vol. 21, no. 11, pp. 2269–2300, 2000.
• Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data. Dordrecht, The Netherlands: Kluwer, 1991.
• Z. R. Yang and R. Thomson, "Bio-Basis Function Neural Network for Prediction of Protease Cleavage Sites in Proteins," IEEE Trans. Neural Networks, vol. 16, no. 1, pp. 263–274, 2005.


Some Related Publications

• Pradipta Maji and Sankar K. Pal, Rough-Fuzzy Pattern Recognition: Applications in Bioinformatics and Medical Imaging, John Wiley & Sons, Inc., New Jersey / IEEE Computer Society Press, ISBN: 978-1-1180-0440-1, January 2012.

• Pradipta Maji and Sushmita Paul, "Rough-Fuzzy Clustering for Grouping Functionally Similar Genes from Microarray Data," IEEE/ACM Transactions on Computational Biology and Bioinformatics, pp. 1–14, 2013.

• Pradipta Maji and Sankar K. Pal, "Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 37(6), pp. 1529–1540, 2007.

• Pradipta Maji and Sankar K. Pal, "Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis," IEEE Transactions on Knowledge and Data Engineering, 19(6), pp. 859–872, 2007.


Biomedical Imaging and Bioinformatics Lab (BIBL)
Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India