Final Version

A new supervised data classification method for convex and non convex
classes
O. El Melhaoui, M. El Hitmy, and F. Lekhal

.LABO LETAS, FS, University of Mohammed I, Oujda, Morocco
wafa19819@gmail.com
Abstract
The present paper proposes a supervised data classification technique for convex and non convex classes.
The technique is based on two phases; the first phase deals with the creation of prototypes and elimination of
noisy objects, the second phase consists of merging the nearest prototypes into classes. Each prototype is created
based on the K nearest neighbours approach, and is represented by its gravity center. The objects which are very
far from their neighbours are eliminated. The merge phase is inspired from the hierarchical method, and it
consists of regrouping the closest prototypes in the same class in incremental way until that the number of
created classes is equal to the number of classes set out initially. This new technique is compared to the C-
means, the fuzzy C-means, the competitive neural networks and the fuzzy Min-Max classification methods
through a number of simulations. The proposed technique has obtained good results.
Keywords: Supervised classification, Convex and non convex classes, Creation of prototypes, Elimination of
noisy objects, Merge, C-means, Fuzzy C-means, Competitive neural networks, Fuzzy Min-Max classification.
1. Introduction
Data classification is an active subject; it has played a crucial and highly beneficial role in hard tasks such as
data analysis, quality control, biometrics (face recognition), medicine, geology (soil texture recognition) and
automatic categorization of satellite pictures, etc.
The classification is an abstraction and synthesis technique, it consists of partitioning a set of data entity into
separate classes according to a similarity criterion, the objects are as similar as possible within a class , we talk
about intra-class homogeneity, while objects of different classes are the most dissimilar possible, we talk about
inter-class heterogeneity. This process allows having a simplified representation of the initial data. There are two
main types of classification; unsupervised and supervised. We are interested in supervised methods where they
assume that the number of classes is known initially. Among these, we find, C-means (CM), fuzzy C-means
(FCM), support vector machine (SVM), K - nearest neighbours (KNN), neural networks (NN) etc.
C-means, fuzzy C-means and standard competitive neural network methods proved a real ability to solve a
nonlinear problem, and they are very popular because of their simplicity and their theoretical elegance, however
they have several drawbacks, the most important are; the initialization of centers which can lead to local
solutions [4, 2, 5] and they are not convenient for non convex or complex type of classes. The diversity and the
complexity of the problem gives rise to many methods, among them the fuzzy min max classification (FMMC),
which is well suited to convex or non convex type of classes. It consists of creating prototypes or hyperboxes
iteratively until their stabilization, each iteration involves three stages: expansion, overlap and contraction.
The present paper proposes a new technique for data classification suitable to convex and non convex type
of classes, the learning process is made in two steps; the first step consists of creating prototypes and eliminating
noisy objects, the second step consists of merging the prototypes according to some criteria. The creation of
prototypes is based on the K nearest neighbours method while the elimination of isolated objects considers the
fact whether the object is within a very weakly compact class or if it is in the peripherals of the class.
Throughout this work a parameter D is used to measure the isolation of the object. Each object is supposed to be
represented by its attribute vector extracted from the diverse features associated with the object. The position of
each object to be classified is supposed to be known in the attribute space. The objects are initially stored in a
matrix whose size is varying through the iterations. The objects are extracted in an orderly way from this matrix.
The second step consists of regrouping the nearest prototypes into the same class iteratively.
This paper is organised as follows: section 2 describes the different techniques of classification including C-
means, fuzzy C-means, competitive neural networks and fuzzy Min-Max classification. The proposed method is
described in section 3. The result of simulations and comparisons are introduced in section 4. Finally we give a
conclusion.
2 - Classification methods
2.1. Descriptive element
Let’s consider a set of M objects {Oi}, characterized by N parameters regrouped in a line vector Vatt=(a1,a2,
…..aN). Let Ri = (ain)1≤n≤N be a line vector of where the nth component ain is the value taken by an for the
object Oi. Ri is called the observation associated with Oi, it is also called the realization of the attribute vector for
this object. is the observation space, it is also known as the parameters space. The observations (Ri)1≤i ≤M
are associated to C different classes (CLs) 1≤s≤C with respective centers (cs)1≤s≤C , each observation Ri is associated
with its membership degree ui,s to the class CLs.
2.2. C-means (CM)

The C-means method was introduced by Mac Queen in 1967. CM is very popular and widely used in
scientific and industrial applications, because of its great utility in the classification and its simplicity. It consists
of looking for good centers of C classes which minimize intra-class variance and maximize the distance between
the classes by an iterative process. The CM method consists of determining the class centers which minimize
the optimization criterion defined by equation (1) [13]:
M C
J m   ui ,s Ri c s
2
(1)
i 1 s 1
||.|| is the Euclidean distance, ui,s equal to 1 if Ri belongs to CLs, 0 if not.
2.3. Fuzzy C-means (FCM)

The fuzzy C-means (FCM) method was introduced by Dunn in 1973 and was improved by Bezdek in 1981.
This technique inspired from the C-means algorithm, introduces the concept of fuzzy sets. Every object in the
data set belongs to each class with a certain degree of membership rather than belonging entirely to one class.
The underlying principle of the FCM is to form from M objects defined by their realizations (Ri)1≤i≤M, C classes
(CLs)1≤s≤C by minimizing the criterion given by equation (2) and by considering the membership degree of each
object to the different classes. The criterion to be optimized in the FCM is given by [1, 2, 14]:
M C
J m   (u i , s ) df Ri c s
2
i 1 s 1 (2)
M
Under the constraints: for i=1… M and 0< u
i 1
i,s <M for s=1 …C
df is the fuzzy degree often taken equal 2, u i , s is the membership degree of the object i to the class CLs.
For each i,s, ui,s belongs to the interval [0,1]. In order to minimize Jm, ui,s and cs, must be updated at each
iteration according to [2]:
M
and
  
i 1
i,s
df
Ri (3)
cs  M
  
i 1
i,s
df
2.4. Standard competitive neural networks (SCNN)

The competitive neural network has been widely used in various fields including classification and
compression [7, 8, 9, 10]. It divides the input data into a number of disjoint clusters; it allows the output neurons
to compete and excite one neuron and inhibits all others at a given time. The competitive network has two layers;
input layer of N neurons and output layer of C neurons, fully interconnected, C represents the number of classes.
The input layer receives the observation vector of dimension N, X = (x1, x2, ... xN), each neuron j in the output
layer will compute the Euclidean distance between its weight vector Wj= (wj1, wj2 .... wjN) and the observation
vector X, the final result of the network will give the index of a winner neuron whose weight vector is closest to
the observation. The component wjn in the vector Wj is the synaptic weight of the connection between the input
neuron n and the corresponding output neuron j. The standard competitive neural network adjusts the synaptic
weights of the winner neuron by minimizing the criterion given by the formula [6, 15]:
2
1 C
E  j X  W j
2 j 1
)4(
αj = 1 if Wj is the closest to X, αj = 0 otherwise.
Using the gradient algorithm [13], also called the Winner Takes All rule, the weights of the winner neuron
are updated at each step as:
W (t  1)  W (t )   ( X  W ) (5)
j j j j
η is a learning rate which is usually a small real number between 0 and 1.
2.5. Fuzzy Min-Max classification (FMMC)

2.5.1. Principle
FMMC is a classification method introduced by Simpson in 1993 [3], based on neural network architecture.
FMMC contains three layers, input, output and hidden layers. The number of neurons in the input layer is equal
to the dimension of the data representation space. The number of neurons in the hidden layer increases in time,
with respect to the creation of prototypes. The number of neurons in the output layer is equal to the number of
classes known initially. The synaptic weights associated with the connections between input and hidden layers
are formed by two matrices V and W representing the characteristics of the different prototypes in the hidden
layer. The synaptic weights associated with the connections between hidden and output layers are formed by a Z
matrix characterizing the association of prototypes to the classes. The learning process is made in three steps,
expansion overlapping and contraction repeated for each training input pattern. These phases are controlled by
two parameters; the sensitivity γ and the vigilance factor θ that control the maximum size of created hyperboxes.
The fuzzy min-max classification neural network is built using hyperbox fuzzy sets. A hyperbox defines a
region of the N-dimensional pattern space. A hyperbox Bj is defined by its min point Vj = (vjn)1≤n≤N and its max
point Wj =(wjn)1≤n≤N. The fuzzy hyperbox membership function of an observation Oi to an hyperbox Bj is defined
as follows: [3, 11, 12]
(6)
Where Oi =(ai1,ai2,….,aiN) is the ith input pattern, γ is the sensitivity parameter that regulates how fast the
membership value decreases as the distance between Oi and Bj increases [6]. The combination of the min-max
points and the hyperbox membership function defines a fuzzy set.
2.5.2 Learning
Let’s consider A = {(Oi, dk) / i = 1, 2, …M, k=1,…,C} a training set, Oi is the input pattern and dk= {1,2, ...,
C} is the index of one of the C classes. The fuzzy min-max learning is an expansion/contraction process. The
learning process goes through several steps:
 Initialization:
 Initial values for γ and θ.
 Assume that the first input pattern form a hyperbox B1, defined by its min point V1 = (v1n)1≤n≤N and its max
point W1 =(w1n)1≤n≤N, where V1= W1= O1.
 Repeat:
1. Select a new input pattern (Oi, dk) of the set A, identify a hyperbox for the same class, and provide the
highest degree of membership. If a hyperbox cannot be found, a new hyperbox is formed and added to the neural
network.
* * *
2. Expand the new created hyperbox Bj defined by a couple of points (v jn , w jn ) , where w jn  max(w jn , ain )
and v *jn  min(v jn , ain ) , 1≤n≤N.
1 N
3. Compute the size T where: T ( (max(w jn , ain )  min(v jn , ain )))
N n 1
If T>θ, a new hyperbox is created. Otherwise, we check if there is an overlap between the hyperbox Bj expanded
in the last expansion step, and the hyperboxes which represent other classes than that of Bj.
4. If an overlap between two hyperboxes of two different classes has been detected, a contraction of the two
hyperboxes is carried out. Four possible cases for overlapping and contraction procedures are discussed in [3,
11].
 Until stabilization of hyperboxes.
3. Proposed method
3.1 Descriptive element
Let’s Consider a set of M objects {O1, O2, ..., Oi , ..., OM} characterized by N parameters, let Ri be the
observation associated with the object Oi, and let mat_va be a matrix of M lines (representing the objects Oi) and
N columns (representing the parameters aj), defined by:

 
mat _ va  a ij 1i M
1 j  N
3.2 Principle
The proposed method in this work is a supervised classification method. It is based on two steps, the first
step creates the prototypes and removes the noisy objects, and the second step creates the classes from the
created prototypes, called merge step.
The objects to be classified are initially stored in a matrix called mat_va of dimension [M,N], the mat_va
matrix will vary according to the index i of the iteration and will be called mat_vai, the first object of mat_vai,
O1i is taken. The K nearest neighbours to O1i are found from all the M objects considered initially, and the
farthest object to O1i from among its K nearest neighbours is obtained, this object will be called the Kth object.
Now a test is carried out and a decision of whether O1i is noisy or not may be obtained. If the Euclidian distance
between the Kth object and O1i is greater than a threshold D set out initially then O1i is noisy or isolated and is
removed, if not O1i and its K nearest neighbours will constitute a consistent prototype. All the elements of the
obtained prototype are removed from mat_vai and mat_vai+1 is obtained. The merge step is inspired from the
hierarchical clustering method; it consists of grouping the most similar prototypes into a class iteratively.
3.2.1. Creating the prototypes and removing the noisy objects,

The phase of creating prototypes is based on the cloud of points situated in the attributes space. Objects are
initially stored in the realizations matrix mat_va, and are extracted from this matrix. The algorithm for this phase
is:
 Iteration t=1, mat_va1 = mat_va
1. Find the K nearest neighbours of the object O1 from the cloud of points in the attributes space, O1 is the first
1
element of mat_va. Let R k
(k = 1, ..., K) be the observations associated with the K nearest neighbours of O1.
1
2. Let d ( R1 , R K )  kmax d ( R1 , Rk1 ) .
1,.., K
 If d ( R1 , R 1K )  D , the object O1 and its K nearest neighbours form the prototype P1 of gravity center
. The (K +1) objects of P1 in this case are compact and are similar to each other.
The (K +1) objects of P1 are removed from mat_va1 matrix, a mat_va2 is created with M-(K +1) rows, the
new indexation for mat_va2 considers rows from 1 to M-(K+1). The empty rows in mat_va1 are filled with those
non empty immediately coming after them. mat_va2 is formed in this way with no empty lines.
 Ifd ( R1 , R K1 )  D , then the object O1 of the observation R1, is considered to be isolated and noisy, and it is
removed from mat_va1. Then mat_va2 is of the dimension [M-1, N], where the first line of mat_va1 is removed
and again a new indexation for mat_va2 takes place, the lines are indexed from 1 to M-1
Example
 O1 
 
 O2 
 
Let mat _ va1  O3 be a matrix of objects, and K=2. Let’s consider that O2 and O4 are the two nearest
 
 O4 
 
 O5 
neighbours to O1. We assume that the distance between O1 and the second nearest neighbour is smaller than D,
then O1, O2 and O4 form a prototype P whose elements are eliminated of mat_va1, mat_va2 becomes
 O3 
mat _ va 2   
 O5 
If the distance between O1 and the second nearest neighbour is bigger than D, then O1 is eliminated from
 O2 
 
 O3 
mat_va1, O1 is a noisy object and mat_va2 becomes mat _ va 2  
O 
 4
O 
 5
. The algorithm corresponding to the creation of prototypes is:
 Iteration t= J:
Let S be the number of prototypes created and mat_vaJ-1 the matrix formed in the J-1 iteration, it consists of
objects not assigned to prototypes P1,…, PS. The matrix of observations mat_vaJ will be formed in the following
way:
1. Get the K-nearest neighbours of the first object of mat_vaJ-1, O1J 1 , the Kth nearest neighbour of O1J 1 is
OKJ 11 , it is considered to be the farthest from O1J 1 .
2. Let d( O KJ 11 , O1J 1 ) be the Euclidean distance between O1J 1 and O KJ 11 ,
If d( O KJ 11 , O1J 1 ) > D, O1J 1 is considered to be isolated and will be eliminated.
If d( OKJ 11 , O1J 1 )  D , then a new prototype is created and the (K +1) objects of the prototype are removed
from mat_vaJ-1
3. Create a new matrix mat_va J
4. Repeat until the last matrix mat_vae becomes empty.
3.2.2. Merge phase
We assume that phase 1 ended by forming H prototypes P1,… PH.
The task of the merge step is to aggregate the most similar prototypes in the appropriate class iteratively until the
number of classes created will be equal to the number of classes fixed initially. This step has been inspired from
the hierarchical clustering method.
We choose the Euclidean distance as similarity criterion between the prototypes. Each prototype represents
a class, we then have H classes (Figure A)
H classes Figure A
P1 P2 P3 ..…
Pi ..… PH
We compute the Euclidean distance between all
the gravity centers of H prototypes taken two by two
mutually, so we have H(H -1) / 2 distances. The different distances are stored in a matrix mat= [d1,d2,…dH (H-1) /2]
after being ordered in an increasing way. The algorithm for the merge phase is:
 Itération t=1
d1 ( g s , g r )  min (d ( g i , g j ))
Let i 1,.. H 1
j i ,.. H
, where gi and gj are the gravity centers associated to the prototypes Pi, Pj
respectively. Ps and Pr are considered similar prototypes, they will be grouped in one class CL1. The two classes
associated with Ps and Pr are merged to form one single class, the number of circle used to represent the classes
is reduced by one unit (figure B). In this case the number of classes becomes H-1.
sessalc1-
H
P1 P2 P3 …P s ..... P ..... P H
r
P1 P2 P3 …LC P(1 Ps, r) … PH
..… ..…
..…
Figure B
 Iteration t = 2
Let d2(gm,gn) be the second minimum distance. Pm and Pn are considered to be similar prototypes, a test is
then carried out:
1. If one of the prototypes is already assigned to CL1, the other prototype will also be in CL1, the circle
representing the other prototype is removed (Figure C). The number of classes becomes H-2.
sessalcH2-
..……
P 1 P 2 P 3 CL1(Ps, Pr) … Pm PH P 1 P 2 LC P(
P 3…… 1 Ps P
r m) …… PH
..… .. ..
Figure C
2. If the two prototypes Pm and Pn are not assigned to class CL1, they will be grouped in a new class CL2 (Figure
D), the circles associated with Pm and Pn are combined to form one circle associated to CL2. The number of
classes becomes H-2.
sessalcH2-
P 1 P 2 LC P(
1 P s, )
r PP .....
m P.…n H P P 1 2 ..LC P(
1 P, r) LC P(
s .… P,
2P m……)
n H
Figure D
 Iteration t = T
We assume that the number of classes is S. Let dT(gc,gb) be the Tth minimum distance. Pc and Pb are considered
to be similar prototypes, three cases are to be distinguished:
1. If Pc and Pb are not assigned to any class already created during the (T-1) iterations, a new class will be
produced CLv. The number of classes becomes S-1.
2. If one of the prototypes (for example Pc) is already assigned to a class CLi already created during the T-1
iterations, then the other prototype Pb must be assigned to the same class CLi. The number of classes is S-1.
3. If both Pc and Pb prototypes are already assigned to different classes (CLe,CLf) (e<f), respectively, the two
prototypes Pc and Pb must be assigned to the class CLe, all the prototypes that are already assigned to the class
CLf must be assigned to the class CLe and the class CLf will be eliminated. In fact, CLf is combined with CLe to
form one class that we call CLe , the number of classes becomes S-1.
This procedure will run iteratively until that the remaining number of circles associated to the classes is
equal to the number of classes fixed initially.
3.2.3. Example:
This example has three classes of spherical shape. Figure 1 shows the distribution of a Gaussian randomly
generated data in the (X, Y) space, figure 2 shows the distribution of gravity centers of the created prototypes for
K = 10 and D =20.
sessalc 9
Figure 1: The distribution of a Gaussian randomly Figure 2: Prototypes with corresponding centers
.generated data in the (X, Y) space .obtained by the proposed method
The first step
of the proposed technique creates prototypes and eliminates noisy objects. 9 P 1 P 2 P 3 … P 5 ..…… P 9
prototypes are created (Figure E). ..…

Figure E
We notice from figure 2 that the prototypes of a class are all neighbours to each other. The gravity centers
for all the created prototypes are given by:
g1 = [16.0000 60.6364], g2= [16.1818 70.8182], g3= [26.7273 64.1818]
g4= [35.7273 147.7273], g5= [42.1818 151.4545], g6= [44.7273 146.4545]
g7= [52.0909 143.4545], g8= [99.0909 120.3636], g9= [105.636 117.4545]
Table 1 gives the mutual distances between the gravity centers of all prototypes.
.Table 1: different distances calculated between the different prototype gravity centers
g1 g2 g3 g4 g5 g6 g7 g8 g9
g1 0 10.2 11.3 89.3 94.5 90.5 90.3 102 106
g2 10.2 0 12.4 79.3 84.7 80.8 81.0 96.6 100
g3 11.3 12.4 0 84.0 88.6 84.2 83.2 91.6 95.2
g4 89.3 79.3 84.0 0 7.45 9.08 16.9 69.0 76.2
g5 94.5 84.7 88.6 7.45 0 5.61 12.7 64.8 71.9
g6 90.5 80.8 84.2 9.08 5.61 0 7.95 60.3 67.5
g7 90.3 81.0 83.2 16.9 12.7 7.95 0 52.3 59.5
g8 102 96.6 91.6 69.0 64.8 60.3 52.3 0 7.16
g9 106 100 95.2 76.2 71.9 67.5 59.5 7.16 0
sessalc 8
1st iteration
d1 ( g 5 , g 6 )  min (d ( g i , g j ))  5.61
i 1,..8 , then P5 and P6
P P… …… P9
j  i ,..9 P 1 2 3 …LC P(1 P5 6 )
will be grouped in one class CL1, the number of classes is 8. ..PP
.. 9
sessalc 7
2nd iteration
d2(g8,g9)= 7.16, then P8 and P9 will be grouped in the same
class CL2, the number of classes is 7. P 1 P 2 LC LC
P(
1 ……
P5 6) P(
2 P,
8 9)
..…
……
3 th iteration classes6
d3(g5,g4)=7.45, P4 and P5 are therefore grouped in the same
class CL1, the number of classes is 6.
P 1 P2 … CL1(P5 P6, P4) .. CL2(P8,P9)
4 th iteration sessalc 5
D4(g6,g7)=7.95, P6 and P7 will be grouped in the same
class CL1, the number of classes is 5.
P 1 P 2 LC P(
1 P
5 P
LC
,6P4 , )
7 P(
2 P,
8 9)
5 th iteration
d5(g6,g4)=9.08, P4 and P6 are already grouped in one class CL1, the number of classes is 5.
sessalc 4
6 th iteration
d6(g1,g2)=10.2, P1 and P2 will be grouped in one class CL3,
the number of classes is 4. LC P(
3 P
P,
1 2)LC 3 P(
1 P
5 ,6 P
LC
P4 , )
7 P(
2 P,
8 9)
classes 3
7 th iteration
d7(g1,g3)=11.3, P1 and P3 will be grouped in one class CL3 the
number of classes is 3. LC P(
3 P,
1LCP2 , )
3 P(
1 P
5 ,6 PPLC
4, 7 P(
2 P,
8 9)
End of algorithm
The number of classes is three, defining the real existing classes. CL1 contains P4, P5, P6, and P7. CL2 contains
P8, and P9. CL3 contains P1, P2 and P3.
4. Simulations and comparisons
We consider in this section that the classes are of convex and non convex types. All the data used for the
simulations have been generated by a Gaussian distribution routine through Matlab. The difference between the
various simulations can be seen in the number of data, the form of classes, the degree of overlap between the
classes and the class compactness. All data used in the simulations are of two dimensional type which have
allowed to plot the data and to have a clear view of the results.
4.1. Simulation 1
This first simulation has three classes with no overlapping between them. The data are divided into two sets,
one set of 90 objects is used for learning and one set of 70 objects is used for test. Figures 3 and 4 show the
results of the classification of data in (A1, A2) space for CM, FCM and SCNN methods for bad and good
initializations respectively.
Figure 3: Results of the classification in the (A1,A2) Figure 4: Results of the classification in the (A1,A2)
.space for bad initialization .space for good initialization
Figures 3 and 4 show that the methods C-means (CM), fuzzy C-means (FCM) and competitive neural
networks (SCNN) converge to good solutions when using a good initialization, but when a bad initialization is
used, these methods are trapped in local solutions. Figures 5 and 6 show the classification results using the fuzzy
Min-Max classification (FMMC) method and the proposed method respectively. For FMMC we set θ=0.5 and
γ=2. For the proposed method, D is set to 1 and K to 8.
Figure 6: Prototypes with corresponding centers

.obtained by the proposed method
Figure 5: Results of the classification in (A1,A2) Figure 6: Prototypes with corresponding centers
.space for FMMC .obtained by the proposed method
We have also studied in this simulation the impact of K on the classification results, we found that if K is
less than 8, the classification results are good, but the number of prototypes is higher. If K is greater than 10, the
classification results may not be good. The prototypes within a class where the compacity is very low may not be
formed at all, the objects may all be considered within this class as noisy, and the class ignored.
Table 1 gives the classification rates for the methods: C-means, fuzzy C-means, standard competitive neural
networks for bad initialization, fuzzy Min-Max classification and the proposed method.
Number of Number of Classification rate
Classification methods Learning time (s)
prototypes misclassified objects )%(
C-means 3 classes 30 57.14 0.188
Fuzzy C-means 3 classes 30 57.14 0.047
Standard competitive neural network 3 classes 30 57.14 0.016
Fuzzy Min-Max classification 48 0 100 0.422
Proposed method 9 0 100 0.0780
.Table 1: The classification rates obtained by the various methods considered in this work
Table 1 shows that the methods CM, FCM and SCNN obtain bad classification results for bad initialization
and the algorithms have converged to local solutions. The classification methods FMMC and the proposed
method converge to better solutions but the running time for FMMC is much bigger than that required by the
proposed method.
4.2 Simulation 2
This simulation has five classes of spherical shape which are slightly overlapping between them. The data
used are divided into two parts, a training set containing 200 objects and a test set containing 100 objects. Figure
7 shows the classification results of data in (B1, B2) space for CM, FCM and SCNN methods using a good
initialization. Figure 8 shows the results of data classification by the FMMC method, θ is set to 0.5 and γ to 1.2.
Figure 7: Results of the classification in the (B1,B2) Figure 8: Results of the classification in the (B1,B2)
.space for good initialization space for FMMC.
Figure 7 shows the good convergence of the C-means (CM), fuzzy C-means (FCM) and standard
competitive neural networks (SCNN) methods, but when a bad initialization is used, all these methods are
trapped in local solutions. Figure 9 shows the classification results for the proposed method, D=1 and K= 15.
Figure 9: Prototypes with corresponding centers obtained by the proposed method.

We have varied the value of K in this case and have obtained similar results as before. If K is less than
15, good classification results are obtained but at the expense of higher number of prototypes and increasing
running time. If K is greater than 15, the result of classification may not be good, the risk of the algorithm to
ignore less compact classes is always there.
Table 2 gives the classification rate for C-means, fuzzy C-means and standard competitive neural networks
using a good initialization, fuzzy Min-Max classification and the proposed method
Table 2: The classification rates obtained by the various methods considered in this work .
Classification methods Number of Number of Classification rate Learning time (s)
C-means classes 5 3 97 0.609
Fuzzy C-means classes 5 3 97 0.343
Standard competitive neural network classes 5 2 98 0.031
From this table, the proposed method has the highest rate of classification, it is 100%. The other methods
CM, FCM, SCNN and FMMC have a classification rate equal to 97%, 97%, 98% and 95% respectively.
4.3 Simulation 3
This simulation has three classes of spherical form. The data are divided into two parts, a training set
containing 450 objects and a test set containing 300 objects. Figure 10 shows the classification results in (C1,
C2) space using CM, FCM and SCNN methods for good initialization. Figure 11 shows the result of data
classification for the FMMC method.
Figure 10: Results of the classification in the (C1,C2) Figure 11: Result of data classification in the (C1,C2)
space for bad initialization. for the FMMC method.
Figure 12 shows the data and the prototype gravity centers in (C1, C2) space for the proposed method, D is
set to 1 and K to 23.
Figure 12: Prototypes with corresponding centers obtained by the proposed method.
It is found from this experiment that if K is less than 14, noisy objects can be selected into prototypes and
may result in a bad classification. If K is greater than 47, there is a risk of the algorithm to ignore less compact
classes. In this test, K may be of higher value than that of the previous tests because all the classes are more
compact and the number of objects in each class is higher. Table 3 gives the classification rates obtained by CM,
FCM, SCNN, FMMC and the proposed method.

C-means classes 3 9 97 0.7500
Fuzzy c-means classes 3 7 97.66 0.3280
Standard competitive neural
network classes 3 8 97.33 0.094
Fuzzy Min-Max classification 240 14 95.33 4.9531
Proposed method 25 5 98.33 2.8430
Table 3: The classification rates obtained by the various methods considered in this work.
A higher classification rate is obtained by the proposed method; it is equal to 98.33% that means five
objects were misclassified among 300 objects. For the CM, FCM, SCNN and FMMC, the classification rates are
97%, 97.66%, 97.33% and 95.33% respectively.
4.4 Simulation 4
This simulation has two classes of complex shape. The data used are divided into two parts, a training set
containing 360 objects and a test set containing 180 objects. The data have been synthesized by the authors and
this has made use of the rand routine in Matlab. Figure 13 shows the data classification results in (D1, D2) space
by CM, FCM and SCNN. Figure 14 shows the data classification results by FMMC, θ=0.3 and γ=1.
Figure 13: Results of the classification in the (D1,D2) Figure 14: Results of the classification in the (D1,D2)
space for CM, FCM and SCNN. space for. FMMC
Figure 15 shows the data and the prototype gravity centers in (D1, D2) space for the proposed method, D is
set to 1 and K to 30.
Figure 15: Prototype gravity centers obtained by the proposed method.
If K is less than 30, good classification results are obtained, but the computing time and the number of
prototypes increases. Table 4 gives the classification rate obtained by CM, FCM, SCNN and the proposed
method.
C-means classes 2 28 84.4 2.25
Standard competitive neural network classes 2 26 85.56 1.67
From this table, we notice that the performance of the proposed method is higher than that of the others, we
have obtained a good classification rate in a minimum running time.
4.5. Simulation 5
This simulation has two classes of complex shape. The data used are divided into two parts, a training set
containing 556 objects and a test set containing 500 objects. The data have been synthesized by the authors and
this has made use of the rand routine in Matlab. Figure 16 shows the data classification results in (E1, E2) space
by CM, FCM and SCNN. Figure 17 shows the data classification results by FMMC, θ=0.6 and γ=1.
Figure 16: Results of the classification in the (E1, E2) Figure 17: Results of the classification in the (E1, E2)
space for CM, FCM and SCNN. space for FMMC
Figure 18 shows the data and the prototype gravity centers in (E1, E2) space for the proposed method, D is set
to 1 and K to 30.
Figure 18: Prototype gravity centers obtained by the proposed method.

If K is less than 30; there will always be a good classification, but the computing time and the number of
prototypes increase. Table 5 gives the classification rate obtained by CM, FCM, SCNN, FMMC and the
proposed method.
C-means classes 2 214 57.2 1.0150
Standard competitive neural network classes 2 214 57.2 0.17
A very high classification rate is obtained by the proposed method and FMMC but the execution time is less
for the proposed method. The other methods CM, FCM and SCNN have failed to converge to the correct
solution; they have a very low classification rate (57%).
5. Conclusion
The present paper proposes a supervised data classification technique for convex and non convex classes.
The technique is carried out in two steps; step one creates the prototypes and eliminates noisy objects, step two
merges prototypes into classes.
The technique is based on the tuning of two parameters K and D. The prototypes are formed by the K
nearest neighbours technique. The second step consists of merging the closest prototypes into classes. The line of
circles architecture has been used to assist this task. . In this study, we have shown that bigger values of K may
obtain bad classification results, while smaller values of K obtain higher number of prototypes and higher
running time. A good choice of K must be made. Different simulations were performed to validate the proposed
method, the difference between these simulations may be viewed in the number of data, the class compactness,
the form of classes and the degree of overlap between them. The proposed method was compared to various
techniques such as C-means, fuzzy C-means, competitive neural networks and the fuzzy min max classification.
For complex classes, the traditional methods as CM, FCM and SCNN have failed to converge to the true solution
whereas the proposed method has obtained good results with smaller convergence time
In all the simulations performed in this work, the proposed method has always obtained better results in a
fewer running time.
6. References
[1] Ouariachi, H., (2001). “Classification non supervisée de données par réseaux de neurones et une approche
évolutionniste: application à la classification d’images”. Thèse de doctorat, Université Mohamed 1, Maroc.
[2] Nasri, M., (2004). “Contribution à la classification des données par approches évolutionnistes: simulation et
application aux images de textures”. Thèse de Doctorat, Université Mohammed 1, Oujda, Maroc.
[3] Simpson, P.K., (1992). “Fuzzy min-max neural networks. Part 1: Classification”. IEEE Transactions on
Neural Networks, vol. 3, pp: 776-786.
[4] Zalik, K. R. and Zalik, B., (2010). “Validity index for clusters of different sizes and densities”. Pattern
Recognition Letters, 221-234.
[5] Bouguelid, M. S., (2007). “Contribution à l'application de la reconnaissance des formes et la théorie des
possibilités au diagnostic adaptatif et prédictif des systèmes dynamiques ». Thèse de doctorat, Université de
Reins Champagne- Ardenne.
[6] Borne, P., Benrejeb M., Haggége, J. “ Les réseaux de neurones, présentation et applications”. Editions
technip.
[7] DIAB M. ,(2007). “ Classification des signaux EMG Utérins Afin de Détecter Les Accouchements
Prématurés”. Thèse.
[8] Stéphane, D., Postaire J.-G, (1996). “ Classification interactive non supervisée de données
multidimensionnelles par réseaux de neurones à apprentissage compétitif ”. Thèse, France.
[9] Hao, Y. Régis, L., (1992). “Etude des réseaux de neurones en mode non supervisé: Application à la
reconnaissance des formes”. Thèse.
[10] Kopcso, D., Pipino, L., Rybolt, W., (1992). “Classifying the uncertainty arithmetic of individuals using
competitive learning neural networks”. Elsevier: Expert Systems with Applications, Vol 4, pp.157-169.
[11] Gabrys, B., and Bargiel, A., (2000). "General Fuzzy Min-Max Neural Network for Clustering and
Classification". IEEE Transactions on neural networks, Vol. 11, N°. 3.
[12] Chowhan, S.S., Shinde, G. N., (2011). “Iris Recognition Using Fuzzy Min-Max Neural Network”.
International Journal of Computer and Electrical Engineering, Vol. 3, No. 5.
[13] Khan, S., Ahmad, A., (2004). “Cluster center initialization algorithm for K-means clustering”. Elsevier
Pattern Recognition Letters 25, pp. 1293–1302.
[14] Mohamed N. Sameh, A., Yamany, M., Nevin M., Aly A. Farag, and Moriarty, T. (2002). “A Modified
Fuzzy C-Means Algorithm for Bias Field Estimation and Segmentation of MRI Data”. IEEE, Transactions
on medical imaging, Vol. 21, N°. 3.
[15] Dong-Chul Park, (2000) “Centroid Neural Network for Unsupervised Competitive Learning”. IEEE
Transactions on neural networks, Vol. 11, N°. 2.

Final Version

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Final Version

Uploaded by

Copyright:

Available Formats

A new supervised data classification method for convex and non convex

O. El Melhaoui, M. El Hitmy, and F. Lekhal

2.2. C-means (CM)

||.|| is the Euclidean distance, ui,s equal to 1 if Ri belongs to CLs, 0 if not.

2.3. Fuzzy C-means (FCM)

2.4. Standard competitive neural networks (SCNN)

η is a learning rate which is usually a small real number between 0 and 1.

2.5. Fuzzy Min-Max classification (FMMC)

N columns (representing the parameters aj), defined by:

3.2.1. Creating the prototypes and removing the noisy objects,

prototypes are created (Figure E). ..…

Figure 6: Prototypes with corresponding centers

Figure 9: Prototypes with corresponding centers obtained by the proposed method.

Number of Number of Classification rate

Figure 15: Prototype gravity centers obtained by the proposed method.

Figure 18: Prototype gravity centers obtained by the proposed method.

You might also like