
IETECH Journal of Electrical Analysis, Vol: 1, No: 2, 130-136

© IETECH Publications, 2007
AN EFFICIENT DATA CLUSTERING ALGORITHM USING FUZZY LOGIC FOR CONTROL OF A MULTI COMPRESSOR SYSTEM
Gursewak S. Brar
Dept. of Electrical Engineering, Baba Banda Singh Bahadur Engineering College, Fatehgarh Sahib, India.

Yadwinder S. Brar
Dept. of Electrical Engineering, Giani Zail Singh College of Engineering and Technology, Bhatinda, Punjab, India

Yaduvir Singh
Dept. of Electrical and Instrumentation Engineering, Thapar University, Patiala, Punjab, India
ABSTRACT
In recent years, the dramatic rise in the use of the web and the improvement in process industries in
general have transformed our society into one that strongly depends on information. The huge amount of data
generated by this process contains important information that accumulates daily in databases and is
not easy to extract. The field of data mining developed as a means of extracting information and knowledge
from databases to discover patterns or concepts that are not evident. The process usually consists of the
following steps: transforming the data to a suitable format, cleaning it, and inferring or drawing conclusions
regarding the data. Machine learning is divided into two primary sub-fields: supervised learning and
unsupervised learning. Within the category of unsupervised learning, one of the primary tools is clustering.
In this paper we establish that fuzzy clustering associates each pattern with every cluster using a
membership function, whereas traditional clustering approaches generate partitions, wherein each pattern
belongs to one and only one cluster. In this paper, fuzzy clustering has been implemented for a
multi-compressor system for the first time. In fuzzy clustering, a large collection of documents is clustered
and each of the clusters is represented using its center. The fuzzy data-clustering algorithm generated for
the data-based controller of the multi-compressor system enhances the control efficiency.
Key words: Data clustering, Data clustering Algorithms, Data handling, Fuzzy logic, Fuzzy c-means
Algorithm, Multi-compressor.
1. INTRODUCTION

Data clustering is a common technique for statistical data analysis, which is used in many fields, including mechanical process industries, machine learning, data mining, pattern recognition, image analysis and bioinformatics. Clustering is the classification of similar objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait, often proximity according to some defined distance measure. Mechanical industries form a large proportion of heavy industry across the globe. [1, 12] Refrigeration and cooling are present in most mechanical units. There is a trend to use computerized numerical techniques with mechanical technology. This greatly reduces energy, time, cost, etc. and drastically enhances the efficiency of mechanical systems. There is also a trend to design controllers for mechanical systems based on data instead of models. In order to clarify how clustering fits into the broader framework: in supervised learning, the algorithm is provided with both the cases (data points) and the labels that
represent the concept to be learned for each case. The goal is then to learn the concept in the sense that when a new, unseen case comes to be classified, the algorithm should predict a label for this case. Under this paradigm, there is the possibility of overfitting or “cheating” by memorizing all the labels for each case, rather than learning general predictive relationships between attribute values and labels. In order to avoid overfitting, these algorithms try to achieve a balance between fitting the training data and good generalization; this is usually referred to as the bias/variance dilemma. The outcomes of this class of algorithms are usually evaluated on a set of examples disjoint from the training set, called the testing set. Methods range from traditional statistical approaches to neural networks and, lately, support vector machines. On the other hand, in unsupervised learning the algorithm is provided with just the data points and no labels; the task is to find a suitable representation of the underlying distribution of the data. One major approach to unsupervised learning is fuzzy data clustering. [2, 5, 6] Supervised and unsupervised learning have also been combined in what some call semi-supervised learning. The unsupervised part is usually applied first to the data in order to make some assumptions about the distribution of the data, and then these assumptions are reinforced using a supervised approach.

The simplest definition of clustering is shared among all and includes one fundamental concept: the grouping together of similar data items into clusters. A simple, formal, mathematical definition of clustering is the following: let X ∈ R^(m×n) be a set of data items representing a set of m points x_i in R^n. The goal is to partition X into K groups C_k such that data belonging to the same group are more “alike” than data in different groups. Each of the K groups is called a cluster. The result of the algorithm is a mapping X → C of data items x_i to clusters C_k. The number K might be pre-assigned by the user or it can be an unknown, determined by the algorithm. In this paper, we assume that K is given by the user. The nature of the clustering problem is such that the ideal approach is equivalent to finding the global solution of a non-linear optimization problem. There are many different ways to express and formulate the clustering problem; as a consequence, the obtained results and their interpretation depend strongly on the way the clustering problem was originally formulated. If we consider all the “variations” of each different algorithm proposed to solve each different formulation, we end up with a very large family of clustering algorithms. Although in the literature there are as many different classifications of clustering algorithms as there are algorithms, there is one simple classification that essentially splits them into the following two main classes: [3, 4]
• Parametric Clustering
• Non-Parametric Clustering

The fuzzy data-clustering algorithm will be used to design a data-based controller for the system. The relationships between the presented identification method and linear regression are exploited, allowing for the combination of fuzzy logic techniques with standard system identification tools. Attention is paid to the aspects of accuracy and transparency of the obtained fuzzy models. GA clustering can be used for this purpose, but fuzzy clustering is more beneficial. Fuzzy clustering alone cannot give the optimal output, so we use the combination of both fuzzy and GA techniques in the design of the controller. Using the concepts of model-based predictive control and internal model control with an inverted model, the control design based on a fuzzy-GA model of a nonlinear dynamic process is addressed. To this end, methods that exactly invert specific types of fuzzy-GA models are presented. In the context of predictive control, branch-and-bound optimization is applied. Attention is paid to algorithmic solutions of the control
problem, mainly with regard to real-time control aspects. This paper presents efficient algorithms for data handling and data clustering as applied to a multi-compressor system. This multi-compressor system is installed at an MNC in Punjab and presently incurs huge losses, with an efficiency of around 60-65 %.

2. CASE STUDY: MULTI-COMPRESSOR SYSTEM
This plant is basically a chemical plant and is used for making food products. Temperature variations occur in it, so cooling is required from time to time. A robust controller is required, which can provide temperature stabilization and accurate cooling. Systems of thermodynamic importance are divided into two groups: first, work-developing systems, which include all types of engines producing power using thermal energy, and second, work-absorbing systems, which include compressors, refrigerators, heat pumps, etc. Source and sink contain infinite energy at constant temperature. The source temperature is always higher than the sink temperature.

Fig. 1 Engine, refrigerator and heat pump

The performance of an engine is taken into account by the ratio of work to energy supplied, i.e. W/Q, which is known as the efficiency (η) of the engine and is given below. [10, 11]
η = W/Q (1)
The performance of the refrigerator is taken into account by the ratio Q/W. The theoretical coefficient of performance (C.O.P.) is calculated as below.
C.O.P. = Q/W (2)
Also
Relative C.O.P. = (actual C.O.P./theoretical C.O.P.) (3)
The performance of the heat pump is taken into account by the ratio (Q+W)/W, known as the energy performance ratio (E.P.R.). It is obtained as below.
E.P.R. = (1 + Q/W) (4)
Also
E.P.R. = (C.O.P. + 1) (5)
The value of C.O.P. may be less than one or greater than one, depending upon the type of the refrigeration system. The value of E.P.R. is always greater than one. Figure 2 shows the multi-mode system with a single compressor, which is used when a number of loads at the same temperature are to be taken by the refrigerating plant.

Fig. 2 Multimode systems with single compressor

The arrangement of multi-evaporators at different temperatures with back pressure valves is shown in figure 3.

Fig. 3 Multi-evaporators at different temperatures with back pressure valves
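As a quick numerical check of equations (1) to (5), the following Python sketch computes the ratios for hypothetical heat and work values. The function names and the sample numbers are illustrative assumptions, not measurements from the plant described in this paper.

```python
def engine_efficiency(work_out, heat_supplied):
    """Equation (1): efficiency of an engine, W / Q."""
    return work_out / heat_supplied


def cop(heat_extracted, work_in):
    """Equation (2): coefficient of performance of a refrigerator, Q / W."""
    return heat_extracted / work_in


def relative_cop(actual_cop, theoretical_cop):
    """Equation (3): actual C.O.P. divided by theoretical C.O.P."""
    return actual_cop / theoretical_cop


def epr(heat_extracted, work_in):
    """Equation (4): energy performance ratio of a heat pump, (Q + W) / W."""
    return (heat_extracted + work_in) / work_in


# Hypothetical operating point (any consistent energy unit).
Q, W = 3.0, 1.0
print("C.O.P. =", cop(Q, W))
print("E.P.R. =", epr(Q, W))

# Equation (5): E.P.R. = C.O.P. + 1 holds for any Q >= 0 and W > 0.
assert abs(epr(Q, W) - (cop(Q, W) + 1.0)) < 1e-12
```

Because E.P.R. = 1 + Q/W, it exceeds one whenever Q and W are positive, while C.O.P. = Q/W falls below one as soon as Q < W.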
The pressures of the refrigerants coming out of the evaporators, after leaving the back pressure valves, are the same, and this is the suction pressure of the compressor. [10, 11]

3. CLUSTERING ALGORITHM & FUZZY CLUSTERING
Clustering: Group objects m_j into c clusters. Assume the clusters exist; let C = [c(1) … c(c)] be a set of prototypes or cluster centers. [7, 8, 9]
(6)
A cluster can be seen as describing an equivalence class.
(7)
Hard c-Means Clustering:
Let c be the number of clusters. The hard partitioning space:
(8)
Clustering criterion (objective function, cost function):
(9)
Distance measure:
(10)
Algorithm:
Step 1: Calculate centers of clusters; c-mean vectors
(11)
Step 2: Update U(l): Reallocate cluster memberships to minimize squared errors:
(12)
Fuzzy Clustering:
Fuzzy partition space:
(13)
Fuzzy objective function: a least-squares functional
(14)
Weighting factor:

The above algorithm depicts the various parameters involved in the proper execution of the fuzzy clustering algorithm.

4. PROBLEM FORMULATION:
The typical development of a cluster analysis problem consists of four steps along with a feedback path, as shown in figure 4. The steps are as follows:
1. Feature selection and extraction.
2. Clustering algorithm design or selection
3. Cluster validation
4. Results interpretation

Fig. 4 Cluster Analysis Steps
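The two-step hard c-means iteration of Section 3 (calculate cluster centers, then reallocate memberships to minimize squared error), followed by a simple validation measure in the spirit of step 3 above, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it uses the textbook hard c-means with squared Euclidean distance on one-dimensional data rather than the authors' own equations, and the function names, sample points and initial centers are hypothetical.

```python
def hard_c_means(points, initial_centers, max_iter=100):
    """Alternate membership reallocation and center update until labels settle."""
    centers = list(initial_centers)  # copy so the caller's list is not modified
    labels = None
    for _ in range(max_iter):
        # Reallocate each point to the cluster whose center is nearest.
        new_labels = [
            min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)
            for x in points
        ]
        if new_labels == labels:
            break  # memberships no longer change
        labels = new_labels
        # Recompute each center as the mean of its current members.
        for k in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's center where it is
                centers[k] = sum(members) / len(members)
    return labels, centers


def squared_error(points, labels, centers):
    """Within-cluster sum of squared errors, used here as a validation measure."""
    return sum((x - centers[lab]) ** 2 for x, lab in zip(points, labels))


# Hypothetical one-dimensional data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centers = hard_c_means(points, initial_centers=[0.0, 6.0])
print(labels, centers, squared_error(points, labels, centers))
```

A lower squared error for the same number of clusters indicates a tighter partition, which is one way to compare candidate clusterings before interpreting the results.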
Along with the above considerations, the control strategy consists of formulating or identifying the control objective, input variables, output variables, constraints, operating characteristics, safety, environmental and economic considerations, control structure, algorithm, etc.

5. DESIGN & DEVELOPMENT OF FUZZY DATA CLUSTERING ALGORITHM
Traditional clustering approaches generate partitions; in a partition, each pattern belongs to one and only one cluster. The clusters in a hard clustering are disjoint. Fuzzy clustering extends this notion to associate each pattern with every cluster using a membership function. The output of such algorithms is a clustering, but not a partition. The designed and implemented high-level partitional fuzzy clustering algorithm is given below.

5.1. Fuzzy Clustering Algorithm
Step 1: Select an initial fuzzy partition of the N objects into K clusters by selecting the N × K membership matrix U. An element u_ij of this matrix represents the grade of membership of object x_i in cluster c_j. Typically, u_ij ∈ [0, 1].
Step 2: Using U, find the value of a fuzzy criterion function, e.g., a weighted squared error criterion function, associated with the corresponding partition. The fuzzy criterion function is given in equation 15:

E^2(X, U) = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik} \lVert x_i - c_k \rVert^2    (15)
where c_k = \sum_{i=1}^{N} u_{ik} x_i is the k-th cluster center.

Step 3: Repeat step 2 until the entries in U do not change significantly.

Figure 5 illustrates fuzzy clustering, in which each cluster is a fuzzy set of all the patterns. The rectangles enclose two “hard” clusters H1 (= 1, 2, 3, 4, 5) and H2 (= 6, 7, 8, 9) in the data. The fuzzy clustering algorithm produces the two elliptical fuzzy clusters F1 and F2. The patterns will have membership values in [0, 1] for each cluster.

Fig. 5 Fuzzy clustering

The ordered pairs in each cluster represent the i-th pattern and its membership value to the cluster. Larger membership values indicate higher confidence in the assignment of the pattern to the cluster. A hard clustering can be obtained from a fuzzy partition by thresholding the membership value. The design of membership functions is the most important problem in fuzzy clustering. Different choices are shown as (i) and (ii) in figure 6 for this case. These are based on similarity decomposition and centroids of clusters.

(i) (ii)
Fig. 6: Design of membership function in fuzzy clustering

Clusters Representation:
A partition has been used for the separability of the data points into clusters. The resulting clusters can also be represented or described in a compact form to achieve data abstraction, for example by representing clusters using nodes in a classification tree or by using conjunctive logical expressions, as shown in figure 7.

Fig. 7 Fuzzy clustering using classification tree
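The three steps of Section 5.1, together with the remark that a hard clustering can be obtained from the membership values, can be sketched in Python. Section 5.1 does not state how U is revised between iterations, so this sketch substitutes the standard fuzzy c-means updates with fuzzifier m = 2 and normalized weighted-mean centers; this is an assumption and may differ from the authors' implementation. The function names, the one-dimensional sample data and the initial centers are hypothetical, and hardening is done here by taking the maximum membership.

```python
def fuzzy_c_means(points, initial_centers, m=2.0, tol=1e-6, max_iter=200):
    """Standard fuzzy c-means on 1-D data; returns (membership matrix U, centers)."""
    centers = list(initial_centers)
    n, k = len(points), len(centers)
    u = [[0.0] * k for _ in range(n)]
    for _ in range(max_iter):
        # Membership update: closer centers receive larger membership.
        new_u = []
        for x in points:
            d2 = [(x - c) ** 2 for c in centers]
            if 0.0 in d2:
                # The point coincides with a center: full membership there.
                hit = d2.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(k)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)
        # Center update: mean of the points weighted by membership ** m.
        for j in range(k):
            w = [row[j] ** m for row in new_u]
            centers[j] = sum(wi * x for wi, x in zip(w, points)) / sum(w)
        # Stop when no entry of U changes significantly (Step 3).
        change = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return u, centers


def criterion(points, u, centers, m=2.0):
    """Weighted squared error criterion of the current fuzzy partition."""
    return sum(
        (u[i][j] ** m) * (x - centers[j]) ** 2
        for i, x in enumerate(points)
        for j in range(len(centers))
    )


def harden(u):
    """Assign each object to the cluster with its largest membership value."""
    return [row.index(max(row)) for row in u]


# Hypothetical one-dimensional data with two visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
u, centers = fuzzy_c_means(points, initial_centers=[0.0, 6.0])
print(centers, criterion(points, u, centers), harden(u))
```

Each row of U sums to one, so every object keeps a graded membership in every cluster until `harden` collapses the fuzzy partition into a hard one.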
In figure 7, the expression [X1 > 3] [X2 < 2] stands for the logical statement ‘X1 is greater than 3’ and ‘X2 is less than 2’.

6. SIMULATION AND TESTING:
The simulation has been done in a high-level language. A set of fuzzy rules has been obtained from fuzzy clusters of a data set in this case.
No. of clusters: 3
No. of objects: 3
Values: Object 1: 11.7, Object 2: 3.5, Object 3: 2.1
Maximum weighted square error possible: 100

Table-1: Determination of membership functions
              Cluster 1   Cluster 2   Cluster 3
For object 1  0.9         0.7         0.6
For object 2  1           0.2         0.1
For object 3  0.5         0.4         0.3

The above results conform to the fuzzy criterion function (error criterion) in the first iteration and hence determine the membership matrix

0.9  0.7  0.6
1    0.2  0.1
0.5  0.4  0.3

The membership values to be taken: for object 1, 0.9; for object 2, 1; and for object 3, 0.5. The fuzzy algorithm gives the result that C1 (Cluster 1) contains the objects object 1, object 2 and object 3.

Fig. 8: Error Criterion (error square function plotted against no. of clusters, 1 to 31)

In the above graph the error criterion is plotted against the no. of clusters. The results shown here after the first iteration are converging, with a decreasing trend in the integral square error. As the desired error criterion is reduced, the no. of iterations needed for the objects to get into the clusters desired by the user increases. The algorithm runs till the fuzzy criterion is satisfied. Once the fuzzy criterion is satisfied, the objects are assigned to the respective clusters according to their maximum membership value, and we get the desired fuzzy clustering of the data provided.

7. RESULTS AND DISCUSSION
A partition clustering like the k-means algorithm cannot separate structures (patterns) properly, which has been easily achieved by fuzzy clustering. The single-link algorithm works well on this data, but is computationally expensive. So a hybrid approach may be used to exploit the desirable properties of both these algorithms. It increases the efficiency of the decision-making task. In order to retrieve documents relevant to a query, the query is matched with the cluster center rather than with all the documents. This helps in retrieving relevant documents efficiently. Also, in several applications involving large data sets, clustering is used to perform indexing, which helps in efficient decision-making. Here, a data reduction is achieved by representing the sub-clusters by their centers. The fuzzy data-clustering algorithm generated for the data-based controller of the multi-compressor system reduces data classification time and thus enhances the observability and controllability.

8. CONCLUSION & FUTURE SCOPE
In this paper, fuzzy clustering has been implemented for a multi-compressor system. In a cluster-based document retrieval technique, a large collection of documents is clustered and each of the clusters is represented using its center. The fuzzy data-clustering algorithm generated for the data-based controller of the multi-compressor system greatly enhances the efficiency. Though the application of Genetic Algorithms for designing fuzzy systems is recent, it
has seen increasing interest over the last few years and will allow fruitful research to be carried out in the building of fuzzy logic-based intelligent clustering systems.

Corresponding Author
Gursewak S. Brar
Dept. of Electrical Engineering
Baba Banda Singh Bahadur Engineering College, Fatehgarh Sahib, India.

REFERENCES
[1] Bentley, J. L., Friedman, J. H., “Fast algorithms for constructing minimal spanning trees in coordinate spaces,” IEEE Trans. on Computer. C-27, 6 (June), pp: 97–105, 1978.
[2] R L Cannon, J V Dave, J C Bezdek, “Efficient implementation of the fuzzy c-means clustering algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 8, issue 2, pp. 248-255, 1986.
[3] Sutton R. S., “Learning to predict by the methods of temporal differences,” Journal of Mach. Learn., vol. 3, no. 1, pp. 9–44, 1988.
[4] Bardossy A., Duckstein L., “Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Systems,” CRC Press, 1995.
[5] M. Delgado, A. Gomez-Skarmeta, M.A. Vila, “Hierarchical Clustering to validate Fuzzy Clustering,” IEEE fourth International Conference on Fuzzy Systems and second International Fuzzy Engineering Symposium, Proceedings of IEEE International Conference, vol. 4, pp. 1807-1812, 1995.
[6] F. Klawonn, R. Kruse, “Automatic Generation of Fuzzy Controllers by Fuzzy Clustering,” Fuzzy Systems, Proceedings of the Fifth IEEE International Conference, vol. 3, pp. 2053-2058, 1996.
[7] Bonissone P.P., Khedkar P.S., Chen Y., “Genetic Algorithms for Automated Tuning of Fuzzy Controllers: A Train Handling Application,” Proc. Fifth IEEE International Conference on Fuzzy Systems (FUZZ-IEEE'96), New Orleans, pp. 675-680, 1996.
[8] Peter Zoltan Baranyi, L. T. Koczy, T. D. Gedeon, “Improved Fuzzy and Neural Network Algorithms for word frequency Prediction in Document Filtering,” Journal of Advanced Computational Intelligence, vol. 2, No. 3, pp. 88-95, 1998.
[9] Jain A.K., Murty M.N., Flynn P.J., “Data Clustering: A Review,” ACM Computing Surveys, Vol. 31, No. 3, pp: 264-323, September 1999.
[10] Mohamed Marzouk, Osama Moselhi, “On the use of fuzzy clustering in construction simulation,” Proceedings of the 33rd conference on Winter simulation, Arlington, Virginia. IEEE Computer Society Washington, DC, USA. pp. 1547-1555, 2001.
[11] Joy K.V., “Advantages of Reciprocating Compressor” by ISHREE, Bombay Chapter. Journal of Air Conditioning and Refrigeration, Vol. 7, No. 1, pp: 147-156, Jan-March, 2004.
[12] Gaithersburg MD, Beall K. A., “Performance Characteristic of Refrigeration flow Compressor for natural gas compressor application,” Journal of Energy Resources Technology, Vol. 127, Issue 1, pp. 7-14, March 2005.
