
MACHINE LEARNING ASSIGNMENT

FUZZY C MEANS CLUSTERING

UNDER THE GUIDANCE OF


PROF. DURGA DEVI
ASSISTANT PROFESSOR
Fuzzy C Means
• Fuzzy clustering is a form of clustering in which each
data point can belong to more than one cluster.
• Clustering or cluster analysis involves assigning data points to
clusters such that items in the same cluster are as similar as
possible, while items belonging to different clusters are as
dissimilar as possible. Clusters are identified via similarity
measures. 
• Fuzzy c-means (FCM) clustering was developed by J.C. Dunn in
1973.
• In non-fuzzy clustering (also known as hard clustering), data is
divided into distinct clusters, where each data point can only
belong to exactly one cluster. In fuzzy clustering, data points can
potentially belong to multiple clusters. For example, an apple
can be red or green (hard clustering), but an apple can also be
red AND green (fuzzy clustering). Here, the apple can be red to
a certain degree as well as green to a certain degree. Instead of
the apple belonging to green [green = 1] and not red [red = 0],
the apple can belong to green [green = 0.5] and red [red = 0.5].
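These values are normalized between 0 and 1. In fuzzy set theory
they are degrees of membership rather than probabilities and need
not add up to 1 in general; fuzzy c-means, however, constrains the
memberships of each data point to sum to 1 across the clusters,
as they do in this example.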
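As a small illustration of the apple example, the sketch below stores hard and fuzzy memberships as plain dictionaries and collapses a fuzzy membership to a hard label by picking the largest degree. The helper name `to_hard_label` and the membership values are hypothetical, chosen only for illustration.

```python
# Hypothetical membership degrees for one apple over two colour clusters.
hard_apple = {"red": 0.0, "green": 1.0}   # hard clustering: exactly one cluster
fuzzy_apple = {"red": 0.5, "green": 0.5}  # fuzzy clustering: partial degrees


def to_hard_label(membership):
    """Collapse fuzzy degrees to a single label by taking the largest degree."""
    return max(membership, key=membership.get)


mostly_green = {"red": 0.3, "green": 0.7}  # illustrative values, not from data
label = to_hard_label(mostly_green)
print(label)
```

Going from fuzzy to hard in this way discards the information about how strongly the apple belongs to each cluster, which is exactly what fuzzy clustering keeps.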
Algorithm
General Description
The fuzzy c-means algorithm is very similar to the k-means algorithm:
•Choose a number of clusters.
•Assign coefficients randomly to each data point for being in the
clusters.
•Repeat until the algorithm has converged (that is, the coefficients'
change between two iterations is no more than ε, the given
sensitivity threshold):
  •Compute the centroid for each cluster.
  •For each data point, compute its coefficients of being in the
  clusters.
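The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `fuzzy_c_means`, the defaults (`m=2.0`, `eps=1e-5`, `max_iter=100`, `seed=0`), the Euclidean distance, and the sample points are all choices made for this example.

```python
import random


def fuzzy_c_means(points, k, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Sketch of FCM on a list of equal-length tuples; returns (centers, u)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Assign random coefficients, normalised so each point's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(k)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Centroid of each cluster: mean of all points weighted by u_ij^m.
        centers = []
        for j in range(k):
            weights = [u[i][j] ** m for i in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))

        # Recompute each point's coefficients from its distances to the centroids.
        new_u = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5
                     for c in centers]
            if min(dists) == 0.0:
                # Point coincides with a centroid: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
            else:
                power = 2.0 / (m - 1.0)
                new_u.append([
                    1.0 / sum((d_j / d_l) ** power for d_l in dists)
                    for d_j in dists])

        # Converged when no coefficient changed by more than eps.
        change = max(abs(a - b)
                     for old, new in zip(u, new_u)
                     for a, b in zip(old, new))
        u = new_u
        if change <= eps:
            break
    return centers, u


# Two hypothetical, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
centers, u = fuzzy_c_means(points, k=2)
for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

Each printed row holds one point's coefficients for the two clusters. Which cluster index ends up covering which group depends on the random initialisation.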
 

Objective function
• Fuzzy c-means minimizes the objective function

$$\sum_{j=1}^{k} \sum_{x_i \in c_j} u_{ij}^{m} \, (x_i - \mu_j)^2$$

• Where
• uij is the degree to which an observation xi belongs to a cluster cj
• μj is the center of the cluster j
• m is the fuzzifier.
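To make the objective function concrete, here is a short sketch that evaluates it for given points, centers, and memberships. It assumes the inner sum runs over every observation (in fuzzy clustering each point has some degree of membership in every cluster) and uses squared Euclidean distance. The name `fcm_objective` and the toy data are hypothetical.

```python
def fcm_objective(points, centers, u, m=2.0):
    """Sum over points i and clusters j of u_ij^m * squared distance(x_i, mu_j)."""
    total = 0.0
    for i, x in enumerate(points):
        for j, mu in enumerate(centers):
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
            total += (u[i][j] ** m) * sq_dist
    return total


# Toy 1-D data: each point sits exactly on one of the two centers.
points = [(0.0,), (2.0,)]
centers = [(0.0,), (2.0,)]
crisp = [[1.0, 0.0], [0.0, 1.0]]  # each point fully in its own cluster
even = [[0.5, 0.5], [0.5, 0.5]]   # each point split equally

j_crisp = fcm_objective(points, centers, crisp)
j_even = fcm_objective(points, centers, even)
print(j_crisp, j_even)
```

With these centers, the crisp assignment gives a lower objective value than the even split, which is why points close to a center receive a high membership in that cluster.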

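The membership uij, which enters the objective function raised to the power m, is defined as follows:

$$u_{ij} = \frac{1}{\sum_{l=1}^{k} \left( \frac{|x_i - C_j|}{|x_i - C_l|} \right)^{\frac{2}{m-1}}}$$

where Cj is the centroid of cluster j (the center μj of the objective function) and the index l runs over all k clusters.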
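The membership formula can be checked with a few lines of Python. This is a sketch assuming Euclidean distance. The function name `fcm_memberships`, the handling of a point that coincides with a center, and the sample values are assumptions made for the example.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership of one point in every cluster, from its distances to the centers."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) ** 0.5 for c in centers]
    if 0.0 in dists:
        # Assumption: a point lying on a center belongs fully to that center.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_l) ** power for d_l in dists) for d_j in dists]


# A 1-D point at distance 1 from the first center and 3 from the second.
u = fcm_memberships((1.0,), [(0.0,), (4.0,)], m=2.0)
print([round(v, 3) for v in u])
```

With m = 2 the memberships are proportional to the inverse squared distances, 1 and 1/9 in this example, which normalise to 0.9 and 0.1.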

• The degree of belonging, uij, is linked inversely to the distance from x to the cluster center.
• The parameter m is a real number greater than 1 (1 < m < ∞) that defines the level of cluster
fuzziness. A value of m close to 1 gives a cluster solution that becomes increasingly
similar to the solution of hard clustering such as k-means, whereas as m approaches infinity the
solution becomes completely fuzzy, with every point belonging equally to every cluster.
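The effect of the fuzzifier can be seen by evaluating the membership formula for one point at several values of m. The distances and the helper name `memberships_from_distances` below are hypothetical, picked only to illustrate the trend.

```python
def memberships_from_distances(dists, m):
    """Memberships of one point given its (non-zero) distances to each center."""
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_l) ** power for d_l in dists) for d_j in dists]


dists = [1.0, 3.0]  # hypothetical distances to two cluster centers
results = {}
for m in (1.1, 2.0, 10.0, 100.0):
    results[m] = memberships_from_distances(dists, m)
    print(m, [round(v, 3) for v in results[m]])
```

As m grows, the membership in the nearer cluster falls from almost 1 toward 0.5, so the two clusters become harder to tell apart.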
In fuzzy clustering the centroid of a cluster is the
mean of all points, weighted by their degree of
belonging to the cluster:

$$C_j = \frac{\sum_{x_i \in c_j} u_{ij}^{m} \, x_i}{\sum_{x_i \in c_j} u_{ij}^{m}}$$

Where,
•Cj is the centroid of the cluster j
•uij is the degree to which an observation xi belongs to a
cluster cj
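The weighted mean can be written out directly. The sketch below computes one cluster's centroid from the points and their degrees of belonging to that cluster, with the sums taken over all points. The name `weighted_centroid` and the sample numbers are assumptions for illustration.

```python
def weighted_centroid(points, degrees, m=2.0):
    """Mean of all points, each weighted by its degree of belonging raised to m."""
    weights = [u ** m for u in degrees]
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(dim))


# Three 1-D points with hypothetical degrees of belonging to one cluster.
points = [(0.0,), (2.0,), (10.0,)]
degrees = [0.9, 0.8, 0.1]
centroid = weighted_centroid(points, degrees)
print(centroid)
```

The point at 10 has a low degree of belonging, so it barely pulls the centroid away from the two points near 0 and 2.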
This algorithm works by assigning each data point a
membership in every cluster, based on the distance
between the data point and the cluster center. The
closer a data point is to a cluster center, the higher
its membership in that cluster.
Advantages and Disadvantages
• Advantages
• 1) Gives good results for overlapping data sets, comparatively better than the k-means algorithm.
• 2) Unlike k-means, where a data point must belong exclusively to one cluster center, here each data point is assigned a membership to each cluster center, so a data point may belong to more than one cluster center.

• Disadvantages
• 1) A priori specification of the number of clusters.
• 2) We get a better result with this algorithm, but at the expense of a larger number of iterations.
• 3) Euclidean distance measures can unequally weight underlying factors.
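The last disadvantage can be illustrated with a feature measured on a much larger scale than another. The sketch below compares raw Euclidean distances with distances after dividing each feature by its range. Range rescaling is only one common preprocessing choice and is an assumption here, as are the helper names and the sample points.

```python
def euclid(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def rescale_by_range(points):
    """Divide every feature by its range (max - min) across the points."""
    dim = len(points[0])
    ranges = []
    for d in range(dim):
        column = [p[d] for p in points]
        span = max(column) - min(column)
        ranges.append(span if span > 0 else 1.0)
    return [tuple(p[d] / ranges[d] for d in range(dim)) for p in points]


# Hypothetical points: the second feature is on a far larger scale.
a, b, c = (1.0, 1000.0), (2.0, 1000.0), (1.0, 1100.0)
raw_ab, raw_ac = euclid(a, b), euclid(a, c)

sa, sb, sc = rescale_by_range([a, b, c])
scaled_ab, scaled_ac = euclid(sa, sb), euclid(sa, sc)
print(raw_ab, raw_ac, scaled_ab, scaled_ac)
```

On the raw data the second feature dominates the distance, while after rescaling both features contribute equally.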
Screenshots of Implementation
Code:
Outputs
Thank You
