
ASSIGNMENT: MACHINE LEARNING

1. Explain the K-means clustering algorithm with a suitable example.


A clustering algorithm looks for natural groups in data on the basis of some
similarity measure. It locates the centroid of each group of data points and, to
cluster effectively, evaluates the distance between each point and the centroid
of its cluster. The goal of clustering is to determine the intrinsic grouping
in a set of unlabelled data.

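K-means Clustering: K-means (MacQueen, 1967) is one of the simplest
unsupervised learning algorithms for the well-known clustering problem.
K-means clustering is a method of vector quantization, originally from signal
processing, that is popular for cluster analysis in data mining. Given K, it
repeatedly assigns each point to the nearest of K centroids and then recomputes
each centroid as the mean of its assigned points, until the assignments stop
changing.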

K-means Clustering Example: A pizza chain wants to open delivery centres
across a city. What challenges would it face?

• They need to analyse the areas from which pizza is ordered frequently.
• They need to work out how many pizza stores have to be opened to cover
delivery in each area.
• They need to choose store locations within these areas so that the
distance between each store and its delivery points is kept to a minimum.

K-means fits this problem: with K set to the number of stores, the order
locations are clustered and each centroid suggests a store location.
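The pizza example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order coordinates are made up, K is fixed at 2 stores, and the initial centres are simply two of the order locations rather than a random pick. The function name `kmeans_2d` is hypothetical.

```python
import math


def kmeans_2d(points, centres, max_iter=100):
    """Basic K-means on 2-D points, starting from the given centres."""
    centres = list(centres)
    labels = None
    for _ in range(max_iter):
        # Assignment step: label each point with the index of its nearest centre.
        new_labels = [
            min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centre to the mean of its assigned points.
        for j in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centres


# Hypothetical order locations (x, y) on a city grid, in km.
orders = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

# Assumption: two stores, seeded at two of the order locations.
labels, stores = kmeans_2d(orders, centres=[orders[0], orders[3]])
print(labels)  # cluster index of each order
print(stores)  # suggested store locations (cluster centroids)
```

Each returned centroid is the point that minimises the total squared distance to the orders in its cluster, which is why it is a natural candidate for a store location.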

2. How does the K-means clustering algorithm differ from hierarchical clustering?

• Hierarchical clustering does not handle big data well, but K-means
clustering does. For a fixed K and number of iterations, the time complexity
of K-means is linear in the number of data points n, i.e. O(n), while that of
hierarchical clustering is quadratic, i.e. O(n^2).
• K-means starts from a random choice of initial cluster centres, so the
results of running the algorithm multiple times may differ. The results of
hierarchical clustering are reproducible.
• K-means works well when the clusters are hyperspherical (like a circle
in 2D or a sphere in 3D).
• K-means requires prior knowledge of K, i.e. the number of clusters you
want to divide your data into. In hierarchical clustering, you can stop at
whatever number of clusters you find appropriate by interpreting the
dendrogram.
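The run-to-run variation of K-means can be illustrated with a small sketch. The data set below is made up, and two fixed sets of starting centres stand in for two different random draws, so that the example is repeatable. The function name `run_kmeans` is hypothetical.

```python
def run_kmeans(data, centres, max_iter=100):
    """1-D K-means from the given starting centres; returns the final clusters."""
    centres = list(centres)
    clusters = None
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centre.
        new_clusters = [[] for _ in centres]
        for x in data:
            j = min(range(len(centres)), key=lambda i: abs(x - centres[i]))
            new_clusters[j].append(x)
        # Stop once the clusters no longer change.
        if new_clusters == clusters:
            break
        clusters = new_clusters
        # Update step: each centre becomes the mean of its cluster.
        centres = [
            sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)
        ]
    return clusters


# Hypothetical data with three natural groups, clustered with K=2.
data = [0, 1, 10, 11, 20, 21]

# Two different starting points, standing in for two random initialisations.
run_a = run_kmeans(data, centres=[0, 1])
run_b = run_kmeans(data, centres=[20, 21])
print(run_a)
print(run_b)
```

Both runs converge, but to different groupings: the middle pair of values ends up with the upper group in one run and with the lower group in the other. This is why K-means is often run several times from different starting centres.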
3. Explain Multiclass classification.

Multiclass classification is a classification task with more than two
classes (e.g. using a model to identify animal types in images from an
encyclopedia). In multiclass classification, a sample can have only one class
(i.e. an elephant is only an elephant; it is not also a lemur). Outside of
regression, multiclass classification is probably the most common machine
learning task. In classification, we are given a number of training examples
divided into K separate classes, and we build a machine learning model to
predict which of those classes previously unseen data belongs to (e.g. the
animal types from the example above). From the training data, the model learns
patterns specific to each class and uses those patterns to predict the class
membership of future data.
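As a rough illustration of the animal example, the sketch below trains a very simple multiclass model with K=3 classes. It is an assumption-laden toy: the features (body mass and shoulder height), the numbers, and the helper names `fit_centroids` and `predict` are all made up, and a nearest-centroid rule is used only because it is short, not because the text prescribes it.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Learn one 'pattern' per class: the mean feature vector of its examples."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign exactly one class: the one whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical training data: (body mass in kg, shoulder height in m).
train_x = [(5000, 3.0), (5400, 3.2), (2.2, 0.4), (2.6, 0.45), (190, 1.1), (210, 1.2)]
train_y = ["elephant", "elephant", "lemur", "lemur", "lion", "lion"]

centroids = fit_centroids(train_x, train_y)

# Previously unseen animals each receive a single class label.
print(predict(centroids, (4800, 2.9)))
print(predict(centroids, (3.0, 0.5)))
print(predict(centroids, (180, 1.0)))
```

The features are left unscaled here, so body mass dominates the distance; a real model would normally standardise the features first.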

By contrast, a task with only two classes is binary classification. For
example, a cybersecurity company might want to monitor a user’s email inbox and
classify incoming emails as either potential phishing attempts or not. To do
so, it might train a classification model on the email texts and inbound email
addresses and learn to predict from which sorts of URLs threatening emails tend
to originate.

As another example, a marketing company might serve an online ad and want to
predict whether a given customer will click on it. (This is also a binary
classification problem.)

4. Explain Classification vs. clustering

BASIS FOR COMPARISON   CLASSIFICATION                 CLUSTERING

Basic                  Classifies the data into one   Maps the data into one of
                       of numerous predefined         multiple clusters, where the
                       classes.                       arrangement of data items
                                                      relies on the similarities
                                                      between them.

Involved in            Supervised learning.           Unsupervised learning.

Training sample        Labelled data is provided.     Unlabelled data is provided.

5. Apply the K-means algorithm to the given data for k=3. Use C1 (2), C2 (16)
and C3 (38) as the initial cluster centres.

Data: 2,4,6,3,31,13,15,16,38,35,14,21,23,25,3

Each value is assigned to the nearest of the given means 2, 16 and 38, so the
first clusters formed are:

C1(2) C2(16) C3(38)

K1={2,4,6,3,3} K2={13,14,15,16,21,23,25} K3={31,35,38}

From the clusters K1, K2 and K3, the new means are 18/5 = 3.6, 127/7 ≈ 18.1
and 104/3 ≈ 34.7. Reassigning each value to its nearest new mean gives:

C1(3.6) C2(18.1) C3(34.7)

K1={2,4,6,3,3} K2={13,14,15,16,21,23,25} K3={31,35,38}

We stop here because the clusters are unchanged, so recomputing the means
would give the same values again.

Therefore, the 3 clusters formed are:

K1={2,4,6,3,3} K2={13,14,15,16,21,23,25} K3={31,35,38}
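The hand calculation above can be checked with a short Python sketch. It assumes the usual K-means loop (assign each value to the nearest centre, then recompute each centre as the cluster mean) and stops when the clusters no longer change. The function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(data, centres, max_iter=100):
    """Plain 1-D K-means starting from the given initial centres."""
    centres = list(centres)
    clusters = None
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster of its nearest centre.
        new_clusters = [[] for _ in centres]
        for x in data:
            nearest = min(range(len(centres)), key=lambda j: abs(x - centres[j]))
            new_clusters[nearest].append(x)
        # Stop once the assignments no longer change.
        if new_clusters == clusters:
            break
        clusters = new_clusters
        # Update step: each centre becomes the mean of its cluster.
        centres = [
            sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)
        ]
    return clusters, centres


data = [2, 4, 6, 3, 31, 13, 15, 16, 38, 35, 14, 21, 23, 25, 3]
clusters, centres = kmeans_1d(data, [2, 16, 38])
for members, mean in zip(clusters, centres):
    print(sorted(members), round(mean, 2))
# [2, 3, 3, 4, 6] 3.6
# [13, 14, 15, 16, 21, 23, 25] 18.14
# [31, 35, 38] 34.67
```

The loop ends on the second pass because the assignment step reproduces the same three clusters, which matches the stopping point of the worked solution.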
