
Data Mining Quiz 1 Clustering

Type: Graded Quiz | Questions: 8 | Time: 45m


Marks: 10
Q No: 1

Correct Answer
Marks: 1/1

The Silhouette Score is calculated using the following formula:

Silhouette score = (p − q) / max(p, q)

What do p and q represent?

p = mean distance to the points in the nearest cluster & q = mean intra-cluster distance to all the
points.
You Selected
p = mean distance to the points in the farthest cluster & q = mean intra-cluster distance to all the
points.

p = mean distance to the points in the nearest cluster & q = sum of the intra-cluster distance of all the
points.

p = mean distance to the points in the farthest cluster & q = sum of the intra-cluster distance of all
the points.
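
The formula can be checked by hand for a single point. The following is a minimal sketch, assuming Euclidean distance and small made-up clusters; the helper names `mean_distance` and `silhouette_of_point` are hypothetical and not part of the quiz.

```python
import math

def mean_distance(point, others):
    """Average Euclidean distance from `point` to each point in `others`."""
    return sum(math.dist(point, o) for o in others) / len(others)

def silhouette_of_point(point, own_cluster, other_clusters):
    # q: mean intra-cluster distance (to the other members of the point's own cluster)
    q = mean_distance(point, [m for m in own_cluster if m != point])
    # p: mean distance to the points of the nearest other cluster
    p = min(mean_distance(point, c) for c in other_clusters)
    return (p - q) / max(p, q)

# Illustrative data (an assumption, not from the quiz)
own = [(0, 0), (2, 0)]
others = [[(5, 0), (7, 0)], [(20, 0), (22, 0)]]

# q = 2, p = min(6, 21) = 6, so the score is (6 - 2) / 6
score = silhouette_of_point((0, 0), own, others)
print(round(score, 4))  # 0.6667
```

The nearest cluster, not the farthest, determines p, which is why the minimum over the other clusters is taken.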
Q No: 2

Correct Answer
Marks: 1/1
At p=2, the Minkowski distance will resemble which type of distance measure?

Euclidean Distance
You Selected
Manhattan Distance

Chebyshev Distance

None of the mentioned

Minkowski distance: d(x, y) = (Σ |xi − yi|^p)^(1/p)

For p = 2, d(x, y) becomes (Σ |xi − yi|²)^(1/2), which is the Euclidean distance.
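
As a quick check of how p changes the measure, here is a small sketch of the Minkowski distance in plain Python. The function name `minkowski` and the sample points are assumptions chosen for illustration.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

x, y = (0, 0), (3, 4)  # illustrative points, not from the quiz

print(minkowski(x, y, 1))  # 7.0  (p = 1 gives the Manhattan distance)
print(minkowski(x, y, 2))  # 5.0  (p = 2 gives the Euclidean distance)

# As p grows, the value approaches the Chebyshev distance max|xi - yi| = 4
print(round(minkowski(x, y, 50), 4))  # 4.0
```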

Q No: 3

Correct Answer
Marks: 1/1
Calculate the Euclidean distance between the points below:
p1= [2,3]
p2= [4,5]

2.626

3.100

2.423

2.828
You Selected

Euclidean Distance:

dist((x, y), (a, b)) = √((x − a)² + (y − b)²)

(2,3)

(4,5)

Find the differences: 2 − 4 = −2 and 3 − 5 = −2

Square and add the values: 4 + 4 = 8

Take the square root of the value: √8 = 2 × √2 = 2 × 1.414 = 2.828
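
The same three steps can be reproduced in a few lines of Python. This is only a sketch of the hand calculation; the variable names are illustrative.

```python
import math

p1 = [2, 3]
p2 = [4, 5]

# Step 1: coordinate-wise differences
diffs = [a - b for a, b in zip(p1, p2)]    # [-2, -2]

# Step 2: square and add
squared_sum = sum(d ** 2 for d in diffs)   # 8

# Step 3: square root
distance = math.sqrt(squared_sum)
print(round(distance, 3))  # 2.828
```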

Q No: 4

Correct Answer
Marks: 1/1

Calculate the Silhouette Score for the array below, using n_clusters=2:


import numpy as np
np.random.seed(7)
array = np.random.rand(20).reshape(10, 2)

[Hint: scale the array using StandardScaler]

0.4164

0.5478

0.4069
You Selected
0.3209
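
One possible way to work through this question is sketched below. The question does not name the clustering algorithm or its settings, so the use of `KMeans` with `random_state=0` and `n_init=10` is an assumption; the printed value can differ with other settings or scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Reproduce the array from the question
np.random.seed(7)
array = np.random.rand(20).reshape(10, 2)

# Scale each column to zero mean and unit variance, as the hint suggests
scaled = StandardScaler().fit_transform(array)

# Assumption: KMeans with a fixed seed; the quiz does not specify the algorithm
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(scaled)

score = silhouette_score(scaled, labels)
print(round(score, 4))
```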
Q No: 5

Correct Answer
Marks: 1/1
Calculate the Manhattan distance between points P1(4,4) and P2(9,9).

10
You Selected
(5,5)

None of the Mentioned

Manhattan Distance:

(4,4) (9,9)

d = |x2 − x1| + |y2 − y1|

d = |9 − 4| + |9 − 4| = 5 + 5 = 10
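
The calculation above can be sketched as a short function. The name `manhattan` is illustrative; the points are the ones from the question.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))

p1 = (4, 4)
p2 = (9, 9)
print(manhattan(p1, p2))  # 10
```

Note that the result is a single number: the option (5,5) lists the per-coordinate differences, which still have to be added.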

Q No: 6

Correct Answer
Marks: 1/1
An agglomerative clustering algorithm generates two different dendrograms. Which of the
following could cause this?

All of the mentioned.


You Selected
Due to the proximity function

Due to the data points used

Due to the variables used


Q No: 7

Correct Answer
Marks: 1/1
Agglomerative Clustering will start by considering all points as part of one big cluster

True

False
You Selected
Agglomerative Clustering starts by considering all points as individual clusters and then repeatedly merges the closest clusters, so the statement is false.
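
The bottom-up behaviour can be illustrated with a small pure-Python sketch. It assumes single linkage and Euclidean distance, and the function names and sample points are hypothetical; it is not how a library implements the algorithm.

```python
import math
from itertools import combinations

def single_linkage(c1, c2):
    """Distance between the closest pair of points, one from each cluster."""
    return min(math.dist(a, b) for a in c1 for b in c2)

def agglomerate(points):
    """Return the number of clusters after each merge step."""
    clusters = [[p] for p in points]   # start: every point is its own cluster
    sizes = [len(clusters)]
    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j) and merge j into i
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        sizes.append(len(clusters))
    return sizes

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]  # illustrative data
print(agglomerate(points))  # [5, 4, 3, 2, 1]
```

The cluster count starts at the number of points and drops by one per merge until a single cluster remains.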
Q No: 8

Correct Answer
Marks: 3/3

Use the dataset provided in the instructions.

The within-cluster sum of squares for 4 clusters is:

[Hint: Use KMeans Clustering and keep random_state=0]

1102.32

1694.33

1895.25
You Selected
2123.10

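from sklearn.cluster import KMeans

# dataset_scaled is assumed to be the provided dataset after scaling (not shown here)
kmeans = KMeans(n_clusters=4, random_state=0)
km = kmeans.fit(dataset_scaled)
print('The within-cluster sum of squares for 4 clusters is', round(km.inertia_, 2))

The within-cluster sum of squares for 4 clusters is 1895.25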
