
Q1: The Levenshtein distance algorithm has been used in:

(1 mark)

Q2: Association: Using the Apriori algorithm, show how the frequent itemsets can be found in
the following data set, where the minimum support count is 2.

TID   Items bought
T1    computer, mouse, camera
T2    mouse, printer
T3    mouse, keyboard
T4    computer, mouse, printer
T5    computer, keyboard
T6    mouse, keyboard
T7    computer, keyboard
T8    computer, mouse, keyboard, camera
T9    computer, mouse, keyboard

You should explain what happens at each step, along with the data produced.

(4 marks)
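
For checking a hand-worked answer to Q2, the level-wise procedure can be sketched in Python. This is a minimal sketch, not a model answer: the names `TRANSACTIONS`, `MIN_SUPPORT`, `support` and `apriori` are illustrative choices, and the join step here simply unions pairs of frequent k-itemsets rather than using the prefix-based join found in some textbook presentations.

```python
from itertools import combinations

# Transactions copied from the table in Q2.
TRANSACTIONS = [
    {"computer", "mouse", "camera"},              # T1
    {"mouse", "printer"},                         # T2
    {"mouse", "keyboard"},                        # T3
    {"computer", "mouse", "printer"},             # T4
    {"computer", "keyboard"},                     # T5
    {"mouse", "keyboard"},                        # T6
    {"computer", "keyboard"},                     # T7
    {"computer", "mouse", "keyboard", "camera"},  # T8
    {"computer", "mouse", "keyboard"},            # T9
]
MIN_SUPPORT = 2  # minimum support count given in the question


def support(itemset):
    """Count the transactions that contain every item of itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def apriori():
    """Return {k: {itemset: support count}} for every frequent k-itemset."""
    items = sorted(set().union(*TRANSACTIONS))
    level = [frozenset([i]) for i in items]
    level = [s for s in level if support(s) >= MIN_SUPPORT]
    frequent, k = {}, 1
    while level:
        frequent[k] = {s: support(s) for s in level}
        # Join step: union pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: drop candidates that have an infrequent k-subset.
        candidates = [
            c for c in candidates
            if all(frozenset(sub) in frequent[k] for sub in combinations(c, k))
        ]
        # Count step: keep candidates that meet the minimum support count.
        level = [c for c in candidates if support(c) >= MIN_SUPPORT]
        k += 1
    return frequent


frequent = apriori()
for k, sets in frequent.items():
    for itemset, count in sorted(sets.items(), key=lambda kv: sorted(kv[0])):
        print(k, sorted(itemset), count)
```

Each pass of the loop corresponds to one step a written answer would show: the candidate set, the pruned candidates, and the surviving frequent itemsets with their counts.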

(Optional) Q3: Clustering: Suppose that the data mining task is to cluster the following
eight points (with (x, y) representing location) into three clusters:
A1(2, 10), A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9).
The distance function is Euclidean distance. Suppose initially we assign A1, B1, and C1 as
the center of each cluster, respectively. Use the k-means algorithm to show only
(a) the three cluster centers after the first round of execution, and
(b) the final three clusters.
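
The k-means rounds asked for above can be traced with a short Python sketch. It is a minimal illustration under stated assumptions: the names `POINTS`, `assign`, `recompute` and `kmeans` are hypothetical, a round is taken to mean one assignment step followed by one center update, and no cluster is assumed to become empty (true for this data).

```python
from math import dist  # Euclidean distance, available from Python 3.8

# The eight points from the question.
POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}


def assign(centers):
    """Put each point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for name, p in POINTS.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters


def recompute(clusters):
    """Return the mean point of each cluster (assumes no cluster is empty)."""
    centers = []
    for members in clusters:
        xs = [POINTS[m][0] for m in members]
        ys = [POINTS[m][1] for m in members]
        centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centers


def kmeans(centers):
    """Run rounds until the centers stop moving; return every round."""
    history = []
    while True:
        clusters = assign(centers)
        new_centers = recompute(clusters)
        history.append((clusters, new_centers))
        if new_centers == centers:
            return history
        centers = new_centers


# Initial centers A1, B1 and C1, as stated in the question.
history = kmeans([POINTS["A1"], POINTS["B1"], POINTS["C1"]])
print("centers after round 1:", history[0][1])
print("final clusters:", history[-1][0])
```

`history[0]` holds the clusters and centers after the first round, which corresponds to part (a), and `history[-1]` holds the converged result for part (b).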
Q3: Clustering: What would be the distance matrix after each of the first three mergers
if complete-link clustering is used?

(4 marks)
Q4: Classification: Given the following data set:

1
2
3
4
5
6
7
Using the CART algorithm, which one of the above three attributes will be the root of the
decision tree?
(4 marks)

Q5: Text Mining: Normalize the vectors (20, 10, 8, 12, 56) and (0, 15, 12, 8, 0). Calculate
the distance between the two normalized vectors using the dot product formula.

(4 marks)
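
The arithmetic in Q5 can be checked with a few lines of Python. This sketch assumes that "normalize" means scaling each vector to unit Euclidean length, so that the dot product of the normalized vectors is their cosine similarity. Reporting the distance as 1 minus that similarity is one common convention and is an assumption here, since the question does not define the distance. The helper names `normalize` and `dot` are illustrative.

```python
from math import sqrt


def normalize(v):
    """Scale v to unit Euclidean length (assumed meaning of 'normalize')."""
    length = sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


v1 = (20, 10, 8, 12, 56)
v2 = (0, 15, 12, 8, 0)

u1, u2 = normalize(v1), normalize(v2)
similarity = dot(u1, u2)   # cosine similarity of the original vectors
distance = 1 - similarity  # assumption: cosine distance = 1 - similarity

print("normalized v1:", u1)
print("normalized v2:", u2)
print("dot product:", similarity)
print("1 - dot product:", distance)
```

By hand, the length of the first vector is sqrt(3844) = 62, the length of the second is sqrt(433), and the unnormalized dot product is 342, so the dot product of the normalized vectors is 342 / (62 * sqrt(433)).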
