
Reg. No.

16CAT33 - Data Warehousing and Data Mining Set 1


Knowledge Levels (KL): K1 – Remembering, K2 – Understanding, K3 – Applying, K4 – Analyzing, K5 – Evaluating, K6 – Creating

Part – A. Answer ALL questions. 8 x 2 = 16 marks


No Question KL
1 State the disadvantages of the Apriori algorithm. K1
2 Define negatively correlated and strongly negatively correlated patterns. K2
3 How do the support and confidence measures help in generating association rules? K3
4 State Bayes’ theorem for estimating the probabilities of classification. K1
5 Define mutually exclusive and exhaustive rules. K2
6 Why are support vector machines popular for classification problems? K4
7 What do you mean by cluster analysis? K2
8 Specify the two major kinds of clustering methods for high dimensional data. K1

Part B. 2 x 13 = 26 marks Marks KL

9 (a) Illustrate the Apriori algorithm for mining frequent itemsets using candidate generation. 13 K3
(OR)
(b) (i) Briefly outline the major steps of decision tree classification. 7 K2
(ii) Explain rule induction using a sequential covering algorithm. 6 K2
10 (a) (i) Briefly describe agglomerative and divisive hierarchical clustering. 7 K2
(ii) Summarize the challenges of outlier detection. 6 K2
(OR)
(b) Illustrate the challenges and major methodologies involved in conducting cluster analysis on high-dimensional data. 13 K2

Part C. 1 x 8 = 8 marks Marks KL


11 Both the k-means and k-medoids algorithms can perform effective clustering. Illustrate the strengths and weaknesses of k-means in comparison with k-medoids. 8 K4
Reg. No.

16CAT33 - Data Warehousing and Data Mining Set 2

Knowledge Levels (KL): K1 – Remembering, K2 – Understanding, K3 – Applying, K4 – Analyzing, K5 – Evaluating, K6 – Creating

Part – A. Answer ALL questions. 8 x 2 = 16 marks

No Question KL

1 Define the antimonotonicity property of frequent itemsets. K2

2 List the constraints that are considered for frequent pattern mining. K1

3 State the steps involved in association rule mining. K1

4 How are decision trees used for classification? K2

5 Define coverage and accuracy of a rule-based classifier. K2

6 What do you mean by a multilayer feed-forward neural network? K2

7 How does the k-means algorithm work? K2

8 How might we improve the quality and scalability of CLARA? K4

Part B. 2 x 13 = 26 marks Marks KL

9 (a) (i) Why is tree pruning useful in decision tree induction? Point out the drawback of using a separate set of tuples to evaluate pruning. 7 K2
(ii) Show that accuracy is a function of sensitivity and specificity. 6 K3

(OR)
(b) Illustrate the FP-growth algorithm for mining frequent itemsets without candidate generation. 13 K3
10 (a) (i) Explain how DBSCAN is useful for density-based clustering. 7 K2
(ii) Illustrate k-means algorithm for partitioning clusters. 6 K2
(OR)

(b) Explain grid-based clustering methods with STING and CLIQUE. 13 K2

Part C. 1 x 8 = 8 marks Marks KL


11 A database has five transactions. Let min_sup = 60% and min_conf = 80%. 8 K4

   TID    Items Bought
   T100   {M, O, N, K, E, Y}
   T200   {D, O, N, K, E, Y}
   T300   {M, A, K, E}
   T400   {M, U, C, K, Y}
   T500   {C, O, O, K, I, E}

   Find all frequent itemsets using the Apriori algorithm.
