
Register Number

Programme: MCA                                   Semester: 03
Course Code & Title: 16CAT41 - Data Warehousing and Data Mining
Max. Marks: 100                                  Duration: 3 Hrs
(Specify any Chart, Tables or others to be permitted.)

Knowledge Levels (KL)    K1 - Remembering       K4 - Analysing
                         K2 - Understanding     K5 - Evaluating
                         K3 - Applying          K6 - Creating

Part – A - Answer ALL Questions. 10 x 2 = 20 Marks


No. Question KL
1. Compare OLTP and OLAP. K4
2. Are data mart and data warehouse identical? Support your answer with an example. K4
3. What makes a pattern interesting? Can a data mining system generate all of the interesting patterns? K4
4. Justify the need for data preprocessing in the data mining process. K4
5. How will you evaluate support and confidence metrics for generating association rules? K5
6. How is constraint-based frequent pattern mining done? K2
7. Why is tree pruning useful in decision tree induction? What is the drawback of using a separate set of tuples to evaluate pruning? K4
8. How is prediction different from classification? K4
9. Classify hierarchical clustering methods. K3
10. What do you mean by outlier analysis? K2

Part – B - Answer ALL Questions. 5 x 13 = 65 Marks


Marks KL
11. (a) Define a data warehouse. With the help of a neat sketch, explain the various components in a data warehousing system. 13 K2
OR
    (b) What is a multiprocessor architecture? List and discuss the steps involved in mapping a data warehouse to a multiprocessor architecture. 13 K3

12. (a) i. List and discuss the steps for integrating a data mining system with a data warehouse. 7 K2
        ii. Discuss the classification of data mining systems. 6 K2
OR
(b) Elucidate various methods of data cleaning in detail. 13 K4

13. (a) Giving a concrete example, explain a method that performs frequent itemset mining using the prior knowledge of frequent itemset properties. 13 K3
OR
    (b) i. Write and explain the algorithm for mining frequent itemsets without candidate generation. 7 K2
        ii. Evaluate the above algorithm on the transactions below to find all frequent itemsets. 6 K5

            TID     Items_bought
            T100    {M, O, N, K, E, Y}
            T200    {D, O, N, K, E, Y}
            T300    {M, A, K, E}
            T400    {M, U, C, K, Y}
            T500    {C, O, O, K, I, E}
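
A brute-force cross-check of part ii can be sketched in Python as below; it is not FP-growth itself, and the minimum support count of 3 (60%) is an assumption, since the question does not state a threshold. An FP-growth run should report the same frequent itemsets.

    from itertools import combinations

    # Transactions of question 13(b) ii; the duplicate O in T500 disappears
    # because each transaction is treated as a set of items.
    transactions = [
        set("MONKEY"),
        set("DONKEY"),
        set("MAKE"),
        set("MUCKY"),
        set("COOKIE"),
    ]

    MIN_SUP = 3  # assumed minimum support count (60%); not given in the question

    def support(itemset):
        """Number of transactions containing every item of the itemset."""
        return sum(1 for t in transactions if itemset <= t)

    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, k):
            s = support(set(combo))
            if s >= MIN_SUP:
                frequent[frozenset(combo)] = s
                found_any = True
        if not found_any:   # no frequent k-itemset, so no larger itemset can be frequent
            break

    for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), s)

Under that assumed threshold the enumeration lists {K}, {E}, {M}, {O}, {Y}, {K,E}, {K,M}, {K,O}, {K,Y}, {E,O} and {K,E,O} as frequent.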

14. (a) i. What is a decision tree? Explain how classification is done using decision tree induction. 8 K4
        ii. Give a brief account of constraint-based classification algorithms. 5 K2
OR
    (b) Develop an algorithm for classification using Bayesian classification. Illustrate the algorithm with a relevant example. 13 K3
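
For 14(b), a minimal naive Bayesian classifier can be sketched as below; the two attributes (outlook, windy) and the training tuples are hypothetical, chosen only to make the sketch runnable, and Laplace smoothing is an added assumption rather than part of the question.

    from collections import Counter, defaultdict

    # Hypothetical training tuples: ((outlook, windy), class label)
    train = [
        (("sunny", "false"), "no"),
        (("sunny", "true"), "no"),
        (("overcast", "false"), "yes"),
        (("rain", "false"), "yes"),
        (("rain", "true"), "no"),
        (("overcast", "true"), "yes"),
    ]

    class_counts = Counter(label for _, label in train)
    # attr_counts[class][attribute index][value] = frequency in the training data
    attr_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in train:
        for i, value in enumerate(features):
            attr_counts[label][i][value] += 1

    def classify(features):
        """Pick the class maximising P(C) * prod_i P(x_i | C), with Laplace smoothing."""
        best_class, best_p = None, -1.0
        for c, n_c in class_counts.items():
            p = n_c / len(train)                      # prior P(C)
            for i, value in enumerate(features):
                domain = {f[i] for f, _ in train}     # distinct values of attribute i
                p *= (attr_counts[c][i][value] + 1) / (n_c + len(domain))  # P(x_i | C)
            if p > best_p:
                best_class, best_p = c, p
        return best_class

    print(classify(("sunny", "true")))   # -> "no" on this toy data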

15. (a) i. Compare and contrast CLARA and CLARANS. 6 K4


        ii. With an example, explain grid-based clustering. 7 K3
OR
    (b) i. Why is outlier mining important? Describe the different approaches behind statistical-based outlier detection, distance-based outlier detection and deviation-based outlier detection. 13 K4

Part – C 1 x 15 = 15 Marks


Marks KL
16. (a) Evaluate the K-means partitioning algorithm for the five points {X1, X2, X3, X4, X5} with the following coordinates as a two-dimensional sample for clustering: X1 = (0.5, 2.5); X2 = (0, 0); X3 = (1.5, 1); X4 = (5, 1); X5 = (6, 2). 15 K5
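
A runnable sketch of the iteration asked for in 16(a) follows; k = 2 and the use of X1 and X4 as initial centroids are assumptions, since the question fixes neither.

    # Five two-dimensional sample points from question 16(a).
    points = {
        "X1": (0.5, 2.5), "X2": (0.0, 0.0), "X3": (1.5, 1.0),
        "X4": (5.0, 1.0), "X5": (6.0, 2.0),
    }

    def dist2(a, b):
        """Squared Euclidean distance (sufficient for comparing distances)."""
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    centroids = [points["X1"], points["X4"]]     # assumed k = 2 and initial seeds
    for _ in range(10):                          # repeat until the assignment stabilises
        clusters = [[] for _ in centroids]
        for name, p in points.items():
            nearest = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(name)
        new_centroids = [
            (sum(points[n][0] for n in c) / len(c), sum(points[n][1] for n in c) / len(c))
            for c in clusters
        ]
        if new_centroids == centroids:           # converged: the means no longer move
            break
        centroids = new_centroids

    print(clusters, centroids)

With these seeds the assignment settles after one update, giving the clusters {X1, X2, X3} and {X4, X5} with means (2/3, 7/6) and (5.5, 1.5).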
OR
    (b) Suppose you have the set C of all frequent closed itemsets on a data set D, as well as the support count for each frequent closed itemset. Describe an algorithm to determine whether a given itemset X is frequent or not, and the support of X if it is frequent. 15 K4
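
One possible algorithm for 16(b): X is frequent if and only if at least one itemset in C contains X, and in that case the support of X equals the largest support count among the closed itemsets containing it (that maximum is attained by the closure of X). A minimal sketch, with a hypothetical example set C:

    def support_from_closed(X, closed_itemsets):
        """closed_itemsets maps each frequent closed itemset (frozenset) to its support count."""
        supports = [s for c, s in closed_itemsets.items() if X <= c]
        if not supports:
            return None           # no closed superset in C, so X is not frequent
        return max(supports)      # support of the closure of X

    # Hypothetical contents of C, for illustration only.
    C = {frozenset("AB"): 4, frozenset("ABC"): 3, frozenset("ACD"): 2}
    print(support_from_closed(frozenset("A"), C))    # -> 4
    print(support_from_closed(frozenset("BD"), C))   # -> None (not frequent)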
