Roll No.
National Institute of Technology, Hamirpur
Department of Computer Science & Engineering
End Semester Examination- May, 2019
Course: B.Tech. IIITU
Semester: VIII
Subject: Data Warehouse and Data Mining
Code: CSD-421
Time: 03:00 hrs
Max. Marks: 60
Note: All questions are compulsory
Q1. Differentiate between the following pairs (give appropriate examples/diagrams): [10]
(a) Full materialization and Partial materialization
(b) Roll up and Drill down
(c) Snow flake and Star schema
(d) Closed frequent item set and Maximal frequent item set
(e) Data warehouse and Data mart
Q2. (a) Suppose a group of 12 sales price records has been sorted as follows: 5, 10, 11, 13, 15, [5]
35, 50, 55, 72, 92, 204, 215.
(a) Calculate mean, inter-quartile range and variance.
(b) Normalize the maximum value of the data to the range [0, 1].
(c) Partition the data using bin-median method.
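The computations asked for in Q2(a) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a model answer: the quartiles are taken as medians of the lower and upper halves (other conventions give a different inter-quartile range), both population and sample variance are shown because the question does not say which is meant, and the bin depth of 4 is a hypothetical choice since no bin count is given.

```python
import statistics

# Sorted sales prices from Q2(a).
prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
n = len(prices)

mean = statistics.mean(prices)

# Assumption: Q1 and Q3 are the medians of the lower and upper halves.
# Interpolation-based conventions would give different quartiles.
q1 = statistics.median(prices[: n // 2])
q3 = statistics.median(prices[(n + 1) // 2:])
iqr = q3 - q1

# Assumption: the question does not specify the variance type, so both are computed.
population_variance = statistics.pvariance(prices)  # divides by n
sample_variance = statistics.variance(prices)       # divides by n - 1


def min_max(value, lo=min(prices), hi=max(prices)):
    """Min-max normalization of a value to the range [0, 1]."""
    return (value - lo) / (hi - lo)


normalized_max = min_max(max(prices))

# Assumption: equal-depth bins of 4 values each (hypothetical bin depth).
BIN_DEPTH = 4
bins = [prices[i:i + BIN_DEPTH] for i in range(0, n, BIN_DEPTH)]

# Smoothing by bin medians: every value is replaced by the median of its bin.
smoothed = [statistics.median(b) for b in bins for _ in b]

print("mean:", mean)
print("Q1, Q3, IQR:", q1, q3, iqr)
print("population variance:", population_variance)
print("sample variance:", sample_variance)
print("normalized maximum:", normalized_max)
print("bins:", bins)
print("smoothed by bin medians:", smoothed)
```

Changing `BIN_DEPTH` or the quartile convention changes the binning and the inter-quartile range, so an answer should state which convention it uses.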
(b) The table given below shows data obtained during the outbreak of smallpox. Test the [5]
effectiveness of the vaccination in preventing the disease with the help of χ² at 5% level
of significance (χ² value at 0.05 significance level for one degree of freedom = 3.841).
Q6. (a) Cluster the following eight points (with (x, y) representing location) into three clusters: [6]
A1(2, 10), A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9). The
distance function is Minkowski distance. Suppose initially we assign A1, B1, and C1
as the center of each cluster, respectively. Use the k-means algorithm to show:
(a) The three cluster centers after the first round of execution.
(b) The final three clusters.
(b) State advantages and disadvantages of the following data mining techniques. Also, [4]
mention an application scenario for each.
(a) SVM
(b) Neural Network
(c) Fuzzy Logic
(d) Decision Tree
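
The k-means procedure described in Q6(a) can be sketched in Python as follows. This is an illustrative sketch under one assumption: the question names the Minkowski distance but gives no order, so p = 2 (Euclidean distance) is assumed here; a different p may lead to different clusters. The helper names `minkowski`, `assign` and `update` are hypothetical.

```python
# Points from Q6(a), keyed by the labels used in the question.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "B1": (5, 8),
    "B2": (7, 5), "B3": (6, 4), "C1": (1, 2), "C2": (4, 9),
}


def minkowski(a, b, p=2):
    """Minkowski distance of order p. p=2 (Euclidean) is an assumption."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def assign(centers, p=2):
    """Assign each point to its nearest center; returns one label list per center."""
    clusters = [[] for _ in centers]
    for name, pt in points.items():
        nearest = min(range(len(centers)),
                      key=lambda i: minkowski(pt, centers[i], p))
        clusters[nearest].append(name)
    return clusters


def update(clusters):
    """Recompute each center as the mean of its members.

    Assumes no cluster is empty, which holds for this data set.
    """
    return [
        (sum(points[n][0] for n in c) / len(c),
         sum(points[n][1] for n in c) / len(c))
        for c in clusters
    ]


# Round 1: start from A1, B1 and C1 as stated in the question.
centers = [points["A1"], points["B1"], points["C1"]]
clusters = assign(centers)
first_round_centers = update(clusters)

# Repeat assignment and update until the assignment stops changing.
centers = first_round_centers
while True:
    new_clusters = assign(centers)
    if new_clusters == clusters:
        break
    clusters = new_clusters
    centers = update(clusters)

final_clusters = clusters

print("centers after first round:", first_round_centers)
print("final clusters:", final_clusters)
print("final centers:", centers)
```

The loop stops when an assignment pass leaves every point in its current cluster, which is the usual convergence criterion for k-means.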