
K.D.K. COLLEGE OF ENGINEERING, NAGPUR
KDK/SSE/3SEM/22-23


Department of Computer Science and Engineering
B.E. (Computer Science & Engineering) Seventh Semester (C.B.S.)
Data Warehousing & Mining

UNIT I
1. a) Define data mining. What are the steps involved in the KDD process? 8
b) Write a short note on:-
i) Classification of Data mining.
ii) Data mining task primitives. 6

1. a) Name at least six characteristic features of a data warehouse. Explain any three of them in
detail.

2. a) Describe the typical architecture of a data mining system. 6


b) Why is preprocessing necessary in data mining? Explain various preprocessing techniques
in brief.

2. a) Differentiate between OLTP and OLAP technology. 7


b) What is the data Architecture of data operations? 7

UNIT II
Q: What do you mean by data mining? Explain the KDD process. 7
3. a) Define data warehousing. State its characteristic features. 3
b) Differentiate between OLTP and OLAP. 6
c) List the different data warehouse schemas. Briefly explain the star schema. 4

3. a) What do you mean by data mining? Explain the KDD process. 8


b) Describe the major issues of data mining. 5

4. a) Discuss the different types of OLAP servers. 8


b) Write a short note on any one.
i) Cube materialization. ii) Attribute-oriented induction. 5

4. a) Write short notes on:
i) Data reduction. ii) Data cleaning.
iii) Data transformation. 7
b) Draw and explain the architecture of data mining.

UNIT III

6. a) What is cluster analysis? What are the requirements for cluster analysis? 7
b) Explain the K-means clustering method with a suitable example. 7
5. a) Differentiate between classification and prediction with a suitable example. 7
b) Write a short note on rule-based classification with an example. 7

5. a) Given the following six transactions on items {A, B, C, D, E}: 8


TID Items
1 A, B, C
2 A, B, C
3 B, C
4 B, D
5 B, C, D, E
6 E
Use the Apriori algorithm to compute the frequent itemsets and their support.
Generate association rules from the frequent itemsets. The minimum support count is 2.
b) Write a short note on constraint-based association mining. 5

6. a) Discuss various kinds of Association rules. 6


b) Given the following six transactions on items: 7

TID Items
T1 A, B, E
T2 B, C, D
T3 B, D, E
T4 C, D, E
T5 B, C, D, E
T6 B, C, E
Use the FP-growth algorithm to compute the frequent itemsets. Draw the FP-tree.
Minimum support is 20%.

UNIT IV

7. a) Discuss the importance of Association rule mining. 5


b) What do you mean by mining frequent patterns, associations and correlations? Elaborate
by giving an example. 8
7. a) Explain the support vector machine with a suitable diagram. 6
b) Discuss different issues related to classification & prediction. 7

8. a) What do you mean by market basket analysis, and how does it help in a supermarket? 8
b) Describe methods for improving the efficiency of the Apriori and FP-growth algorithms. 5
8. a) Explain classification by decision tree induction with an example. 7
b) Describe Naive Bayesian classification. 6

UNIT V

9. Write short notes on: 13M


i) Text mining. ii) Web content mining.
iii) Web structure mining. iv) Visual web data mining.

9. a) What is clustering? How does it differ from classification? Also give its application areas. 4
b) Differentiate between K-means and K-medoids. 6
c) What is an outlier? Why is outlier mining important? 5
10. a) What do you mean by web mining? Explain web usage mining. 6
b) Differentiate between temporal and spatial data mining. 7
10. Write a short note on:-
a) K-means partitioning method 4
b) Agglomerative and divisive hierarchical clustering 2
c) Outlier detection 3
d) DBSCAN clustering 4

UNIT VI
11. a) Describe in detail Big data technology and tools. 6
b) What do you understand by the MapReduce paradigm and Hadoop? 7
11. a) Describe the following methodologies for stream data processing:
i) Random sampling
ii) Histograms 6
b) Write a short note on social network analysis. Give a real-life example to support your
answer. 7

12. a) List the features of HDFS. Also explain the significance of the Secondary NameNode. 7
b) What is big data analytics? What are the characteristics of big data? Also explain the
application areas of big data analytics.
12. a) What is multi-relational data mining? Explain various approaches for multi-relational
classification. 6
b) Illustrate how sequential patterns can be mined in biological data. 7
