UNIT I
1. a) Define data mining. What are the steps involved in the KDD process? 8
b) Write short notes on:
i) Classification of data mining.
ii) Data mining task primitives. 6
1. a) Name at least six characteristic features of a data warehouse. Explain any three of them in
detail.
UNIT II
Q: What do you mean by data mining? Explain the KDD process. 7
3. a) Define data warehousing. State its characteristic features. 3
b) Differentiate between OLTP and OLAP. 6
c) List the different DW schemas. Briefly explain the star schema. 4
UNIT III
6. a) What is cluster analysis? What are the requirements for cluster analysis? 7
b) Explain the K-means clustering method with a suitable example. 7
5. a) Differentiate between classification and prediction with a suitable example. 7
b) Write a short note on rule-based classification with an example. 7
TID Items
T1 A, B, E
T2 B, C, D
T3 B, D, E
T4 C, D, E
T5 B, C, D, E
T6 B, C, E
Use the FP-growth algorithm to compute the frequent itemsets. Draw the FP-tree.
Minimum support is 20%.
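Note for checking a hand-worked answer: the sketch below is not FP-growth. It is a plain brute-force enumeration over the six transactions in the table, offered only as a way to cross-check the frequent itemsets obtained from the FP-tree. The names TRANSACTIONS, MIN_SUPPORT and frequent_itemsets are illustrative assumptions and do not come from the question paper.

```python
from itertools import combinations

# Transactions copied from the table in the question.
TRANSACTIONS = {
    "T1": {"A", "B", "E"},
    "T2": {"B", "C", "D"},
    "T3": {"B", "D", "E"},
    "T4": {"C", "D", "E"},
    "T5": {"B", "C", "D", "E"},
    "T6": {"B", "C", "E"},
}
MIN_SUPPORT = 0.20  # 20% of 6 transactions, i.e. at least 2 occurrences


def frequent_itemsets(transactions, min_support):
    """Brute-force search (NOT FP-growth): test every candidate itemset.

    Returns a dict mapping each frequent itemset (a sorted tuple of items)
    to the number of transactions that contain it.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Count the transactions that contain every item of the candidate.
            count = sum(1 for t in transactions.values() if set(candidate) <= t)
            if count / n >= min_support:
                result[candidate] = count
                found_any = True
        # If no itemset of this size is frequent, no larger one can be
        # (downward-closure property), so the search can stop here.
        if not found_any:
            break
    return result


frequent = frequent_itemsets(TRANSACTIONS, MIN_SUPPORT)
for itemset, count in frequent.items():
    print("{" + ", ".join(itemset) + "}", count)
```

With a 20% threshold on six transactions, an itemset must occur in at least two transactions, so item A (present only in T1) drops out before the FP-tree is built. Brute force is only practical here because the item universe is tiny; FP-growth avoids generating candidates altogether.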
UNIT IV
8. a) What do you mean by market basket analysis, and how does it help in a supermarket? 8
b) Describe how the efficiency of the Apriori and FP-growth algorithms can be improved. 5
8. a) Explain classification by decision tree induction with an example. 7
b) Describe Naive Bayesian classification. 6
UNIT V
9. a) What is clustering? How does it differ from classification? Also give its application areas. 4
b) Differentiate between K-means and K-medoids. 6
c) What is an outlier? Why is outlier mining important? 5
10. a) What do you mean by web mining? Explain web usage mining. 6
b) Differentiate between temporal and spatial data mining. 7
10. Write short notes on:
a) K-means partitioning method 4
b) Agglomerative and divisive hierarchical clustering. 2
c) Outlier detection 3
d) DBSCAN clustering 4
UNIT VI
11. a) Describe in detail big data technologies and tools. 6
b) What do you understand by the MapReduce paradigm and Hadoop? 7
11. a) Describe the following methodologies for stream data processing:
i) Random sampling
ii) Histograms 6
b) Write a short note on social network analysis. Give a real-life example to support your
answer. 7
12. a) List the features of HDFS. Also explain the significance of the Secondary NameNode. 7
b) What is big data analytics? What are the characteristics of big data? Also explain the
application areas of big data analytics.
12. a) What is multi-relational data mining? Explain various approaches for multi-relational
classification. 6
b) Illustrate how sequential patterns can be mined in biological data. 7