B.E. / B.Tech. DEGREE EXAMINATION, APRIL / MAY 2008

Seventh Semester

Information Technology

CS 1004 – DATA WAREHOUSING AND MINING

(Regulation 2004)

Time: Three hours Maximum: 100 marks

Answer ALL questions.

PART A – (10 x 2 = 20 marks)

1. Compare OLTP and OLAP systems.
2. What is Data Warehouse Metadata?
3. What is Dimensionality Reduction?
4. What is Concept Description?
5. List two interestingness measures for association rules.
6. What are Iceberg queries?
7. What is classification?
8. What is cluster analysis?
9. What is Web Usage Mining?
10. What is Visual Data Mining?

PART B – (5 x 16 = 80 marks)

11. (a) Briefly compare the following concepts. Explain your points with an example.
(i) Snowflake schema, fact constellation, star net query model [Marks 5]
(ii) Data cleaning, data transformation, refresh [Marks 5]
(iii) Discovery-driven cube, multifeature cube, virtual warehouse [Marks 6]

Or

(b) What are the differences between the three main types of data usage: information processing, analytical processing and data mining? Discuss the motivation behind OLAP mining. [Marks 16]

12. (a) For class characterization, what are the main differences between a data cube based implementation and a relational implementation such as attribute-oriented induction? Discuss which method is most efficient and under what condition this is so. [Marks 16]

Or

(b) (i) List and discuss the various data mining primitives.
(ii) With relevant examples, discuss the role of statistics in data mining.

13. (a) Explain, with an algorithm, how to mine single-dimensional Boolean association rules from a transactional database. [Marks 16]

Or

(b) With an algorithm, explain constraint-based association mining. [Marks 16]

14. (a) What are Bayesian classifiers? Explain in detail about:
(i) Naïve Bayesian classification [Marks 8]
(ii) Linear and multiple regression. [Marks 8]

Or

(b) Why is outlier mining important? Briefly describe the different approaches behind statistical-based outlier detection, distance-based outlier detection and deviation-based outlier detection. Give relevant example. [Marks 16]

15. (a) (i) What is multidimensional analysis? Discuss the same with an example. [Marks 8]
(ii) Discuss how data mining is done in spatial databases. Give relevant example. [Marks 8]

Or

(b) (i) Discuss data mining in multimedia databases.
(ii) What is time series analysis? Discuss the same with an example.