r05321204-data-warehousing-and-data-mining

Published by: SRINIVASA RAO GANTA on Aug 28, 2008
Copyright: Attribution Non-commercial
Code No: R05321204
Set No. 1
III B.Tech II Semester Regular Examinations, Apr/May 2008
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 hours
Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks
⋆ ⋆ ⋆ ⋆ ⋆
1. (a) Draw and explain the architecture for on-line analytical mining.
(b) Briefly discuss the data warehouse applications.
[8+8]
2. Briefly discuss the role of data cube aggregation and dimension reduction in the
data reduction process.
[16]
3. Write the syntax for the following data mining primitives:
(a) Task-relevant data.
(b) Concept hierarchies.
[16]
4. Write short notes for the following in detail:
(a) Measuring the central tendency
(b) Measuring the dispersion of data.
[16]
5. (a) Write the FP-growth algorithm. Explain.
(b) What is an iceberg query? Explain with an example.
[10+6]
6. (a) What is classification? What is prediction?
(b) What is Bayes' theorem? Explain naive Bayesian classification.
(c) Discuss k-nearest neighbor classifiers and case-based reasoning.
[4+6+6]

7. (a) Given the following measurements for the variable age:
18, 22, 25, 42, 28, 43, 33, 35, 56, 28
Standardize the variable by the following:

i. Compute the mean absolute deviation of age.
ii. Compute the z-score for the first four measurements.
(b) What is a distance-based outlier? What are efficient algorithms for mining
distance-based outliers? How are outliers determined in this method?
[4+4+2+3+3]
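
The arithmetic asked for in 7(a) can be sketched in a few lines of Python. This is an illustrative sketch, not an official solution: the function names are hypothetical, and it assumes the z-score is scaled by the mean absolute deviation computed in part (i), as the wording of the question suggests. Many texts scale by the standard deviation instead.

```python
# Measurements for the variable age, copied from question 7(a).
ages = [18, 22, 25, 42, 28, 43, 33, 35, 56, 28]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def z_scores(values, count):
    """z-scores of the first `count` values.

    Assumption: the scale is the mean absolute deviation,
    not the standard deviation.
    """
    m = mean(values)
    s = mean_absolute_deviation(values)
    return [(x - m) / s for x in values[:count]]


if __name__ == "__main__":
    print("mean:", mean(ages))
    print("mean absolute deviation:", mean_absolute_deviation(ages))
    print("z-scores (first four):", [round(z, 2) for z in z_scores(ages, 4)])
```

With these ten values the mean is 33 and the mean absolute deviation is 8.8, so the first four z-scores are roughly -1.70, -1.25, -0.91, and 1.02.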

8. An e-mail database is a database that stores a large number of electronic mail messages. It can be viewed as a semistructured database consisting mainly of text data. Discuss the following.

(a) How can such an e-mail database be structured so as to facilitate multidimensional search, such as by sender, by receiver, by subject, by time, and so on?

(b) What can be mined from such an e-mail database?

(c) Suppose you have roughly classified a set of your previous e-mail messages as junk, unimportant, normal, or important. Describe how a data mining system may take this as the training set to automatically classify new e-mail messages or unclassified ones.

[5+5+6]
⋆ ⋆ ⋆ ⋆ ⋆
Code No: R05321204
Set No. 2
III B.Tech II Semester Regular Examinations, Apr/May 2008
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 hours
Max Marks: 80
Answer any FIVE Questions
All Questions carry equal marks
⋆ ⋆ ⋆ ⋆ ⋆
1. (a) Explain data mining as a step in the process of knowledge discovery.
(b) Differentiate operational database systems and data warehousing.
[8+8]
2. (a) Briefly discuss data integration.
(b) Briefly discuss data transformation.
[8+8]
3. (a) Explain the syntax for task-relevant data specification.
(b) Explain the syntax for specifying the kind of knowledge to be mined.
[8+8]
4. (a) Write the algorithm for attribute-oriented induction. Explain the steps
involved in it.
(b) How can concept description mining be performed incrementally and in a
distributed manner?
[8+8]
5. Explain the Apriori algorithm with an example.
[16]
6. Discuss classification by backpropagation.
[16]
7. (a) Write algorithms for k-Means and k-Medoids. Explain.
(b) Discuss density-based methods.
[8+8]

8. Suppose that a city transportation department would like to perform data analysis on highway traffic for the planning of highway construction based on the city traffic data collected at different hours every day.

(a) Design a spatial data warehouse that stores the highway traffic information so that people can easily see the average and peak time traffic flow by highway, by time of day, and by weekdays, and the traffic situation when a major accident occurs.

(b) What information can we mine from such a spatial data warehouse to help
city planners?

(c) This data warehouse contains both spatial and temporal data. Propose one mining technique that can efficiently mine interesting patterns from such a spatio-temporal data warehouse.

[5+5+6]
⋆ ⋆ ⋆ ⋆ ⋆