
Roll No. .......................

Total Pages: 2

017606
May, 2023
B.Tech. (EEIOT) - VI Semester
Data Mining (PEC-CS-DS-601)

Time: 3 Hours                                        Max. Marks: 75


Instructions: 1. It is compulsory to answer all the questions (1.5 marks each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted adjacent to each other.

PART-A
Q1 (a) Give any three applications of Data Mining. (1.5)
(b) What are the advantages of DBSCAN over the k-Means clustering algorithm? (1.5)
(c) Suppose a data warehouse consists of three dimensions customer, account and branch, and two measures count (number of customers in the branch) and balance. Draw the schema diagram using a star schema. (1.5)
(d) Give any two issues related to classification. (1.5)
(e) "Frequent itemset mining in data streams is a challenging task." Explain. (1.5)
(f) Give a reason for the following statement: "Data preprocessing is an important step of Data Mining." (1.5)
(g) What is a data cube? (1.5)
(h) Define the term Frequent Itemset. (1.5)
(i) Differentiate between Offspring and Crossover. (1.5)
(j) What is spatial data mining? (1.5)

PART-B
Q2 (a) Explain frequent subgraph mining using Apriori method. (10)
(b) Explain the three-tier architecture of a Data Warehouse. (5)

Q3 (a) The following table shows the midterm and final exam grades obtained for students in a database course:

    X (Midterm)    Y (Final Exam)
    72             84
    50             63
    81             77
    74             78
    94             90
    86             70
    …              85
    75             90
    …              80
    65             74


i) Use the method of least squares to find an equation for the prediction of a student's final exam grade based on the student's midterm grade in the course. (10)
ii) Predict the final exam grade of a student who received 86 marks on the midterm exam with the above model. (5)
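For reference, a minimal Python sketch of the least-squares fit asked for in part i). The lists below hold only the readable (x, y) pairs from the table; the two rows with illegible midterm marks are omitted, so the printed coefficients are illustrative rather than the expected exam answer.

    # Simple linear regression y = w1*x + w0 by the method of least squares.
    x = [72, 50, 81, 74, 94, 86, 75, 65]    # midterm grades (readable rows only)
    y = [84, 63, 77, 78, 90, 70, 90, 74]    # corresponding final-exam grades

    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    w1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
         sum((xi - mean_x) ** 2 for xi in x)   # slope
    w0 = mean_y - w1 * mean_x                  # intercept

    print(f"y = {w1:.3f}x + {w0:.3f}")
    print("Predicted final grade for midterm = 86:", round(w1 * 86 + w0, 1))  # part ii)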

Q4 (a) Explain the k-medoids partitioning method along with its advantages and disadvantages. (10)
(b) Explain the major components for characterizing time-series data. (5)
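To make the idea in Q4(a) concrete, a simplified PAM-style k-medoids sketch (assuming numpy is available). It repeatedly swaps a medoid with a non-medoid point whenever the swap lowers the total distance cost, which is the core of the method, though it omits PAM's BUILD phase.

    import numpy as np

    def k_medoids(X, k, max_iter=100, seed=0):
        # Pairwise Manhattan distances between all points.
        dist = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
        rng = np.random.default_rng(seed)
        medoids = rng.choice(len(X), size=k, replace=False)

        def cost(m):
            return dist[:, m].min(axis=1).sum()

        best = cost(medoids)
        for _ in range(max_iter):
            improved = False
            for i in range(k):                    # try swapping each current medoid...
                for h in range(len(X)):           # ...with each non-medoid point
                    if h in medoids:
                        continue
                    trial = medoids.copy()
                    trial[i] = h
                    c = cost(trial)
                    if c < best:                  # keep the swap only if total cost drops
                        medoids, best, improved = trial, c, True
            if not improved:
                break
        labels = dist[:, medoids].argmin(axis=1)  # each point goes to its nearest medoid
        return X[medoids], labels

    # Example: 60 two-dimensional points around three centres.
    pts = np.vstack([np.random.default_rng(s).normal(c, 0.5, (20, 2))
                     for s, c in enumerate((0, 5, 10))])
    centres, labels = k_medoids(pts, k=3)

Because the medoids are actual data points and the cost uses raw distances, the method is less sensitive to outliers than k-means, at the price of a higher per-iteration cost.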

Q5 (a) Given two objects represented by the tuples (26, 1, 45, 10) and (20, 0, 36, 1): (5)

(i) Compute the Euclidean distance between the two objects.


(ii) Compute the Manhattan distance between the two objects.
(b) Differentiate between ROLAP, MOLAP and HOLAP. (10)
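A small check for Q5(a), assuming the second tuple's last coordinate is 1 as printed above:

    # Euclidean (L2) and Manhattan (L1) distances between the two objects of Q5(a).
    p = (26, 1, 45, 10)
    q = (20, 0, 36, 1)

    euclidean = sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5   # sqrt(36 + 1 + 81 + 81) = sqrt(199)
    manhattan = sum(abs(a - b) for a, b in zip(p, q))            # 6 + 1 + 9 + 9

    print(round(euclidean, 2), manhattan)   # ≈ 14.11 and 25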

Q6 (a) Explain the characteristics of Social Networks. (5)


(b) The transaction details are given in the following table. What are the confidence and support of the association rule Nuts => Egg, Milk? (5)

    TID     T1                   T2                 T3                  T4                      T5
    Items   Beer, Nuts, Pencil   Nuts, Egg, Milk    Egg, Milk, Pencil   Beer, Nuts, Milk, Egg   Nuts, Egg, Milk, Salt
(c) Using the above transaction table, draw the FP-Growth tree. (5)
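For Q6(b), a short sketch that counts support and confidence directly, assuming the transaction contents are as reconstructed in the table above (parts of the scanned table are hard to read):

    # Support and confidence of the rule Nuts => {Egg, Milk} over the five transactions.
    transactions = [
        {"Beer", "Nuts", "Pencil"},         # T1
        {"Nuts", "Egg", "Milk"},            # T2
        {"Egg", "Milk", "Pencil"},          # T3
        {"Beer", "Nuts", "Milk", "Egg"},    # T4
        {"Nuts", "Egg", "Milk", "Salt"},    # T5
    ]

    antecedent, consequent = {"Nuts"}, {"Egg", "Milk"}
    n_ante = sum(antecedent <= t for t in transactions)                 # transactions containing Nuts
    n_both = sum((antecedent | consequent) <= t for t in transactions)  # containing Nuts, Egg and Milk

    support = n_both / len(transactions)   # 3/5 = 0.60
    confidence = n_both / n_ante           # 3/4 = 0.75
    print(support, confidence)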

Q7 Write short notes on: (15)


• KDD
• Random Sampling
• Decision Tree Induction
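As a pointer for the "Decision Tree Induction" note, a minimal sketch using scikit-learn (assumed to be available) that induces a tree with entropy-based, information-gain-style splits on the bundled Iris data:

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = load_iris(return_X_y=True)
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
    tree.fit(X, y)              # recursive partitioning on the attribute giving the best information gain
    print(export_text(tree))    # the induced tree shown as nested split rules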



Roll No. .......................

Total Pages: 3

602304
December, 2022
MCA-III SEMESTER
Data Warehousing and Data Mining (MCA-20-207-4)

Time: 3 Hours]                                        [Max. Marks: 75

Instructions:
1. It is compulsory to answer all the questions (1.5 marks each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted adjacent to each other.

PART-A
1. (a) How is a data warehouse different from a database? (1.5)
(b) Suppose that the minimum and maximum values for the attribute income are $12,000 and $98,000, respectively. We would like to map income to the range [0.0, 1.0]. Transform a value of $73,600 for income by min-max normalization. (1.5)
(c) Differentiate between ROLAP and HOLAP. (1.5)


(d) Discuss an example of a Multilevel Association rule. (1.5)
(e) Define frequent itemset, support and confidence. (1.5)
(f) How would you measure the quality of clusters? (1.5)
(g) Write a short note on market basket analysis. (1.5)
(h) Define classifier accuracy. (1.5)
(i) What are the major issues in Data Mining? Explain them briefly. (1.5)
(j) Describe Confusion Matrix. (1.5)

PART-B

2. (a) Define each of the following data mining functionalities: characterization, association, classification, clustering and outlier analysis. Give examples of each data mining functionality, using a real-life database that you are familiar with. (10)
(b) In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem. (5)

3. A data warehouse can be modelled by either a star schema or a snowflake schema. Briefly describe the similarities and the differences of the two models using suitable examples, and then analyze their advantages and disadvantages with regard to one another. (15)

4. (a) Explain data mining as a step in the knowledge discovery process. (10)
(b) With illustrative examples, explain various OLAP operations. (5)

5. What is the main objective of clustering? Give the categorization of clustering approaches. Briefly discuss them. (15)

6. The following table shows transactional data for an AllElectronics branch.

    TID     List of Item IDs
    T100    I1, I2, I5
    T200    I2, I4
    T300    I2, I3
    T400    I1, I2, I4
    T500    I1, I3
    T600    I2, I3
    T700    I1, I3
    T800    I1, I2, I3, I5
    T900    I1, I2, I3

(a) Find all frequent itemsets using the Apriori algorithm.
(b) List all the strong association rules (with support s and confidence c) matching the following metarule, where X is a variable representing customers and item_i denotes variables representing items (e.g., "A", "B"):
    ∀X ∈ transaction, buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3) [s, c]. (15)

7. Write short notes on:
(a) Back end tools and utilities.
(b) Indexing OLAP data.
(c) Data Mining Query Language. (5×3=15)
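For Q6(a) of the paper above, a simplified level-wise Apriori sketch in Python. The minimum support count of 2 is an assumption for illustration, since the question does not state one, and candidates are pruned only by support counting rather than by the full subset-pruning step.

    from itertools import combinations

    # Level-wise frequent-itemset mining over the nine AllElectronics transactions of Q6.
    transactions = [
        {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
        {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
        {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
    ]
    min_sup = 2   # assumed minimum support count

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    frequent = {}                                             # frozenset -> support count
    level = [frozenset([i]) for i in {i for t in transactions for i in t}]
    while level:
        level = [s for s in level if support(s) >= min_sup]   # prune infrequent candidates
        frequent.update({s: support(s) for s in level})
        size = len(level[0]) + 1 if level else 0
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        level = list({a | b for a, b in combinations(level, 2) if len(a | b) == size})

    for s in sorted(frequent, key=lambda s: (len(s), sorted(s))):
        print(sorted(s), frequent[s])

The same loop, with min_sup set to 60% of the number of transactions, applies to the four-transaction Apriori question (Q3) in the December 2019 paper further below.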
Roll No.
Total Pages: 3

220505
December, 2019
MCA V SEMESTER
Data Warehousing and Data Mining (MCA-17-309(vi))

Time: 3 Hours]                                        [Max. Marks: 75

Instructions:
1. It is compulsory to answer all the questions (1.5 marks each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted adjacent to each other.

PART A

1. (a) Define Data warehouse. (1.5)


(b) What is a Concept hierarchy? (1.5)
(c) Define Data Cube. (1.5)
(d) What is Bitmap Indexing? (1.5)
(e) Differentiate between supervised Learning and
Unsupervised Learning. (1.5)

(f) What do you mean by continuous data in data mining? (1.5)
(g) Differentiate between Pre-Pruning and Post-Pruning. (1.5)
(h) What is a Data Mart? (1.5)
(i) List 4 applications of Data Mining. (1.5)
(j) Define Web Mining. (1.5)

PART B

2. (a) How is a data warehouse different from a database? How is it similar? (10)
(b) Explain various schemas of a data warehouse. (5)

3. (a) A database has four transactions. Let min_sup = 60% and min_conf = 80%.

    TID     ITEMS BOUGHT
    T100    {K, A, D, B}
    T200    {D, A, C, E, B}
    T300    {C, A, B, E}
    T400    {B, A, D}

Find all frequent itemsets using Apriori. List all of the strong association rules (with support s and confidence c) matching the following metarule, where X is a variable representing customers and item_i denotes variables representing items:
    ∀x ∈ transactions, buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3). (10)
(b) Explain various back-end tools and utilities used in a data warehouse. (5)

4. (a) Outline the major steps of Decision Tree Classification. (10)
(b) Explain Constraint based Association Mining. (5)

5. (a) What is Data Mining? Explain Data Mining as a step in the process of Knowledge Discovery from Databases. (10)
(b) Explain any two Attribute Selection Measures. (5)

6. What is DMQL? Explain data mining task primitives for specifying a data mining task. (15)

7. Write short notes on (any two):
(a) 3-tier Architecture of Data Warehouse.
(b) Classification using Backpropagation.
(c) Various Clustering Techniques. (15)

