Total Pages : 2
017606
May, 2023
B.Tech. (EEIOT) - VI Semester
Data Mining (PEC-CS-DS-601)
PART-A
Q1 (a) Give any three applications of data mining. (1.5)
(b) What are the advantages of DBSCAN over k-Means clustering algorithm? (1.5)
(c) Suppose a data warehouse consists of three dimensions customer, account and
branch and two measures count (number of customers in the branch) and
balance. Draw the schema diagram using star schema. (1.5)
(d) Give any two issues related to classification. (1.5)
(e) "Frequent itemset mining in data streams is a challenging task."
Explain. (1.5)
(f) Give reason for the following statement: "Data preprocessing is an important
step of data mining". (1.5)
(g) What is a data cube? (1.5)
(h) Define the term Frequent Itemset. (1.5)
(i) Differentiate between Offspring and Crossover. (1.5)
(j) What is spatial data mining? (1.5)
PART-B
Q2 (a) Explain frequent subgraph mining using Apriori method. (10)
(b) Explain the three tier architecture of Data Warehouse. (5)
Q3 (a) The following table shows the midterm and final exam grades obtained for
students in a database course:

X (Mid Term)    Y (Final Exam)
72              84
50              63
81              77
74              78
94              90
86              70
[?]             85
75              90
[?]5            80
65              74
(i) Use the method of least squares to find an equation for the prediction of a
student's final exam grade based on the student's midterm grade in the course. (10)
(ii) Predict the final exam grade of a student who received 86 marks on the
midterm exam with the above model. (5)
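
A minimal sketch of the least-squares fit this question asks for, assuming a simple linear model y = w0 + w1*x. The function names `least_squares_fit` and `predict` are illustrative, and the sample points are hypothetical: two midterm values in the table are not legible in this copy, so the paper's own data is not reproduced here.

```python
def least_squares_fit(xs, ys):
    """Fit y = w0 + w1 * x by ordinary least squares; return (w1, w0)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # w1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # w0 = mean_y - w1 * mean_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted y for a given x under the fitted line."""
    return intercept + slope * x


# Hypothetical data lying exactly on y = 1 + 2x (not the exam table).
sample_x = [1, 2, 3, 4]
sample_y = [3, 5, 7, 9]
slope, intercept = least_squares_fit(sample_x, sample_y)
print(slope, intercept)                # 2.0 1.0
print(predict(86, slope, intercept))   # 173.0
```

For part (ii), the same `predict` call with x = 86 would be applied to the slope and intercept fitted on the full table.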
Q4 (a) Explain k-medoids partitioning method along with its advantages and
disadvantages. (10)
(b) Explain the major components for characterizing time-series data. (5)
Q5 (a) Given two objects represented by the tuples (26, 1, 45, 10) and (20, 0, 36, 1): (5)
Time : 3 Hours]
[Max. Marks : 75
Instructions :
1. It is compulsory to answer all the questions (1.5 marks
each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted
adjacent to each other.
PART-A
1. (a) How is a data warehouse different from a
database? (1.5)
(b) Suppose that the minimum and maximum values for
the attribute income are $12,000 and $98,000,
respectively. We would like to map income to the
range [0.0, 1.0]. Transform a value of $73,600 for
income by min-max normalization. (1.5)
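
The transformation in (b) can be sketched as follows, using the standard min-max formula v' = (v - min) / (max - min) * (new_max - new_min) + new_min. The function name `min_max_normalize` is an illustrative choice, not part of the question.

```python
def min_max_normalize(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    scaled = (value - old_min) / (old_max - old_min)
    return scaled * (new_max - new_min) + new_min


# Income values from the question: min $12,000, max $98,000, value $73,600.
normalized = min_max_normalize(73_600, 12_000, 98_000)
print(round(normalized, 3))  # 0.716
```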
(c) Differentiate between ROLAP and HOLAP. (1.5)
(g) Write a short note on market basket analysis. (1.5)
(h) Define classifier accuracy. (1.5)
(i) What are the major issues in Data Mining? Explain them
briefly. (1.5)
(j) Describe Confusion Matrix. (1.5)
PART-B
2. (a) Define each of the following data mining functionalities:
characterization, association, classification, and
clustering and outlier analysis. Give examples of each
data mining functionality, using a real-life database
that you are familiar with. (10)
(b) In real-world data, tuples with missing values for some
attributes are a common occurrence. Describe various
methods for handling this problem. (5)
3. A data warehouse can be modelled by either a star schema
or a snowflake schema. Briefly describe the similarities
and the differences of the two models using suitable
examples, and then analyze their advantages and
disadvantages with regard to one another. (15)
5. What are the main objectives of clustering? Give the
categorization of clustering approaches. Briefly discuss. (15)
6.
TID     List of Item IDs
T100    I1, I2, I5
T200    I2, I4
T300    I2, I3
T400    I1, I2, I4
T500    I1, I3
T600    I2, I3
T700    I1, I3
T800    I1, I2, I3, I5
T900    I1, I2, I3
Table shows transactional data for an AllElectronics branch.
(a) Find all frequent itemsets using Apriori algorithm.
(b) List all the strong association rules (with support s and
confidence c) matching the following metarule, where X is a variable
representing customers, and item_i denotes variables
representing items (e.g., "A," "B,"): ∀x ∈ transaction,
buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3) [s, c]. (15)
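
A minimal sketch of Apriori on the nine transactions of question 6, as read from the table. The minimum support count of 2 is an assumption, since no threshold is legible in this copy of the paper; the function name `apriori` and its parameters are illustrative.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # L1: frequent 1-itemsets.
    items = sorted({item for t in transactions for item in t})
    current = {}
    for item in items:
        count = sum(1 for t in transactions if item in t)
        if count >= min_count:
            current[frozenset([item])] = count

    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support by scanning the transactions.
        current = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_count:
                current[c] = count
        frequent.update(current)
        k += 1
    return frequent


all_electronics = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
freq = apriori(all_electronics, min_count=2)  # min_count=2 is assumed
for itemset, count in sorted(freq.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

Under this assumed threshold the largest frequent itemsets are {I1, I2, I3} and {I1, I2, I5}, each with a support count of 2.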
602304/140/111/266 3 [P.T.O
Roll No.
Total Pages: 3
220505
December, 2019
MCA V SEMESTER
Data Warehousing and Data Mining (MCA-17-309(vi))
Instructions:
1. It is compulsory to answer all the questions (1.5 marks
each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted
adjacent to each other.
PART A
220505/60/111/212 [P.T.O
(f) What do you mean by continuous data in data
mining? (1.5)
(g) Differentiate between Pre-Pruning and Post-Pruning. (1.5)
(h) What is a Data Mart? (1.5)
(i) List 4 applications of Data Mining. (1.5)
(j) Define Web Mining. (1.5)
T100    {K, A, D, B}
T200    {D, A, C, E, B}
T300    {C, A, B, E}
T400    {B, A, D}
Find all frequent itemsets using Apriori.
List all of the strong association rules (with support s
and confidence c) matching the following metarule,
where X is a variable representing customers and item_i
denotes variables representing items:
∀x ∈ transactions, buys(X, item1) ∧ buys(X, item2)
⇒ buys(X, item3). (10)
(b) Explain various back-end tools and utilities used in a
data warehouse. (5)
7. Write short notes on (any two):
(a) 3-tier Architecture of Data Warehouse.
(b) Classification using Backpropagation.
(c) Various Clustering Techniques. (15)
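
For the metarule question on the T100 to T400 transactions, rules of the form buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3) can be enumerated from the 3-itemsets. The sketch below assumes the transactions as read from the table and uses hypothetical thresholds (minimum support 60%, minimum confidence 80%), since none are legible in this copy; `metarule_rules` is an illustrative name.

```python
from itertools import combinations


def metarule_rules(transactions, min_sup, min_conf):
    """Return rules (antecedent_pair, consequent, support, confidence)
    of the form {item1, item2} => item3 that meet both thresholds."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    rules = []
    for triple in combinations(items, 3):
        s = support(frozenset(triple))
        if s == 0 or s < min_sup:
            continue
        # Each item of the triple takes a turn as the consequent.
        for consequent in triple:
            antecedent = frozenset(triple) - {consequent}
            confidence = s / support(antecedent)
            if confidence >= min_conf:
                rules.append((tuple(sorted(antecedent)), consequent, s, confidence))
    return rules


transactions = [
    {"K", "A", "D", "B"},
    {"D", "A", "C", "E", "B"},
    {"C", "A", "B", "E"},
    {"B", "A", "D"},
]
# Thresholds below are assumptions, not values from the paper.
strong = metarule_rules(transactions, min_sup=0.6, min_conf=0.8)
for antecedent, consequent, s, c in strong:
    print(antecedent, "=>", consequent, f"[s={s:.2f}, c={c:.2f}]")
```

With these assumed thresholds only the itemset {A, B, D} is frequent enough, and the rule {A, B} ⇒ D is dropped because its confidence is 0.75.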