stased,as
Warehouses,
(ii) Draw a schema diagram for the above data warehouse using one of the schema classes listed in (a).
(iii) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004?
(iv) To obtain the same list, write an SQL query assuming the data is stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge). (5) (CO 1)
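One possible shape of the query asked for in part (iv) is sketched below, run against an in-memory SQLite database from Python. The table name fee is read from the garbled schema line, and every row of sample data is invented purely for illustration. The WHERE clause plays the role of selecting the year 2004, and GROUP BY doctor aggregates away the patient and the finer time levels.

```python
import sqlite3

# Schema as listed in part (iv). The table name "fee" and all rows below
# are assumptions for illustration, not data from the question paper.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE fee (day INTEGER, month INTEGER, year INTEGER, '
    'doctor TEXT, hospital TEXT, patient TEXT, "count" INTEGER, charge REAL)'
)
rows = [
    (1, 1, 2004, "Dr. A", "H1", "P1", 1, 100.0),
    (2, 1, 2004, "Dr. A", "H1", "P2", 1, 150.0),
    (3, 2, 2004, "Dr. B", "H2", "P1", 1, 200.0),
    (4, 3, 2005, "Dr. A", "H1", "P3", 1, 300.0),  # not in 2004, so excluded
]
conn.executemany("INSERT INTO fee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Total fee collected by each doctor in 2004.
QUERY = """
SELECT doctor, SUM(charge) AS total_fee
FROM fee
WHERE year = 2004
GROUP BY doctor
ORDER BY doctor
"""
totals = conn.execute(QUERY).fetchall()
print(totals)
```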
Q3
a) Generate the FP-tree for the following dataset (min_supp = 3). (5) (CO 1)
TID  Itemset
f, a, c, d, g, m, p
a, b, c, f, l, m, o
b, f, h, o
b, k, c, p
a, f, c, l, p, m, n
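A minimal sketch of FP-tree construction for the dataset in part a) follows. It assumes the stray periods in the scanned itemsets are commas and that each item is a single letter. Items with equal support are ordered alphabetically, which is an assumption: other tie-breaking orders give a differently shaped but equally valid tree.

```python
from collections import Counter

class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_supp):
    # Pass 1: count each item once per transaction and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_supp}
    # Pass 2: insert each transaction with items sorted by descending support.
    # Ties are broken alphabetically (an assumption, see the lead-in).
    root = Node()
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root, frequent

def show(node, depth=0):
    """Print the tree, one node per line, indented by depth."""
    for child in sorted(node.children.values(), key=lambda n: n.item):
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

# Itemsets as read from the question (separators repaired).
transactions = [
    list("facdgmp"),
    list("abcflmo"),
    list("bfho"),
    list("bkcp"),
    list("afclpmn"),
]
root, frequent = build_fp_tree(transactions, min_supp=3)
show(root)
```

Under this reading the frequent items are f and c (support 4) and a, b, m, p (support 3); all other items fall below min_supp and are dropped before insertion.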
b) For the following transaction dataset, generate rules using the Apriori algorithm. Consider Support = 50% and Confidence = 75%. (5) (CO 1)
TID  Items Purchased
Bread, Cheese, Egg, Juice
Bread, Cheese, Juice
Bread, Milk, Yoghurt
Bread, Juice, Milk
Cheese, Juice, Milk
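The level-wise search in part b) can be sketched as below. The first transaction is taken to read Bread, Cheese, Egg, Juice, which is an assumption about the garbled scan; the third item occurs only once under any reading, so it does not affect the result. Exact fractions are used so the 50% and 75% thresholds are compared without rounding.

```python
from fractions import Fraction
from itertools import combinations

transactions = [
    {"Bread", "Cheese", "Egg", "Juice"},  # assumed reading of the garbled row
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yoghurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]
MIN_SUPPORT = Fraction(1, 2)
MIN_CONFIDENCE = Fraction(3, 4)

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def apriori():
    """Return {frozenset: support} for all frequent itemsets, level by level."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= MIN_SUPPORT]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, then check support.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support(c) >= MIN_SUPPORT]
    return frequent

def rules(frequent):
    """Yield (antecedent, consequent, confidence) meeting MIN_CONFIDENCE."""
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                conf = supp / frequent[lhs]
                if conf >= MIN_CONFIDENCE:
                    yield lhs, itemset - lhs, conf

frequent = apriori()
found = sorted((sorted(l), sorted(r), c) for l, r, c in rules(frequent))
for lhs, rhs, conf in found:
    print(lhs, "->", rhs, f"confidence={float(conf):.2f}")
```

With five transactions, 50% support means an itemset must occur in at least three of them; confidence of a rule A -> B is support(A and B together) divided by support(A).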
Roll No. Total Pages: 3
003603
August/September 2022
B.Tech. (CE/CSE) VI SEMESTER
DATA MINING (PEC-CSD-601)
Instructions:
1. It is compulsory to answer all the questions (1.5 marks
each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted
adjacent to each other.
PART-A
1. (a) Define Data Cube. Give an example. (1.5)
(b) What is the role of the metadata repository in a data warehouse? (1.5)
(c) Give an example for snow-flake schema. (1.5)
(d) What is data cleaning? (1.5)
(e) Define confidence of an association rule. (1.5)
(f) Give an example for maximal frequent itemsets. (1.5)
(g) How effective are Bayesian classifiers? (1.5)
(h) What is the need of outlier detection? List two applications of it. (1.5)
(i) What are the objectives of clustering? (1.5)
(j) Differentiate frequent subsequence and frequent substructure. (1.5)
PART-B
2. (a) Describe 2-tier and 3-tier Architecture of Data Warehouse with a neat sketch. (10)
(b) Design Fact constellation table with suitable example. (5)
3. (a) What is the curse of dimensionality? How to reduce it? (5)
(b) With necessary diagrams and examples of data cubes explain various OLAP operations. (10)
4. Describe the various phases in knowledge discovery process with a neat diagram. (15)
5. (a) How will you solve a classification problem using Bayesian Belief Networks? (5)
(b) Apply FP-Growth algorithm to the following transactional data to find frequent itemsets. List all frequent itemsets with their support count. (10)
TID  List of item IDs
1  iLii5
2  2i4jsi8
3
4
5  i2i446jg
6  ili2iis
7
8
9
10  i1.i3i4å6
6. (a) Describe kNN Algorithm for data classification with appropriate example. (10)
(b) Discuss about key issues in Hierarchical clustering. (5)
7. Discuss the similarity measures and distance measures frequently used in clustering the data. (15)
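For the question on similarity and distance measures used in clustering, a minimal sketch of four commonly used measures is given below: Euclidean distance, Manhattan distance, cosine similarity, and Jaccard similarity. The choice of these four and the sample points are assumptions made for illustration; the question paper does not name specific measures.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm

def jaccard_similarity(s, t):
    """Size of the intersection over size of the union of two non-empty sets."""
    return len(s & t) / len(s | t)

# Made-up sample points.
p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))                                      # 5.0
print(manhattan(p, q))                                      # 7.0
print(cosine_similarity((1.0, 0.0), (0.0, 1.0)))            # 0.0
print(jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```

The first two are distances (smaller means more alike) and suit numeric attributes; the last two are similarities (larger means more alike), with Jaccard applying to set-valued or binary data.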