J.C. Bose University of Science and Technology, YMCA, Faridabad, Haryana

(Please write your Enrolment Number)


SESSIONAL-1 EXAMINATION
Subject Code: PEC-CS-D-601 Enrolment Number: ........
Subject: Data Mining
Time: 1.5 Hours Maximum Marks: 30
Q1 (2.5 x 4)
a) What are the key features of a data warehouse? (CO 1)
b) What are the different OLAP operations? (CO 1)
c) Briefly explain cube materialization. (CO 1)
d) Explain the KDD process. (CO 1)
Q2
a) Discuss ROLAP, MOLAP, and HOLAP in detail. (CO 1)
b) Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit.
(i) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
(ii) Draw a schema diagram for the above data warehouse using one of the schema classes listed in (i).
(iii) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004?
(iv) To obtain the same list, write an SQL query assuming the data is stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge).
(5) (CO 1)
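
One possible way to check a query for part (iv) is sketched below in Python with the standard `sqlite3` module. It assumes the relation is named `fee` with the columns listed in the question; the helper name `total_fee_per_doctor` and the sample rows are hypothetical and serve only to make the sketch runnable.

```python
import sqlite3

# Hypothetical sample rows: (day, month, year, doctor, hospital, patient, count, charge).
SAMPLE_ROWS = [
    (1, 1, 2004, "Dr. A", "H1", "P1", 1, 100.0),
    (2, 1, 2004, "Dr. A", "H1", "P2", 1, 150.0),
    (3, 2, 2004, "Dr. B", "H2", "P1", 1, 200.0),
    (4, 3, 2005, "Dr. A", "H1", "P3", 1, 999.0),  # different year, must be excluded
]


def total_fee_per_doctor(rows, year=2004):
    """Return (doctor, total charge) pairs for the given year."""
    conn = sqlite3.connect(":memory:")
    # "count" is quoted because it is also the name of an SQL aggregate function.
    conn.execute(
        'CREATE TABLE fee (day INTEGER, month INTEGER, year INTEGER, '
        'doctor TEXT, hospital TEXT, patient TEXT, "count" INTEGER, charge REAL)'
    )
    conn.executemany("INSERT INTO fee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    query = (
        "SELECT doctor, SUM(charge) AS total_fee "
        "FROM fee WHERE year = ? GROUP BY doctor ORDER BY doctor"
    )
    result = conn.execute(query, (year,)).fetchall()
    conn.close()
    return result


print(total_fee_per_doctor(SAMPLE_ROWS))
```

In this sketch the WHERE clause plays the role of a slice on year = 2004, and GROUP BY doctor with SUM(charge) corresponds to rolling up time from day to year and patient to all.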
Q3
a) Generate the FP-tree for the following dataset (min_supp = 3). (5) (CO 1)
TID   Itemset
      f, a, c, d, g, m, p
      a, b, c, f, l, m, o
      b, f, h, o
      b, k, c, p
      a, f, c, l, p, m, n
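
A minimal sketch of the FP-tree construction for the five itemsets in part a) follows, assuming min_supp is an absolute count of 3. The function names are illustrative, ties between equally frequent items are broken alphabetically (an arbitrary choice that changes the tree shape but not the supports), and the header table with node links is omitted.

```python
from collections import Counter

TRANSACTIONS = [
    ["f", "a", "c", "d", "g", "m", "p"],
    ["a", "b", "c", "f", "l", "m", "o"],
    ["b", "f", "h", "o"],
    ["b", "k", "c", "p"],
    ["a", "f", "c", "l", "p", "m", "n"],
]
MIN_SUPP = 3  # absolute support count


def frequent_items(transactions, min_supp):
    """First scan: count each item once per transaction, keep the frequent ones."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: n for item, n in counts.items() if n >= min_supp}


def order_transaction(transaction, supports):
    """Drop infrequent items; sort by descending support, then alphabetically."""
    kept = [item for item in set(transaction) if item in supports]
    return sorted(kept, key=lambda item: (-supports[item], item))


def build_fp_tree(transactions, min_supp):
    """Second scan: insert each ordered transaction as a path from the root."""
    supports = frequent_items(transactions, min_supp)
    root = {"count": 0, "children": {}}
    for t in transactions:
        node = root
        for item in order_transaction(t, supports):
            child = node["children"].setdefault(item, {"count": 0, "children": {}})
            child["count"] += 1
            node = child
    return supports, root


def show(node, depth=0):
    """Print the tree with one 'item:count' node per line, indented by depth."""
    for item, child in node["children"].items():
        print("  " * depth + f"{item}:{child['count']}")
        show(child, depth + 1)


supports, tree = build_fp_tree(TRANSACTIONS, MIN_SUPP)
show(tree)
```

Under these assumptions the frequent items are f and c (support 4) and a, b, m, p (support 3), and the main branch of the printed tree is c:4, f:3, a:3, m:2, p:2.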
b) For the following transaction dataset, generate rules using the Apriori algorithm. Consider Support = 50% and Confidence = 75%. (5) (CO 1)
TID   Items Purchased
      Bread, Cheese, Egg, Juice
      Bread, Cheese, Juice
      Bread, Milk, Yoghurt
      Bread, Juice, Milk
      Cheese, Juice, Milk
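
The Apriori steps for part b) can be sketched as follows, assuming both thresholds are inclusive (support >= 50%, confidence >= 75%). The function names are illustrative, and the third item of the first transaction is taken to be Egg, an assumption that does not affect the result because that item occurs only once.

```python
from itertools import combinations

TRANSACTIONS = [
    {"Bread", "Cheese", "Egg", "Juice"},  # "Egg" is an assumed reading
    {"Bread", "Cheese", "Juice"},
    {"Bread", "Milk", "Yoghurt"},
    {"Bread", "Juice", "Milk"},
    {"Cheese", "Juice", "Milk"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_support):
    """Return {frequent itemset: support count}, built level by level."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        counted = {c: support_count(c, transactions) for c in level}
        current = {c: s for c, s in counted.items() if s / n >= min_support}
        frequent.update(current)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in current for s in combinations(c, k - 1))]
    return frequent


def rules(frequent, min_confidence):
    """Return sorted (antecedent, consequent, confidence) triples."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = supp / frequent[lhs]
                if conf >= min_confidence:
                    out.append((tuple(sorted(lhs)), tuple(sorted(itemset - lhs)), conf))
    return sorted(out)


frequent = apriori(TRANSACTIONS, 0.5)
for lhs, rhs, conf in rules(frequent, 0.75):
    print(lhs, "->", rhs, f"confidence={conf:.2f}")
```

With five transactions, 50% support means an itemset must occur in at least three of them, so only {Bread, Juice} and {Cheese, Juice} survive at the 2-itemset level and no 3-itemset candidate passes the prune step.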
Roll No. Total Pages: 3

003603
August/September 2022
B.Tech. (CE/CSE) VI SEMESTER
DATA MINING (PEC-CSD-601)

[Time: 3 Hours] [Max. Marks: 75]

Instructions:
1. It is compulsory to answer all the questions (1.5 marks each) of Part-A in short.
2. Answer any four questions from Part-B in detail.
3. Different sub-parts of a question are to be attempted adjacent to each other.

PART-A
1. (a) Define Data Cube. Give an example. (1.5)
(b) What is the role of meta data repository in data warehouse? (1.5)
(c) Give an example for snow-flake schema. (1.5)
(d) What is data cleaning? (1.5)
(e) Define confidence of an association rule. (1.5)
(f) Give an example for maximal frequent itemsets. (1.5)
(g) How effective are Bayesian classifiers? (1.5)
(h) What is the need of outlier detection? List two applications of it. (1.5)
(i) What are the objectives of clustering? (1.5)
(j) Differentiate frequent subsequence and frequent substructure. (1.5)

PART-B
2. (a) Describe 2-tier and 3-tier Architecture of Data Warehouse with a neat sketch. (10)
(b) Design Fact constellation table with suitable example. (5)
3. (a) What is the curse of dimensionality? How to reduce it? (5)
(b) With necessary diagrams and examples of data cubes explain various OLAP operations. (10)
4. Describe the various phases in knowledge discovery process with a neat diagram. (15)
5. (a) How will you solve a classification problem using Bayesian Belief Networks? (5)
(b) Apply FP-Growth algorithm to the following transactional data to find frequent itemsets. List all frequent itemsets with their support count. (10)

TID   List of item IDs
1     iLii5
2     2i4jsi8
3
4
5     i2i446jg
6     ili2iis
7
8
9
10    i1.i3i4å6

6. (a) Describe kNN Algorithm for data classification with appropriate example. (10)
(b) Discuss about key issues in Hierarchical clustering. (5)
7. Discuss the similarity measures and distance measures frequently used in clustering the data. (15)

003603/490/111/277
