
Total No. of Questions : 5] SEAT No.
P6989 [Total No. of Pages : 5

[5865] - 302
M.C.A. (Management)
IT 32: DATA WAREHOUSING AND DATA MINING
(2020 Pattern) (Semester - III)

Time : 2½ Hours] [Max. Marks : 50

Instructions to the candidates:
1) All questions are compulsory.
2) Draw neat & labelled diagram wherever necessary.

Q1) Answer the following multiple choice questions. [20×½=10]


i) Identify the incorrect statement about K-means clustering.
a) K-means clustering is a method of vector quantization.
b) K-means is the same as K-nearest neighbour.
c) K-means clustering aims to partition 'n' observations into K clusters.
d) K-means clustering produces the final estimate of cluster centroids.

ii) How many tiers does a data warehouse architecture have?
a) 2 b) 1
c) 3 d) 4

iii) _____ is an intermediate storage area used for data processing in the ETL process of data warehousing.
a) Buffer b) Virtual memory
c) Staging area d) Inter storage area

iv) ______ is a good alternative to the star schema.
a) Star schema b) Snowflake schema
c) Fact constellation d) Star-snowflake schema

v) In a snowflake schema, which of the following types of tables are considered?
a) Fact b) Dimension
c) Both fact and dimension d) None of the mentioned

vi) The role of ETL is to ______.
a) Find erroneous data
b) Fix erroneous data
c) Both find and fix erroneous data
d) Filter the data source

vii) _____ is a data transformation process.
a) Comparison b) Projection
c) Selection d) Filtering

viii) OLTP stands for ______.
a) Online Transaction Protocol
b) Online Transaction Processing
c) Online Terminal Protocol
d) Online Terminal Processing

ix) An approach in which the aggregated totals are stored in a multidimensional database while the detailed data is stored in the relational database is a ______
a) MOLAP b) ROLAP
c) HOLAP d) OLAP

x) Summary of data from an OLAP can be presented in ______.
a) Normalization b) Primary keys
c) Pivot Table d) Foreign keys

xi) Efficiency and scalability of data mining algorithms are related to ______.
a) Mining methodology b) User interaction
c) Diverse data types d) None of the mentioned

xii) Strategic value of data mining is ______.
a) Cost-sensitive b) Work-sensitive
c) Time-sensitive d) Technical-sensitive

xiii) If the ETL process fetches the data separately from the host server during the automatic load to the data warehouse, one of the challenges involved is ______
a) the associated network may be down
b) it may end up pulling the incomplete/incorrect file.
c) it may end up connecting to an incorrect host server.
d) None

xiv) Web mining helps to improve the power of web search engines by identifying _______
a) Web pages and classifying the web documents
b) XML documents
c) Text documents
d) Database

xv) ______ is achieved by splitting the text at white spaces.
a) Text cleanup b) Tokenization
c) Speech tagging d) Text transformation

xvi) Assigning data to one of the clusters and recomputing the centroid are the steps in which algorithm?
a) Apriori algorithm b) Bayesian classification
c) FP tree algorithm d) K-means

xvii) A disadvantage of the KNN algorithm is that it takes _____
a) More time for training
b) More time for testing
c) Equal time for training
d) Equal time for testing

xviii) Which is not a characteristic of a Data warehouse?
a) Volatile
b) Subject oriented
c) Non volatile
d) Time variant

xix) What is the first stage of the Kimball Life Cycle diagram?
a) Requirement Definition b) Dimensional Modelling
c) ETL Design Development d) Maintenance

xx) A connected region of a multidimensional space with a comparatively high density of objects.
a) Clustering b) Association
c) Classification d) Subset

Q2) a) What is a Data warehouse? Explain the need and characteristics of Data warehouse. [5]

b) Explain the schemas of Data warehouse. [5]

OR

a) Explain Kimball Life Cycle diagram in detail. [5]

b) What is a Data warehouse? Explain the properties of Data warehouse architecture. [5]

Q3) a) What is ETL? Explain data preprocessing techniques in detail. [6]

b) What is OLAP? Describe the characteristics of OLAP. [4]

OR

a) Describe ETL. What are the tasks to be performed during data transformation? [6]

b) What are the basic operations of OLAP? [4]



Q4) a) What is Data mining? Explain the architecture of Data mining. [3]

b) Apply the FP Tree Algorithm to construct the FP Tree and find the frequent itemsets for the dataset given below (minimum support = 30%) [7]

Transaction ID    List of Products
1                 Apple, Berries, Coconut
2                 Berries, Coconut, Dates
3                 Coconut, Dates
4                 Berries, Dates
5                 Apple, Coconut
6                 Apple, Coconut, Dates
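
The FP-tree construction asked for above can be sketched in Python as follows. This is a minimal illustration, not a model answer: the tie-breaking rule between equally frequent items (alphabetical) and all function names are assumptions, and the frequent itemsets are cross-checked by brute-force counting rather than by a full FP-growth mining step.

```python
from collections import Counter
from itertools import combinations

# Transactions copied from the question's table.
transactions = [
    {"Apple", "Berries", "Coconut"},
    {"Berries", "Coconut", "Dates"},
    {"Coconut", "Dates"},
    {"Berries", "Dates"},
    {"Apple", "Coconut"},
    {"Apple", "Coconut", "Dates"},
]
MIN_SUPPORT = 0.30
# 30% of 6 transactions is 1.8, so an itemset must occur in at least 2.
min_count = MIN_SUPPORT * len(transactions)

# Pass 1: support count of every single item; keep the frequent ones.
item_counts = Counter(item for t in transactions for item in t)
frequent_items = {i: c for i, c in item_counts.items() if c >= min_count}


def ordered(transaction):
    """Frequent items of a transaction, most frequent first.

    Ties are broken alphabetically (an assumption; the question gives no rule).
    """
    kept = [i for i in transaction if i in frequent_items]
    return sorted(kept, key=lambda i: (-frequent_items[i], i))


def build_fp_tree(rows):
    """Pass 2: insert each ordered transaction into a prefix tree.

    Each node is stored as item -> [count, children].
    """
    root = {}
    for row in rows:
        node = root
        for item in ordered(row):
            entry = node.setdefault(item, [0, {}])
            entry[0] += 1
            node = entry[1]
    return root


def show(node, depth=0):
    """Print the tree, using indentation for parent/child links."""
    for item, (count, children) in node.items():
        print("  " * depth + f"{item}:{count}")
        show(children, depth + 1)


def frequent_itemsets(rows):
    """Brute-force cross-check (not FP-growth): count every candidate itemset."""
    items = sorted(frequent_items)
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for row in rows if set(combo) <= row)
            if count >= min_count:
                found[frozenset(combo)] = count
    return found


fp_tree = build_fp_tree(transactions)
frequent = frequent_itemsets(transactions)

show(fp_tree)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With the alphabetical tie-break, the item order is Coconut, Dates, Apple, Berries. A different tie-break changes the shape of the tree but not the set of frequent itemsets.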



OR

a) Explain data mining techniques in brief. [3]

b) How does the KNN algorithm work? Apply the KNN classification algorithm to the given dataset and predict the class for X(P1 = 3, P2 = 7) (K = 3). [7]

P1    P2    Class
7     7     False
7     4     False
3     4     True
1     4     True
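
A minimal Python sketch of the KNN steps for this dataset is given below. It assumes Euclidean distance and a simple majority vote, neither of which the question states explicitly; the names `training` and `knn_predict` are illustrative only.

```python
import math
from collections import Counter

# (P1, P2) points and their class labels, copied from the question's table.
training = [
    ((7, 7), "False"),
    ((7, 4), "False"),
    ((3, 4), "True"),
    ((1, 4), "True"),
]


def knn_predict(query, data, k):
    """Classify `query` by majority vote among its k nearest neighbours.

    Distance is Euclidean (an assumption; the question names no metric).
    """
    ranked = sorted(data, key=lambda row: math.dist(query, row[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


query = (3, 7)
for point, label in training:
    print(point, label, round(math.dist(query, point), 3))

prediction = knn_predict(query, training, k=3)
print("Predicted class:", prediction)
```

The three nearest neighbours of X are (3, 4), (1, 4) and (7, 7), so two of the three votes are for True.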

Q5) a) What is text mining? Explain the process of text mining. [4]

b) Explain the K-means algorithm. Apply the K-means algorithm to group the visitors to a website into two groups using their age as follows:
15, 16, 19, 20, 21, 28, 35, 40, 42, 44, 60, 65
(Consider initial centroids 16 and 28 for the two groups) [6]
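
The iterations for this data can be traced with a short Python sketch. It assumes absolute difference as the distance on one-dimensional ages and stops when the centroids no longer change; `kmeans_1d` is an illustrative name, not a library function.

```python
ages = [15, 16, 19, 20, 21, 28, 35, 40, 42, 44, 60, 65]
initial_centroids = [16, 28]  # given in the question


def kmeans_1d(points, centroids, max_iter=100):
    """One-dimensional K-means: assign each point to the nearest centroid,
    recompute each centroid as its cluster mean, repeat until stable."""
    centroids = list(centroids)
    clusters = []
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        print(f"Iteration {iteration}: clusters={clusters}, "
              f"centroids={[round(u, 2) for u in updated]}")
        if updated == centroids:
            break
        centroids = updated
    return clusters, centroids


clusters, centroids = kmeans_1d(ages, initial_centroids)
```

Starting from 16 and 28, the age 28 moves from the second group to the first in the second iteration, after which the assignment is stable.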



OR

a) Apply the K-means algorithm to the given data set D = {2, 3, 4, 10, 11, 12, 20, 25, 30}, where K is the number of clusters and K = 2. [6]
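
Since this question gives no initial centroids, any worked answer has to pick them. The Python sketch below tries two assumed starting pairs, (2, 3) and (2, 30), and also reports the within-cluster sum of squared errors. The starting values and the helper names `kmeans_1d` and `sse` are assumptions made for illustration.

```python
D = [2, 3, 4, 10, 11, 12, 20, 25, 30]


def kmeans_1d(points, centroids, max_iter=100):
    """Plain one-dimensional K-means using absolute difference as distance."""
    centroids = [float(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return clusters, centroids


def sse(clusters, centroids):
    """Within-cluster sum of squared distances to each centroid."""
    return sum((p - c) ** 2
               for cluster, c in zip(clusters, centroids)
               for p in cluster)


# Two assumed initialisations; the question does not specify one.
results = {}
for start in [(2, 3), (2, 30)]:
    clusters, centroids = kmeans_1d(D, start)
    results[start] = (clusters, centroids, sse(clusters, centroids))
    print(start, clusters, centroids, results[start][2])
```

For this data both starting pairs settle on the same partition, but K-means in general can converge to different clusters from different initial centroids.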



b) What are the different types of web mining? [4]


