Sppu Dsbda QP Nov - Dec - 2023
P-7545 [Total No. of Pages : 3
[6180]-53
T.E. (Computer Engineering)
DATA SCIENCE AND BIG DATA ANALYTICS
(2019 Pattern) (Semester - II) (310251)
Time : 2½ Hours] [Max. Marks : 70
3) Figures to the right side indicate full marks.
4) Assume suitable data if necessary.
5) Use of Scientific calculator is permitted.
Q1) a) Explain Data Analytics Cycle with suitable diagram and its phases. [8]
[9]
OR
Q2) a) List and explain the key roles for successful analytics project. [8]
i) Common Tools for the Model Building
Q3) a) List and explain the various types of analytics in Big data. [9]
b) Calculate the support and confidence values for all the possible item sets. [9]
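The transaction data for this question is not reproduced in this copy. As a minimal sketch of the two measures, assuming a small hypothetical set of transactions (the item names and the helper names `support` and `confidence` are illustrative, not taken from the paper), support and confidence could be computed as follows:

```python
from itertools import combinations

# HYPOTHETICAL transactions; the paper's own data set is not available here.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Enumerate every non-empty item set and print its support.
items = sorted(set().union(*transactions))
for size in range(1, len(items) + 1):
    for itemset in combinations(items, size):
        print(itemset, support(itemset, transactions))
```

Note that `confidence` divides by the support of the antecedent, so it is only defined for antecedents that occur in at least one transaction.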
OR
Q4) a) Explain the need of logistic regression along with its various types. [9]
b) Explain the following terms with suitable example. [9]
i) Removing Duplicates from dataset.
ii) Handling Missing Data
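As one possible illustration of the two terms above, the following sketch uses plain Python on a small hypothetical record list. The column names, the values, and the choice of mean imputation are assumptions for illustration only; dropping incomplete rows or filling with the median are equally valid strategies.

```python
from statistics import mean

# HYPOTHETICAL records; None marks a missing value.
rows = [
    {"id": 1, "age": 21},
    {"id": 2, "age": None},
    {"id": 1, "age": 21},  # exact duplicate of the first row
    {"id": 3, "age": 25},
]

def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def fill_missing_with_mean(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

deduped = drop_duplicates(rows)                  # 3 rows remain
cleaned = fill_missing_with_mean(deduped, "age")  # missing age becomes 23
```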
Q5) a) Suppose that, for the given data, the task is to cluster points (with (x, y)
representing location) into three clusters, where the points are A1(2, 10),
A2(2, 5), A3(8, 4), B1(5, 8), B2(7, 5), B3(6, 4), C1(1, 2), C2(4, 9). The
distance function is Euclidean distance. Suppose initially we assign A1,
Use the k-means algorithm to show only the first round of
execution with cluster centers.
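The statement of the initial assignment is cut off after A1 in this copy. The sketch below ASSUMES that A1, B1 and C1 are the three initial centers; that choice, and the helper name `kmeans_round`, are not confirmed by the paper. Under that assumption, one round of k-means (assign each point to its nearest center, then recompute each center as the mean of its members) could look like this:

```python
from math import dist  # Euclidean distance, Python 3.8+

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8),  "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2),  "C2": (4, 9),
}

# ASSUMPTION: initial centers are A1, B1 and C1 (truncated in the source).
centers = [points["A1"], points["B1"], points["C1"]]

def kmeans_round(points, centers):
    """Run one assignment step and one center-update step."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    new_centers = []
    for members, old in zip(clusters, centers):
        if members:
            xs = [points[m][0] for m in members]
            ys = [points[m][1] for m in members]
            new_centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
        else:
            new_centers.append(old)  # keep an empty cluster's center unchanged
    return clusters, new_centers

clusters, new_centers = kmeans_round(points, centers)
print(clusters)     # [['A1'], ['A3', 'B1', 'B2', 'B3', 'C2'], ['A2', 'C1']]
print(new_centers)  # [(2.0, 10.0), (6.0, 6.0), (1.5, 3.5)]
```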
b) Explain the following Text Analysis steps with suitable example [9]
i) Part-of-speech (POS) tagging
ii) Lemmatization
OR
Q6) a) Given the confusion matrix, calculate Accuracy, Precision, Recall, Error
                        Predicted classes
                        Yes    No
Actual classes   Yes
                 No
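The cell values of the confusion matrix are not reproduced in this copy. As a minimal sketch of the standard formulas, the function below takes the four cell counts as arguments. The counts in the usage lines are HYPOTHETICAL, the function name `classification_metrics` is illustrative, and reading the truncated "Error" as error rate (1 - accuracy) is an assumption.

```python
def classification_metrics(tp, fn, fp, tn):
    """Metrics from a 2x2 confusion matrix.

    tp: actual Yes, predicted Yes    fn: actual Yes, predicted No
    fp: actual No,  predicted Yes    tn: actual No,  predicted No
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "error_rate": 1 - accuracy,  # ASSUMED meaning of "Error"
    }

# HYPOTHETICAL counts, not the values from the paper.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```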
Q7) a) List a few data visualization tools and discuss any four applications of
data visualization along with the use of the various plots with Python/R
OR
Q8) a) Explain in detail the Hadoop Ecosystem with suitable diagram along with
the various components. [9]
b) Write a short note on the following. [9]
a) Map Reduce
b) Pig