Be Computer Engineering Semester 7 2023 May Big Data Analysis Rev 2019 C Scheme
Paper / Subject Code: 42172 / BIG DATA ANALYTICS
Time: 3 Hours Marks: 80
Note: 1. Question 1 is compulsory
2. Answer any three out of the remaining five questions.
3. Assume any suitable data wherever required and justify the same.
Q1 a) Distinguish between Name node and Data node. [5]
b) List and explain the core business drivers behind the NoSQL movement. [5]
c) Mention four characteristics of big data. Elaborate these characteristics with respect to social media websites. [5]
d) List and explain the different issues and challenges in data stream query processing. [5]
Q2 a) What is a key-value store? What are the benefits of using a key-value store? [10]
b) Write a map reduce pseudo code to multiply two matrices. Apply map reduce [10]

    | 1  2 |       | 6  7 |
    | 3  4 |   X   | 8  9 |
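The one-step map reduce formulation of matrix multiplication can be sketched as below. This is a minimal illustration, not a model answer. It assumes the operands are A = [[1, 2], [3, 4]] and B = [[6, 7], [8, 9]], as read from the layout of the question. The function names (`map_phase`, `shuffle`, `reduce_phase`, `multiply`) are hypothetical, and the shuffle step is simulated in memory rather than run on a cluster.

```python
from collections import defaultdict

# Operands as read from the question layout (an assumption about the source).
A = [[1, 2], [3, 4]]
B = [[6, 7], [8, 9]]


def map_phase(A, B):
    """Emit ((i, k), (matrix_name, j, value)) for every matrix element.

    Each A[i][j] is sent to every output cell (i, k) it contributes to,
    and each B[j][k] is sent to every output cell (i, k) likewise.
    """
    n_rows_a, n_cols_b = len(A), len(B[0])
    for i, row in enumerate(A):
        for j, a_ij in enumerate(row):
            for k in range(n_cols_b):
                yield (i, k), ("A", j, a_ij)
    for j, row in enumerate(B):
        for k, b_jk in enumerate(row):
            for i in range(n_rows_a):
                yield (i, k), ("B", j, b_jk)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair A and B values on the shared index j and sum the products."""
    a_vals = {j: v for name, j, v in values if name == "A"}
    b_vals = {j: v for name, j, v in values if name == "B"}
    return key, sum(a_vals[j] * b_vals[j] for j in a_vals if j in b_vals)


def multiply(A, B):
    """Run map, shuffle, and reduce, then assemble the result matrix."""
    groups = shuffle(map_phase(A, B))
    result = [[0] * len(B[0]) for _ in range(len(A))]
    for key, values in groups.items():
        (i, k), total = reduce_phase(key, values)
        result[i][k] = total
    return result


print(multiply(A, B))  # [[22, 25], [50, 57]]
```

Each reducer key (i, k) corresponds to one cell of the product, so for the 2 x 2 operands above there are four reducer calls, each receiving two A values and two B values.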
Q3 a) Suppose the stream is S = {2, 1, 6, 1, 5, 9, 2, 3, 5}. Let the hash function be h(x) = (ax + b) mod 16 for some a and b, and treat the result as a 4-bit binary integer. Show how the Flajolet-Martin algorithm will estimate the number of distinct elements for h(x) = (4x + 1) mod 16. [10]
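The Flajolet-Martin procedure for this stream can be traced with a short sketch. It is an illustration under stated assumptions, not a model answer. It uses the basic single-hash variant, which estimates the distinct count as 2^R, where R is the largest number of trailing zeros seen in any hash value. The function names are hypothetical, and a hash value of 0 is treated as having 4 trailing zeros, which is one common convention (some texts use 0 instead).

```python
def trailing_zeros(value, bits=4):
    """Count trailing zero bits of value, read as a bits-wide integer."""
    if value == 0:
        return bits  # assumed convention for an all-zero hash value
    count = 0
    while value & 1 == 0:
        value >>= 1
        count += 1
    return count


def flajolet_martin(stream, a, b, modulus=16, bits=4):
    """Estimate the number of distinct elements as 2**R with one hash function."""
    max_r = 0
    for x in stream:
        h = (a * x + b) % modulus
        r = trailing_zeros(h, bits)
        print(f"x={x}  h(x)={h:2d}  binary={h:0{bits}b}  trailing zeros={r}")
        max_r = max(max_r, r)
    return 2 ** max_r


S = [2, 1, 6, 1, 5, 9, 2, 3, 5]
estimate = flajolet_martin(S, a=4, b=1)
print("estimate:", estimate)  # estimate: 1
```

With h(x) = (4x + 1) mod 16 every hash value is odd, so R = 0 and the estimate is 2^0 = 1, although the stream contains 6 distinct elements. This shows how strongly a single hash function can bias the estimate.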
1   11   1   56
2   12   2   75
3   13   1   48
4   14   2   69
5   15   1   84
6   16   2   53
the output.
ii. Create a subset where the course column is less than 3 or the class equals
Q4 a) Explain the natural join and the grouping and aggregation relational algebra operations using MapReduce. [10]
b) With a neat sketch, explain the architecture of the data-stream management system. [10]
30013 Page 1 of 2
Q5 a)
b)
Q6 a)
b)

Determine communities for the given social network graph using Girvan-Newman algorithm. [10]

different days:
Product: Bread, Milk, Cola Cans, Chocolate bars, Detergent
Monday Tuesday Wednesday Thursday Friday
Values by day (row order not preserved):
Monday: 5, 6, 10, 21, 12
Tuesday: 8, 7, 1, 3, 27
Wednesday: 4, 5, 12, 33, 18
Thursday: 6, 11, 20, 13, 20
Friday: 9, 23, 12, 12, 15

ii. Name and explain the operators used to form data subsets in R.
users.
[10]
[10]
[10]
_____________________