Answers must be handwritten and uploaded to VTOP and MS Teams. Mark split-up: 5 + 5 = 10.
PART - A
Choose your case study according to the serial number shown in the following table,
and apply the following:
Choose a domain area in which to apply the various categories of analytics.
Formulate an analytical problem statement within the domain, with an elaborated
title, and identify the most suitable type of analytics for solving the
problem. Visualize the outcome with your own diagram and match each activity to
the corresponding stage of the data science life cycle.
QNO REGISTER NO
1 20MCA0002
2 20MCA0003
3 20MCA0005
4 20MCA0011
5 20MCA0014
6 20MCA0016
7 20MCA0023
8 20MCA0026
9 20MCA0033
10 20MCA0042
11 20MCA0053
12 20MCA0056
13 20MCA0064
14 20MCA0071
15 20MCA0073
16 20MCA0076
17 20MCA0080
18 20MCA0081
19 20MCA0082
20 20MCA0085
21 20MCA0086
22 20MCA0087
23 20MCA0088
24 20MCA0093
25 20MCA0095
26 20MCA0106
27 20MCA0107
28 20MCA0108
29 20MCA0113
30 20MCA0115
31 20MCA0119
32 20MCA0122
33 20MCA0123
34 20MCA0126
35 20MCA0128
36 20MCA0129
37 20MCA0132
38 20MCA0134
39 20MCA0135
40 20MCA0136
41 20MCA0143
42 20MCA0144
43 20MCA0146
44 20MCA0147
45 20MCA0155
46 20MCA0157
47 20MCA0164
48 20MCA0165
49 20MCA0167
50 20MCA0168
51 20MCA0174
52 20MCA0178
53 20MCA0182
54 20MCA0189
55 20MCA0193
56 20MCA0194
57 20MCA0204
58 20MCA0224
59 20MCA0225
60 20MCA0226
61 20MCA0227
62 20MCA0235
63 20MCA0239
64 20MCA0242
65 20MCA0243
66 20MCA0254
67 20MCA0258
68 20MCA0259
69 20MCA0263
70 20MCA0265
Question No. Topic
7. Diabetes Analytics
Choose the question number from the following list based on the last digit of your
registration number, then solve it and submit it. The question number must be the
same as the last digit of your registration number.
Question No: 0. Apply the PageRank algorithm and estimate the page rank of the following
using the MapReduce paradigm.
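As an illustration of how one PageRank iteration maps onto the map and reduce phases, here is a minimal sketch in plain Python. The three-page link graph, the damping factor of 0.85, and the function names are all assumptions made for the example; the graph given in the question is not reproduced here. The sketch also assumes every page has at least one out-link, so dangling pages are not handled.

```python
from collections import defaultdict

# Hypothetical link graph: page -> pages it links to (not the assignment's graph).
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
DAMPING = 0.85  # assumed damping factor


def pagerank_map(page, rank, out_links):
    """Map: pass the link structure along and give each out-link an equal share of rank."""
    yield page, ("links", out_links)
    for target in out_links:
        yield target, ("share", rank / len(out_links))


def pagerank_reduce(page, values, n_pages):
    """Reduce: sum the incoming shares and apply the damped PageRank formula."""
    links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            links = payload
        else:
            total += payload
    return page, (1 - DAMPING) / n_pages + DAMPING * total, links


def pagerank(graph, iterations=20):
    """Run a fixed number of map -> shuffle -> reduce rounds in memory."""
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        shuffled = defaultdict(list)  # simulated shuffle: group values by key
        for page, links in graph.items():
            for key, value in pagerank_map(page, ranks[page], links):
                shuffled[key].append(value)
        ranks = {}
        for page, values in shuffled.items():
            page, rank, _links = pagerank_reduce(page, values, n)
            ranks[page] = rank
    return ranks


ranks = pagerank(GRAPH)
for page in sorted(ranks):
    print(page, round(ranks[page], 4))
```

Each loop iteration corresponds to one MapReduce job; on a real cluster the output of the reducer would be written out and fed to the next job until the ranks stop changing.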
Question No: 1. Consider a problem that revolves around a movies dataset. The dataset contains
2 files, which are as follows:
Movies.txt - MovieID,Title,Genres
Ratings.txt - UserID-MovieID-Ratings-TimeStamp
Determine the top 3 most viewed movies, listed by movie name in ascending order,
using the MapReduce paradigm. Use your own sample data for the movies and their ratings.
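One possible way to structure this is sketched below in plain Python: a counting job over the ratings file, followed by a lookup of titles and a final sort. The sample records are invented for illustration, and the sketch assumes that "most viewed" means "has the most rating records" and that "ascending order" refers to the movie title. It also assumes titles contain no commas.

```python
from collections import defaultdict

# Hypothetical sample data in the two record formats described in the question.
MOVIES = [
    "1,Toy Story,Animation",
    "2,Jumanji,Adventure",
    "3,Heat,Action",
    "4,Casino,Drama",
]
RATINGS = [
    "10-1-5-978300760", "11-1-4-978302109", "12-1-3-978301968",
    "10-3-4-978300275", "11-3-5-978824291",
    "10-4-4-978302268", "12-4-2-978302039",
    "12-2-3-978300719",
]


def map_rating(line):
    """Map: emit (MovieID, 1) for every rating record."""
    _user_id, movie_id, _rating, _timestamp = line.split("-")
    yield movie_id, 1


def reduce_count(movie_id, ones):
    """Reduce: total number of rating records (views) per movie."""
    return movie_id, sum(ones)


def top_viewed(movies, ratings, k=3):
    shuffled = defaultdict(list)  # simulated shuffle: group values by MovieID
    for line in ratings:
        for key, value in map_rating(line):
            shuffled[key].append(value)
    counts = dict(reduce_count(movie_id, ones) for movie_id, ones in shuffled.items())

    # Title lookup, comparable to a small side table loaded by the final job.
    titles = {}
    for line in movies:
        movie_id, title, _genres = line.split(",", 2)
        titles[movie_id] = title

    top_ids = sorted(counts, key=lambda movie_id: counts[movie_id], reverse=True)[:k]
    return sorted((titles[movie_id], counts[movie_id]) for movie_id in top_ids)


top3 = top_viewed(MOVIES, RATINGS)
for title, views in top3:
    print(title, views)
```

With this sample data the counts are 3, 2, 2 and 1, so the top three are unambiguous; with ties at the cut-off, a tie-breaking rule would have to be chosen.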
Question No: 5. Assume that the payroll of each employee has been calculated for the
payment of the individual's salary.
The format of the record that stores the details of an employee is:
First Name,Last Name,Job Titles,Department,Full or Part-Time,Salary or
Hourly,Typical Hours,Annual Salary,Hourly Rate
The sample set of employees is given as follows:
Question No: 6. Let's say you work for a retailer that sells 100 different kinds of shoes.
There are dress shoes, hiking boots, sandals, etc. Using EDA, you are open to the fact
that any number of people might buy any number of different types of shoes. You
visualize the data using exploratory data analysis and find that most customers buy 1-3
different types of shoes.
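The kind of summary described here can be sketched as a count of distinct shoe types per customer, shown as a text histogram. The purchase records and names below are hypothetical and exist only to illustrate the idea.

```python
from collections import Counter, defaultdict

# Hypothetical (customer_id, shoe_type) purchase records.
PURCHASES = [
    ("c1", "sandals"), ("c1", "sandals"), ("c1", "hiking boots"),
    ("c2", "dress shoes"),
    ("c3", "sandals"), ("c3", "dress shoes"), ("c3", "hiking boots"),
    ("c4", "dress shoes"), ("c4", "loafers"), ("c4", "sandals"), ("c4", "hiking boots"),
]


def distinct_types_histogram(purchases):
    """Count how many customers bought 1, 2, 3, ... distinct shoe types."""
    types_by_customer = defaultdict(set)
    for customer, shoe_type in purchases:
        types_by_customer[customer].add(shoe_type)  # repeat purchases count once
    return Counter(len(types) for types in types_by_customer.values())


histogram = distinct_types_histogram(PURCHASES)
for n_types in sorted(histogram):
    print(f"{n_types} type(s): {'#' * histogram[n_types]}")
```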
Question No: 7. Calculate the similarity between any pair of users of a social media network
in order to detect communities of users. Solve this community detection problem using
MapReduce.
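The question does not name a similarity measure. The sketch below assumes Jaccard similarity over friend sets (shared friends divided by the size of the combined friend set) and simulates the map and reduce phases in plain Python. The friend lists and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical friendship data: user -> set of friends.
FRIENDS = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"ann", "cat"},
}


def map_invert(user, friends):
    """Map: emit (friend, user) so users sharing a friend meet at one reducer."""
    for friend in friends:
        yield friend, user


def reduce_pairs(friend, users):
    """Reduce: every pair of users grouped under one friend shares that friend."""
    for a, b in combinations(sorted(users), 2):
        yield (a, b), 1


def jaccard_similarities(friends):
    inverted = defaultdict(list)  # simulated shuffle for the first job
    for user, friend_set in friends.items():
        for key, value in map_invert(user, friend_set):
            inverted[key].append(value)

    common = defaultdict(int)  # second job: sum shared-friend counts per pair
    for friend, users in inverted.items():
        for pair, one in reduce_pairs(friend, users):
            common[pair] += one

    return {
        (a, b): shared / len(friends[a] | friends[b])
        for (a, b), shared in common.items()
    }


similarities = jaccard_similarities(FRIENDS)
for pair in sorted(similarities):
    print(pair, similarities[pair])
```

Pairs with a similarity above a chosen threshold could then be grouped into the same community; the threshold and the grouping rule are design choices left open by the question.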
Question No: 8. How do you perform join operations in MapReduce on different datasets by
applying a mapper-side join and a reducer-side join?
Faculty
Faculty_ID  Faculty_Name  Age
1001        Ramu          45
1002        Kumar         56
1003        Murugan       61
1004        Muthu         34
Workload
Faculty_ID  SubjectCode  SubName           Credits
1001        ITA6008      Cloud Computing   4
1001        ITA5008      Database          3
1002        ITA6009      Big Data          4
1002        CSE1007      Java Programming  3
1003        SWE2002      Data Mining       3
1004        SWE4002      Data Science      4
1004        SWE2002      Data Mining       3
1001        ITA6008      Cloud Computing   4
1002        Ita6009      Big data          4
The required output is each faculty member's name along with the number of times the
faculty member has handled a subject, and the total credits of the subjects handled by him/her.
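The two join styles can be contrasted with a small in-memory sketch in plain Python, using the Faculty and Workload rows from the tables above. The function names are hypothetical, and the sketch assumes that every workload row counts once, including repeated rows, since the question does not say whether duplicates should be merged.

```python
from collections import defaultdict

FACULTY = [(1001, "Ramu", 45), (1002, "Kumar", 56), (1003, "Murugan", 61), (1004, "Muthu", 34)]
WORKLOAD = [
    (1001, "ITA6008", "Cloud Computing", 4),
    (1001, "ITA5008", "Database", 3),
    (1002, "ITA6009", "Big Data", 4),
    (1002, "CSE1007", "Java Programming", 3),
    (1003, "SWE2002", "Data Mining", 3),
    (1004, "SWE4002", "Data Science", 4),
    (1004, "SWE2002", "Data Mining", 3),
    (1001, "ITA6008", "Cloud Computing", 4),
    (1002, "Ita6009", "Big data", 4),
]


def reduce_side_join(faculty, workload):
    """Mappers tag each record with its source table; the reducer joins per Faculty_ID."""
    shuffled = defaultdict(list)
    for fid, name, _age in faculty:
        shuffled[fid].append(("F", name))       # tagged Faculty record
    for fid, _code, _subject, credits in workload:
        shuffled[fid].append(("W", credits))    # tagged Workload record
    result = {}
    for fid, tagged in shuffled.items():
        names = [value for tag, value in tagged if tag == "F"]
        credits = [value for tag, value in tagged if tag == "W"]
        if names and credits:                   # inner join: both sides present
            result[names[0]] = (len(credits), sum(credits))
    return result


def map_side_join(faculty, workload):
    """The small Faculty table is held in memory by every mapper; the join happens in map."""
    lookup = {fid: name for fid, name, _age in faculty}
    shuffled = defaultdict(list)
    for fid, _code, _subject, credits in workload:
        if fid in lookup:
            shuffled[lookup[fid]].append(credits)  # already joined before the shuffle
    return {name: (len(credits), sum(credits)) for name, credits in shuffled.items()}


reduce_result = reduce_side_join(FACULTY, WORKLOAD)
map_result = map_side_join(FACULTY, WORKLOAD)
for name, (times, total_credits) in reduce_result.items():
    print(name, times, total_credits)
```

The map-side variant only works when one table is small enough to fit in each mapper's memory; the reduce-side variant has no such limit but moves both tables through the shuffle.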
Question No: 9. Find the shortest path between any start node and destination node of the
graph using the MapReduce pattern.
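One common way to express this in MapReduce is iterative edge relaxation: each round, every reached node proposes a distance to its neighbours, and the reducer keeps the minimum. The sketch below simulates this in plain Python on a hypothetical weighted graph, since the graph from the question is not reproduced here. It assumes non-negative weights and that every neighbour also appears as a key of the graph, and it returns distances only.

```python
import math
from collections import defaultdict

# Hypothetical weighted directed graph: node -> {neighbour: edge weight}.
GRAPH = {"S": {"A": 1, "B": 4}, "A": {"B": 2, "D": 6}, "B": {"D": 1}, "D": {}}


def sp_map(node, dist, edges):
    """Map: keep the node's own distance and propose distances to its neighbours."""
    yield node, dist
    if dist < math.inf:  # unreached nodes have nothing to propose yet
        for neighbour, weight in edges.items():
            yield neighbour, dist + weight


def sp_reduce(node, candidates):
    """Reduce: the best known distance is the minimum of all proposals."""
    return node, min(candidates)


def shortest_distances(graph, source):
    dist = {node: math.inf for node in graph}
    dist[source] = 0
    for _ in range(len(graph) - 1):      # at most |V| - 1 rounds are needed
        shuffled = defaultdict(list)     # simulated shuffle: group proposals by node
        for node, edges in graph.items():
            for key, value in sp_map(node, dist[node], edges):
                shuffled[key].append(value)
        new_dist = dict(sp_reduce(node, c) for node, c in shuffled.items())
        if new_dist == dist:             # converged: no distance improved
            break
        dist = new_dist
    return dist


distances = shortest_distances(GRAPH, "S")
for node in sorted(distances):
    print(node, distances[node])
```

To recover the actual path rather than just its length, each proposal could also carry the proposing node as a predecessor, and the reducer would keep the predecessor belonging to the minimum distance.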