
SLOT – D1+D2

SCHOOL OF INFORMATION TECHNOLOGY AND ENGINEERING


DIGITAL ASSIGNMENT – I - SUMMER SEMESTER 2020-2021
Programme Name & Branch: MCA Course Name: Big Data Analytics
Course Code: ITA6009

Answers must be handwritten and uploaded to VTOP and MS Teams. Mark split-up: 5 + 5 = 10

Uploading File Name : Reg.No_Questionno_PARTA_Questionno_PARTB

PART - A

Using the serial number (QNO) listed against your register number in the following table,
choose your case study topic and do the following:

Choose an area within the domain to which the various categories of analytics can be
applied. Formulate an analytical problem statement within the domain, with an elaborated
title, and identify the most suitable type of analytics for solving the problem. Visualize
the outcome with your own diagram and map each activity to the corresponding stage of the
data science life cycle.

QNO REGISTER NO
1 20MCA0002
2 20MCA0003
3 20MCA0005
4 20MCA0011
5 20MCA0014
6 20MCA0016
7 20MCA0023
8 20MCA0026
9 20MCA0033
10 20MCA0042
11 20MCA0053
12 20MCA0056
13 20MCA0064
14 20MCA0071
15 20MCA0073
16 20MCA0076
17 20MCA0080
18 20MCA0081
19 20MCA0082
20 20MCA0085
21 20MCA0086
22 20MCA0087
23 20MCA0088
24 20MCA0093
25 20MCA0095
26 20MCA0106
27 20MCA0107
28 20MCA0108
29 20MCA0113
30 20MCA0115
31 20MCA0119
32 20MCA0122
33 20MCA0123
34 20MCA0126
35 20MCA0128
36 20MCA0129
37 20MCA0132
38 20MCA0134
39 20MCA0135
40 20MCA0136
41 20MCA0143
42 20MCA0144
43 20MCA0146
44 20MCA0147
45 20MCA0155
46 20MCA0157
47 20MCA0164
48 20MCA0165
49 20MCA0167
50 20MCA0168
51 20MCA0174
52 20MCA0178
53 20MCA0182
54 20MCA0189
55 20MCA0193
56 20MCA0194
57 20MCA0204
58 20MCA0224
59 20MCA0225
60 20MCA0226
61 20MCA0227
62 20MCA0235
63 20MCA0239
64 20MCA0242
65 20MCA0243
66 20MCA0254
67 20MCA0258
68 20MCA0259
69 20MCA0263
70 20MCA0265

Question No. Topic

1. Demand prediction of driver availability

2. Loan Eligibility Prediction

3. Log Analytics Solution

4. Customer Churn Analytics

5. Healthcare Disease Analytics

6. Heart failure Prediction

7. Diabetes Analytics

8. Kidney Disease Prediction

9. UBER/OLA Travel Analytics

10. Airways Travel Analytics

11. Bus Transport Analytics

12. Courier Service Analytics

13. Railway Analytics

15. Navy Analytics

16. Fisheries Analytics

17. Animal Zoo Analytics

18. PET animal Analytics

19. Bird (Vedanthankal) Analytics

20. Electricity Consumer Analytics

21. Electricity Distribution Analytics

22. Wind power Analytics


23. Solar Power Analytics

24. Thermal Power Analytics

25. Water Resource Analytics

26. Earth Science Analytics

27. Climate Analytics

28. Weather Data Analytics

29. Rain Fall Analytics

30. Sky Cloud Pattern analytics

31. Earthquake Analytics

32. Tsunami Analytics

33. Storm/Cyclone Analytics

34. Insects Analytics (Mosquito, Honey bee …)

35. Plant / Tree Analytics

36. Agricultural Yield Analytics

37. Agricultural Soil / Land Analysis

38. Fruits Analytics

39. Manufacturing / Machine Fault Detection by Analytics

40. Measuring and analysing the performance of Motor / Engine

41. Building Strength Analytics

42. Toll-Gate Analytics

43. Highway Bridge Strength Analytics

44. GST Tax Analytics

45. GDP of Country Analytics

46. Income Tax Analytics

47. Lorry Service Analytics


48. Cloud service analytics

49. Crime Analytics

50. Traffic (Police) Analytics

51. Smart City Traffic Congestion Analytics

52. Analytics for Car Showrooms

53. Student Online class Analytics (Learning Analytics)

54. Course Preference Analytics among Student/Teacher

55. IoT Data Analytics

56. Blockchain Data Analytics

57. Cyber Analytics

58. TV Shows / TRP Analytics

59. Indian Movie – Box Office Analytics

60. Child Labour Analytics

61. Indian Penal Court Law Analytics

62. Driving Licence Analytics / RTO Analytics

63. Kidney Dialysis Analytics

64. Road Accident Analytics

65. Analytics for Two-Wheeler Sales Agency

66. Social Media Network Analytics

67. GAS Consumer Analytics

68. Family / Ration Card Analytics

69. Ready Made Garment Analytics

70. Petrol /Diesel Analytics


PART B.

Choose the question number from the following list that matches the last digit of your
registration number, then solve it and submit it. The question number must be the same as
the last digit of your registration number.

Question No: 0. Apply the PageRank algorithm and estimate the page rank of the following
using the MapReduce paradigm.

Question No: 1. Consider a problem based on a movies dataset. The dataset contains 2
files, as follows:

Movies.txt - MovieID,Title,Genres

Ratings.txt - UserID-MovieID-Ratings-TimeStamp

Determine the top 3 most viewed movies, with their movie names in ascending order,
using the MapReduce paradigm. Use your own sample data for movies and their ratings.

Question No: 2. Suppose there are two separate data sets, customer_details and
transaction_details, which contain the details of customers and the transaction records of
the customers respectively. To find the total amount spent by each customer on the products
they buy at a supermarket, write two mappers, one for each data set.

customer_details:
Cust ID   Name        Phone No
100001    Rohith      1234321543
100002    Ankith      5498123433
100003    Pradumnya   4352313344
………..     ………….       …………….

transaction_details:
Trans ID  Product     Cust ID  Amount
000001    Soap        100001   100
000002    Chocolate   100001   50
000003    Ice Cream   100002   200
000004    Milk        100003   125
000005    Cheese      100003   400

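
One way to picture the two-mapper approach is a reduce-side join, simulated here in plain Python rather than on a Hadoop cluster. This is only a sketch under stated assumptions: the function names (customer_mapper, transaction_mapper, reducer) and the "cust"/"txn" tags are hypothetical, and only the rows listed in the two tables above are used.

```python
from collections import defaultdict

# Rows copied from the customer_details and transaction_details tables.
CUSTOMERS = [
    ("100001", "Rohith", "1234321543"),
    ("100002", "Ankith", "5498123433"),
    ("100003", "Pradumnya", "4352313344"),
]
TRANSACTIONS = [
    ("000001", "Soap", "100001", 100),
    ("000002", "Chocolate", "100001", 50),
    ("000003", "Ice Cream", "100002", 200),
    ("000004", "Milk", "100003", 125),
    ("000005", "Cheese", "100003", 400),
]

def customer_mapper(record):
    """Emit (Cust ID, tagged name) for one customer_details row."""
    cust_id, name, _phone = record
    yield cust_id, ("cust", name)

def transaction_mapper(record):
    """Emit (Cust ID, tagged amount) for one transaction_details row."""
    _trans_id, _product, cust_id, amount = record
    yield cust_id, ("txn", amount)

def reducer(cust_id, values):
    """Merge the tagged values for one customer into (name, total amount)."""
    name, total = None, 0
    for tag, value in values:
        if tag == "cust":
            name = value
        else:
            total += value
    return name, total

# Shuffle step: group every mapper output by its key (Cust ID).
grouped = defaultdict(list)
for record in CUSTOMERS:
    for key, value in customer_mapper(record):
        grouped[key].append(value)
for record in TRANSACTIONS:
    for key, value in transaction_mapper(record):
        grouped[key].append(value)

totals = {cid: reducer(cid, vals) for cid, vals in sorted(grouped.items())}
for cid, (name, total) in totals.items():
    print(cid, name, total)
```

Tagging each value with its source lets a single reducer tell the customer record apart from the transaction amounts that share the same Cust ID.
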
Question No: 3. Visualize the word count MapReduce process for the following lines of
input.

Hello wordcount MapReduce Hadoop program.
This is my first MapReduce program.
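
The map, shuffle, and reduce stages of word count can be traced with a small in-memory simulation. The sketch below is illustrative rather than a Hadoop job; it assumes that punctuation is stripped and that words are case-sensitive, and the helper names (map_phase, shuffle, reduce_phase) are hypothetical.

```python
import re
from collections import defaultdict

LINES = [
    "Hello wordcount MapReduce Hadoop program.",
    "This is my first MapReduce program.",
]

def map_phase(line):
    """Map: emit (word, 1) for every word in one input line."""
    for word in re.findall(r"[A-Za-z]+", line):
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(word, counts):
    """Reduce: sum the list of ones collected for a word."""
    return word, sum(counts)

mapped = [pair for line in LINES for pair in map_phase(line)]
grouped = shuffle(mapped)
word_counts = dict(reduce_phase(w, c) for w, c in grouped.items())

for word in sorted(word_counts):
    print(word, word_counts[word])
```

Printing mapped and grouped as well gives the intermediate key-value pairs needed for a stage-by-stage diagram.
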
Question No: 4. Let us consider the matrix multiplication example to visualize MapReduce.
Consider the following matrix

Question No: 5. Assume that the payroll of the employees has been calculated for the
payment of salary to each individual.
The format of the record that stores the details of an employee is:
First Name,Last Name,Job Titles,Department,Full or Part-Time,Salary or
Hourly,Typical Hours,Annual Salary,Hourly Rate
The sample set of employees is given as follows:

dubert,tomasz ,paramedic i/c,fire,f,salary,,91080.00,
edwards,tim p,lieutenant,fire,f,salary,,114846.00,
elkins,eric j,sergeant,police,f,salary,,104628.00,
estrada,luis f,police officer,police,f,salary,,96060.00,
ewing,marie a,clerk iii,police,f,salary,,53076.00,
finn,sean p,firefighter,fire,f,salary,,87006.00,
fitch,jordan m,law clerk,law,f,hourly,35,,14.51
Apply the MapReduce pattern to list the maximum and minimum salary for each
category of salary.
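
A possible shape for this job is sketched below as an in-memory simulation. It rests on one interpretation that the question does not confirm: the "category of salary" is taken to be the "Salary or Hourly" column, with Annual Salary used for salaried records and Hourly Rate for hourly ones. The mapper and reducer names are hypothetical.

```python
import csv
from collections import defaultdict

# Sample records copied from the question.
RECORDS = """\
dubert,tomasz ,paramedic i/c,fire,f,salary,,91080.00,
edwards,tim p,lieutenant,fire,f,salary,,114846.00,
elkins,eric j,sergeant,police,f,salary,,104628.00,
estrada,luis f,police officer,police,f,salary,,96060.00,
ewing,marie a,clerk iii,police,f,salary,,53076.00,
finn,sean p,firefighter,fire,f,salary,,87006.00,
fitch,jordan m,law clerk,law,f,hourly,35,,14.51
"""

def mapper(row):
    """Emit (category, amount); column 5 is 'Salary or Hourly' (assumed key)."""
    category = row[5].strip()
    # Column 7 is Annual Salary, column 8 is Hourly Rate.
    amount = row[7] if category == "salary" else row[8]
    if amount:
        yield category, float(amount)

def reducer(category, amounts):
    """Return (maximum, minimum) of all amounts seen for one category."""
    return max(amounts), min(amounts)

# Shuffle step: group mapper output by category.
grouped = defaultdict(list)
for row in csv.reader(RECORDS.splitlines()):
    for key, value in mapper(row):
        grouped[key].append(value)

results = {cat: reducer(cat, vals) for cat, vals in grouped.items()}
for cat, (highest, lowest) in results.items():
    print(cat, highest, lowest)
```

If the intended category is instead the department, only the key column in mapper needs to change (column 3 in this layout).
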

Question No: 6. Let's say you work for a retailer that sells 100 different kinds of shoes:
dress shoes, hiking boots, sandals, etc. Using EDA, you are open to the fact
that any number of people might buy any number of different types of shoes. You
visualize the data using exploratory data analysis and find that most customers buy 1-3
different types of shoes.

Question No: 7. Calculate the similarity between any pair of users of the social media network
to detect the community of the users. Solve the above community detection problem using
MapReduce.

Question No: 8. How do you perform join operations in MapReduce on different datasets by
applying a map-side join and a reduce-side join?

Faculty
Faculty_ID  Faculty_Name  Age
1001        Ramu          45
1002        Kumar         56
1003        Murugan       61
1004        Muthu         34

Workload
Faculty_ID  SubjectCode  SubName           Credits
1001        ITA6008      Cloud Computing   4
1001        ITA5008      Database          3
1002        ITA6009      Big Data          4
1002        CSE1007      Java Programming  3
1003        SWE2002      Data Mining       3
1004        SWE4002      Data Science      4
1004        SWE2002      Data Mining       3
1001        ITA6008      Cloud Computing   4
1002        Ita6009      Big data          4

Determine the following using these joins:

• The faculty member's name along with the number of times he/she has handled a subject.
• The total credits of the subjects handled by him/her.
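
A map-side join for these two tables might look like the following in-memory sketch. It assumes that the small Faculty table fits in memory as a lookup dictionary, that every Workload row counts as one occasion of handling a subject (repeated rows included), and that the counts are wanted per faculty member; the names mapper, reducer, and summary are hypothetical.

```python
from collections import defaultdict

# Faculty table held in memory as the lookup side of the map-side join.
FACULTY = {"1001": "Ramu", "1002": "Kumar", "1003": "Murugan", "1004": "Muthu"}

# Workload rows copied from the question, in the order given.
WORKLOAD = [
    ("1001", "ITA6008", "Cloud Computing", 4),
    ("1001", "ITA5008", "Database", 3),
    ("1002", "ITA6009", "Big Data", 4),
    ("1002", "CSE1007", "Java Programming", 3),
    ("1003", "SWE2002", "Data Mining", 3),
    ("1004", "SWE4002", "Data Science", 4),
    ("1004", "SWE2002", "Data Mining", 3),
    ("1001", "ITA6008", "Cloud Computing", 4),
    ("1002", "Ita6009", "Big data", 4),
]

def mapper(record, lookup):
    """Join one Workload row with Faculty in the mapper; emit (name, (1, credits))."""
    faculty_id, _code, _subject, credits = record
    if faculty_id in lookup:
        yield lookup[faculty_id], (1, credits)

def reducer(name, values):
    """Return (times handled, total credits) for one faculty member."""
    times = sum(count for count, _ in values)
    total_credits = sum(credits for _, credits in values)
    return times, total_credits

# Shuffle step: group the joined records by faculty name.
grouped = defaultdict(list)
for record in WORKLOAD:
    for key, value in mapper(record, FACULTY):
        grouped[key].append(value)

summary = {name: reducer(name, vals) for name, vals in grouped.items()}
for name, (times, total_credits) in summary.items():
    print(name, times, total_credits)
```

A reduce-side version would instead emit tagged records from both tables keyed by Faculty_ID and merge them inside the reducer, which avoids holding either table in memory at the cost of shuffling both.
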

Question No: 9. Find the shortest path between any start and destination of the graph using
the MapReduce pattern.