Teaching Scheme and Credits Assigned
Subject Code: ECCDLO 7032
Subject Name: Big Data Analytics
Teaching Scheme (Hrs.): Theory 04, Practical --, Tutorial --
Credits Assigned: Theory 04, Practical --, Tutorial --, Total 04
Examination Scheme
Subject Code: ECCDLO 7032
Subject Name: Big Data Analytics
Theory Marks, Internal Assessment: Test 1: 20, Test 2: 20, Avg. of Test 1 and Test 2: 20
Theory Marks, End Sem. Exam: 80
Term Work: --
Practical & Oral: --
Oral: --
Total: 100
Module No.  Unit No.  Topics                                                      Hrs.
4.0                   MapReduce                                                   08
            4.1       MapReduce and the New Software Stack: Distributed File
                      Systems, Physical Organization of Compute Nodes,
                      Large-Scale File-System Organization.
            4.2       MapReduce: The Map Tasks, Grouping by Key, The Reduce
                      Tasks, Combiners, Details of MapReduce Execution,
                      Coping With Node Failures.
Text Books:
1. Radha Shankarmani and M Vijayalakshmi, "Big Data Analytics", Wiley.
2. Alex Holmes, "Hadoop in Practice", Manning Press, Dreamtech Press.
3. Dan McCreary and Ann Kelly, "Making Sense of NoSQL: A Guide for Managers and the Rest of Us",
Manning Press.
What is Big Data?
Big data is data that is too large, complex, and
dynamic for conventional data tools to capture,
store, analyze, and manage for optimized decision
making.
VARIETY
People to People: communications, social networking
People to Machine: medical devices, digital TV, e-commerce, smart and bank cards
Machine to Machine: sensors, GPS, barcode scanners, surveillance cameras
VELOCITY
2.9 million emails sent every second
20 hours of video uploaded every minute
50 million tweets per day
How long do we need to collect the data?
It depends on the following mathematical notation
Big Data Analytics
Big data analytics is the often complex process of examining
big data to uncover information, such as hidden patterns,
correlations, market trends and customer preferences that can
help organizations make business decisions.
⬥ No pre-processing required.
Case Study
1. Feedback emails or review links we get from banks, colleges,
and food apps.
4. Smart city.
Big Data Analytics Applications
⬥ Machine Learning.
⬥ Data Management.
⬥ Hadoop.
⬥ Predictive analytics.
⬥ In-memory analytics.
⬥ Statistical computing.
Hadoop
Hadoop is a framework that lets you store big
data in a distributed system so that it can be
processed in parallel. It is divided into two:
Hadoop
Problems with traditional approach:
Hadoop modules:
1. HDFS
2. YARN
3. MapReduce
4. Hadoop Common
Hadoop
Examples of Hadoop:
HDFS Architecture
YARN Architecture
How YARN Works?
1. The ResourceManager (RM) instructs a NodeManager to start
an Application Master for this request, which is then
started in a container.
2. The Application Master registers itself with the RM. It
then contacts the HDFS NameNode to determine the location
of the needed data blocks and calculates the number of map
and reduce tasks needed to process the data.
3. The Application Master then requests the needed resources
from the RM and continues to communicate its resource
requirements throughout the life cycle of the container.
4. The RM schedules the resources along with the requests
from all the other Application Masters and queues their
requests.
5. The Application Master contacts the NodeManager for
that slave node and requests it to create a container by
providing variables, authentication tokens, and the
command string for the process.
6. The Application Master then monitors the process and
reacts in the event of failure by restarting the process on the
next available slot. If it fails after four different attempts, the
entire job fails.
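The retry behaviour described in the last step can be illustrated with a small simulation. This is only a sketch in plain Python, not YARN code: the names `run_with_retries`, `flaky_task_factory`, and `JobFailed` are hypothetical, the "slots" are just strings, and the limit of four attempts is taken from the text above.

```python
MAX_ATTEMPTS = 4  # from the text: the job fails after four failed attempts


class JobFailed(Exception):
    """Raised when a task exhausts its attempts, failing the whole job."""


def run_with_retries(task, slots, max_attempts=MAX_ATTEMPTS):
    """Run a task on successive slots, as a toy Application Master might.

    Returns (result, attempts_used) or raises JobFailed.
    """
    for attempt in range(1, max_attempts + 1):
        # Pick the "next available slot" by cycling through the list.
        slot = slots[(attempt - 1) % len(slots)]
        try:
            return task(slot), attempt
        except RuntimeError:
            continue  # the process died; try again on the next slot
    raise JobFailed(f"task failed after {max_attempts} attempts")


def flaky_task_factory(failures_before_success):
    """Hypothetical task that fails a fixed number of times, then succeeds."""
    state = {"failures_left": failures_before_success}

    def task(slot):
        if state["failures_left"] > 0:
            state["failures_left"] -= 1
            raise RuntimeError(f"container on {slot} died")
        return f"done on {slot}"

    return task
```

For example, `run_with_retries(flaky_task_factory(2), ["node1", "node2", "node3"])` succeeds on the third attempt, while a task that fails four times in a row raises `JobFailed`.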
MapReduce
Map Phase
Reduce Phase
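As a rough illustration of the map and reduce phases, the classic word-count example can be sketched in plain Python. This is an in-memory toy under stated assumptions, not the Hadoop API: the function names `map_phase`, `group_by_key`, `reduce_phase`, and `word_count` are hypothetical, and each input string stands in for one input split.

```python
from collections import defaultdict


def map_phase(document):
    """Map task: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def group_by_key(pairs):
    """Grouping by key: collect all values emitted for the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce task: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, group, and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = group_by_key(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Calling `word_count(["big data", "Big analytics"])` counts "big" twice and the other words once each. In a real cluster the map tasks would run in parallel on different nodes, and the grouping step would move data across the network.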
Big Data Approach
1. Traditional Approach
Data Relationship.
Types of data.
Technologies Available for Big Data
Apache Hadoop
Microsoft HDInsight
NoSQL
Hive
Sqoop
PolyBase
Presto
Case Study
Solve Advertisers Problem and Offer Marketing Insights
Risk Management
⬥ www.cloudera.com
⬥ Cloudera
⬥ Downloads
⬥ 5.4
⬥ VMware or Oracle VirtualBox
Insurance company
Manufacturers and distributors
Firm such as a hotel