
MALLA REDDY ENGINEERING COLLEGE (Autonomous)
2018-19 Onwards (MR-18)                B.Tech. VII Semester
Code: 80521     BIG DATA ANALYTICS     L  T  P
Credits: 3                             3  1  -

Prerequisites: Databases, programming fundamentals.


Course Objectives:
This course enables students to understand big data and data analytics, use the R
language, develop MapReduce programs, apply Hadoop concepts to design applications,
build applications using Hadoop I/O, and analyze big data with programming tools
such as Pig and Hive.

MODULE I: Big Data Overview, Data Analytics, and R Language [09 Periods]
Big Data Overview: Data Structures, Analyst Perspective on Data Repositories,
State of the Practice in Analytics, BI Versus Data Science, Current Analytical
Architecture, Drivers of Big Data, Emerging Big Data Ecosystem and a New
Approach to Analytics, Key Roles for the New Big Data Ecosystem, Examples of Big
Data Analytics.
Data Analytics Lifecycle, Model Building and Basic Data Analytic Methods Using R:
Data Analytics Lifecycle Overview, Key Roles for a Successful Analytics Project,
Background and Overview of Data Analytics Lifecycle - Discovery, Data Preparation,
Learning the Business Domain, Model Planning, Model Building, Communicate Results,
Operationalize, and case study example: Global Innovation Network and Analysis
(GINA).
R Introduction: Introduction to R, Exploratory Data Analysis, Statistical Methods for
Evaluation, Hypothesis Testing, Difference of Means, Rank-Sum Test, Errors, Sample
Size data
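
Illustration (not part of the prescribed syllabus): the course works in R, but the two test statistics named above, difference of means and rank-sum, can be sketched in a few lines of Python. The sample values and the function names `welch_t` and `rank_sum` are hypothetical, chosen only for this sketch; it computes the statistics and does not derive p-values.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference of two sample means."""
    # Standard error built from the two unbiased sample variances.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_sum(a, b):
    """Sum of the ranks of sample `a` in the pooled data (ties share the average rank)."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at positions i..j-1 are tied; their 1-based ranks are i+1..j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(rank_of[x] for x in a)


# Hypothetical samples, for illustration only.
group_a = [12.0, 15.0, 11.0, 14.0]
group_b = [18.0, 17.0, 16.0, 19.0]

t_stat = welch_t(group_a, group_b)
w_stat = rank_sum(group_a, group_b)
print(t_stat, w_stat)
```

In R, the corresponding built-in functions are `t.test` and `wilcox.test`, which also report p-values.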

MODULE II: Working with Big Data [09 Periods]


Hadoop - Google File System, Hadoop Distributed File System (HDFS) - Building
blocks of Hadoop (NameNode, DataNode, Secondary NameNode, JobTracker,
TaskTracker).
Configuring a Hadoop Cluster - Introducing and Configuring a Hadoop cluster
(Local, Pseudo-distributed, and Fully Distributed modes), Configuring XML files.

MODULE III: Hadoop API and Map Reduce Programs [09 Periods]
A: Hadoop API - Writing MapReduce Programs: A Weather Dataset, Understanding
Hadoop API for MapReduce Framework (Old and New)
B: MapReduce Programs with classes - Basic programs of Hadoop MapReduce:
Driver code, Mapper code, Reducer code, RecordReader, Combiner, Partitioner.
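
Illustration (not part of the prescribed syllabus): the roles of the Mapper, Combiner, Partitioner and Reducer listed above can be sketched without a cluster. The following is a minimal in-memory simulation in Python, not the Hadoop Java API; the record format (`year,temperature`), the sample records and all function names are assumptions made for this sketch. It finds the maximum temperature per year, in the spirit of the weather dataset example.

```python
from collections import defaultdict


def mapper(line):
    """Emit (year, temperature) from a 'year,temperature' record."""
    year, temp = line.split(",")
    yield year, int(temp)


def combiner(year, temps):
    """Pre-aggregate on the map side; safe here because max is associative and commutative."""
    yield year, max(temps)


def partitioner(key, num_reducers):
    """Choose the reducer for a key (a deterministic stand-in for hash partitioning)."""
    return sum(key.encode()) % num_reducers


def reducer(year, temps):
    """Emit the final maximum temperature for one year."""
    yield year, max(temps)


def run_job(splits, num_reducers=2):
    """Driver: map and combine each split, partition, then sort and reduce."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:
        local = defaultdict(list)
        for line in split:
            for key, value in mapper(line):
                local[key].append(value)
        # The combiner runs once per split, before data leaves the "map task".
        for key, values in local.items():
            for ckey, cvalue in combiner(key, values):
                partitions[partitioner(ckey, num_reducers)][ckey].append(cvalue)
    result = {}
    for part in partitions:
        for key in sorted(part):  # keys reach each reducer in sorted order
            for rkey, rvalue in reducer(key, part[key]):
                result[rkey] = rvalue
    return result


# Illustrative records split across two hypothetical input splits.
splits = [["1949,111", "1950,0", "1949,78"], ["1950,22", "1950,-11"]]
max_temps = run_job(splits)
print(max_temps)
```

The RecordReader is represented here only by the plain list of lines in each split; in Hadoop it is the component that turns an input split into key/value records for the Mapper.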

MODULE IV: Hadoop I/O and Implementation [09 Periods]


Hadoop I/O - The Writable Interface, WritableComparable and comparators,
Writable Classes: Writable wrappers for Java primitives, Text, BytesWritable,
NullWritable, ObjectWritable and GenericWritable, Writable collections.
Implementation - Implementing a Custom Writable: Implementing a RawComparator
for speed, Custom comparators.
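
Illustration (not part of the prescribed syllabus): the idea behind a custom Writable with a raw comparator is that a record serializes itself to a fixed binary layout, and sorting can then compare the serialized bytes directly instead of rebuilding full objects. The sketch below shows this idea in Python with the `struct` module; it is not the Hadoop Java API, and the `IntPair` class, its method names and `raw_compare` are hypothetical names used only here.

```python
import functools
import struct


class IntPair:
    """Hypothetical two-field record, loosely modelled on a custom Writable."""

    _FORMAT = ">ii"  # two big-endian 32-bit signed integers

    def __init__(self, first=0, second=0):
        self.first = first
        self.second = second

    def write(self):
        """Serialize the fields to bytes."""
        return struct.pack(self._FORMAT, self.first, self.second)

    def read_fields(self, data):
        """Populate the fields from serialized bytes."""
        self.first, self.second = struct.unpack(self._FORMAT, data)
        return self


def raw_compare(b1, b2):
    """Order two serialized records by (first, second) without building IntPair objects."""
    k1 = struct.unpack_from(">ii", b1)
    k2 = struct.unpack_from(">ii", b2)
    return (k1 > k2) - (k1 < k2)


# Sort serialized records, then deserialize only to inspect the result.
records = [IntPair(3, 1).write(), IntPair(-1, 7).write(), IntPair(3, 0).write()]
ordered = sorted(records, key=functools.cmp_to_key(raw_compare))
decoded = [(p.first, p.second) for p in (IntPair().read_fields(b) for b in ordered)]
print(decoded)
```

In Hadoop the same roles are played by the `write` and `readFields` methods of the Writable interface and by the byte-level `compare` method of RawComparator.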

MODULE V: Pig and Hive Hadoop Tools [12 Periods]
PIG - HADOOP TOOL - Hadoop Programming Made Easier - Admiring the Pig
Architecture, Going with the Pig Latin Application Flow, Working through the ABCs
of Pig Latin, Evaluating Local and Distributed Modes of Running Pig Scripts, Checking
out the Pig Script Interfaces, Scripting with Pig Latin.
HIVE – HADOOP TOOL - Saying Hello to Hive, Seeing How the Hive is Put
Together, Getting Started with Apache Hive, Examining the Hive Clients, Working
with Hive Data Types, Creating and Managing Databases and Tables, Seeing How the
Hive Data Manipulation Language Works, Querying and Analyzing Data.

TEXT BOOKS:
1. EMC Education Services, “Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data”, Wiley Publishers, 2015.
2. Cay Horstmann, “Big Java”, 4th Edition, John Wiley & Sons, Inc.
3. Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly.

REFERENCES:
1. Alex Holmes, “Hadoop in Practice”, Manning Publications.
2. Srinath Perera, Thilina Gunarathne, “Hadoop MapReduce Cookbook”.

E-RESOURCES:
1. http://newton.uam.mx/xgeorge/uea/Lab_Prog_O_O/materiales_auxiliares/Big_Java_4th_Ed.pdf
2. http://www.isical.ac.in/~acmsc/WBDA2015/slides/hg/Oreilly.Hadoop.The.Definitive.Guide.3rd.Edition.Jan.2012.pdf
3. https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
4. http://www.comp.nus.edu.sg/~ooibc/mapreduce-survey.pdf
5. http://freevideolectures.com/Course/3613/Big-Data-and-Hadoop/18
6. http://freevideolectures.com/Course/3613/Big-Data-and-Hadoop/40

Course Outcomes:
At the end of the course, students will be able to:
1. Develop simple applications using the R language.
2. Analyze file systems such as GFS and HDFS.
3. Design applications by applying MapReduce concepts.
4. Build programs that make use of Hadoop I/O.
5. Explore and inspect big data using programming tools such as Pig and Hive.

CO-PO, PSO Mapping
(3/2/1 indicates strength of correlation) 3-Strong, 2-Medium, 1-Weak
COs Programme Outcomes (POs) PSOs
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1 3 3 3 3 3 1 1 3 2 1 3 3 2 3
CO2 3 2 3 3 3 2 1 3 2 2
CO3 3 3 3 3 3 3 3 2 2
CO4 3 3 3 3 3 1 3 3 2 2
CO5 2 3 3 3 3 1 3 3 2 2
