Big Data Analytics Syllabus
MODULE I: Big Data Overview, Data Analytics, and R Language [09 Periods]
Big Data Overview: Data Structures, Analyst Perspective on Data Repositories,
State of the Practice in Analytics, BI Versus Data Science, Current Analytical
Architecture, Drivers of Big Data, Emerging Big Data Ecosystem and a New
Approach to Analytics, Key Roles for the New Big Data Ecosystem, Examples of Big
Data Analytics.
Data Analytics Lifecycle, Model Building and Basic Data Analytic Methods Using R:
Data Analytics Lifecycle Overview, Key Roles for a Successful Analytics Project,
Background and Overview of Data Analytics Lifecycle - Discovery, Data Preparation,
Learning the Business Domain, Model Planning, Model Building, Communicate Results,
Operationalize, and the case study Global Innovation Network and Analysis (GINA).
R Introduction: Introduction to R, Exploratory Data Analysis, Statistical Methods for
Evaluation, Hypothesis Testing, Difference of Means, Rank-Sum Test, Errors, Sample
Size data
MODULE III: Hadoop API and MapReduce Programs [09 Periods]
A: Hadoop API - Writing MapReduce Programs: A Weather Dataset, Understanding
Hadoop API for MapReduce Framework (Old and New)
B: MapReduce Programs with classes - Basic programs of Hadoop MapReduce:
Driver code, Mapper code, Reducer code, RecordReader, Combiner, Partitioner.
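As orientation for the components listed in part B, the following is a minimal sketch of how the driver, RecordReader, mapper, combiner, partitioner, and reducer roles fit together in a maximum-temperature job. It is a plain Python simulation, not the Hadoop Java API. The "year,temperature" record format, the sample values, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical, simplified records in "year,temperature" form (made-up values).
RECORDS = ["1949,111", "1949,78", "1950,0", "1950,22", "1950,-11"]


def record_reader(lines):
    """RecordReader role: turn raw input into (offset, line) pairs."""
    offset = 0
    for line in lines:
        yield offset, line
        offset += len(line) + 1  # +1 for the newline that would follow


def mapper(_offset, line):
    """Mapper role: emit one (year, temperature) pair per record."""
    year, temp = line.split(",")
    yield year, int(temp)


def combiner(year, temps):
    """Combiner role: pre-aggregate map output locally (max is safe to combine)."""
    yield year, max(temps)


def partitioner(key, num_reducers):
    """Partitioner role: pick a reducer index for a key, deterministically."""
    return sum(key.encode()) % num_reducers


def reducer(year, temps):
    """Reducer role: final maximum temperature per year."""
    yield year, max(temps)


def run_job(lines, num_reducers=2):
    """Driver role: wire the phases together and return {year: max_temp}."""
    # Map phase: group mapper output by key.
    map_output = defaultdict(list)
    for offset, line in record_reader(lines):
        for key, value in mapper(offset, line):
            map_output[key].append(value)

    # Combine phase: shrink the map output before it is shuffled.
    combined = []
    for key, values in map_output.items():
        combined.extend(combiner(key, values))

    # Partition and shuffle: route each key to exactly one reducer.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in combined:
        partitions[partitioner(key, num_reducers)][key].append(value)

    # Reduce phase: each reducer processes its keys in sorted order.
    results = {}
    for part in partitions:
        for key in sorted(part):
            for out_key, out_value in reducer(key, part[key]):
                results[out_key] = out_value
    return results


if __name__ == "__main__":
    print(run_job(RECORDS))
```

In a real Hadoop job these roles are separate Java classes configured by the driver; the sketch only mirrors the order in which data passes through them.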
MODULE V: PIG and HIVE HADOOP TOOLS [12 Periods]
PIG - HADOOP TOOL - Hadoop Programming Made Easier - Admiring the Pig
Architecture, Going with the Pig Latin Application Flow, Working through the ABCs
of Pig Latin, Evaluating Local and Distributed Modes of Running Pig Scripts, Checking
out the Pig Script Interfaces, Scripting with Pig Latin.
HIVE – HADOOP TOOL - Saying Hello to Hive, Seeing How the Hive is Put
Together, Getting Started with Apache Hive, Examining the Hive Clients, Working
with Hive Data Types, Creating and Managing Databases and Tables, Seeing How the
Hive Data Manipulation Language Works, Querying and Analyzing Data.
TEXT BOOKS:
1. EMC Education Services, “Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data”, Wiley Publishers, 2015.
2. Cay Horstmann, “Big Java”, 4th Edition, John Wiley & Sons, Inc.
3. Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly.
REFERENCES:
1. Alex Holmes, “Hadoop in Practice”, Manning Publications.
2. Srinath Perera, Thilina Gunarathne, “Hadoop MapReduce Cookbook”.
E-RESOURCES:
1. http://newton.uam.mx/xgeorge/uea/Lab_Prog_O_O/materiales_auxiliares/Big_Java_4th_Ed.pdf
2. http://www.isical.ac.in/~acmsc/WBDA2015/slides/hg/Oreilly.Hadoop.The.Definitive.Guide.3rd.Edition.Jan.2012.pdf
3. https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
4. http://www.comp.nus.edu.sg/~ooibc/mapreduce-survey.pdf
5. http://freevideolectures.com/Course/3613/Big-Data-and-Hadoop/18
6. http://freevideolectures.com/Course/3613/Big-Data-and-Hadoop/40
Course Outcomes:
At the end of the course, students will be able to:
1. Develop simple applications using the R language.
2. Analyze file systems such as GFS and HDFS.
3. Design applications by applying MapReduce concepts.
4. Build programs that make use of I/O.
5. Explore and inspect big data using programming tools such as Pig and Hive.