Department of Computer Science and Engineering, GITAM Deemed to be University

CSEN3101 BIG DATA ANALYTICS
L  T  P  S  J  C
2  1  0  0  0  3
Pre-requisite: CSEN2061: Database Management Systems
Co-requisite: None
Preferable exposure: None

Course Description:
The course is designed to impart insights into Big Data analytics, which involves collecting data
from different sources, managing it so that it is available for analysts to consume, and finally
delivering data products useful to business organizations.
Course Educational Objectives:
This course enables students to
Understand business decisions and create competitive advantage with Big Data analytics.
Learn the Java concepts required for developing MapReduce programs.
Derive business benefit from unstructured data.
Understand the architectural concepts of Hadoop.
Use the programming tools HBase and Hive in the Hadoop ecosystem.

UNIT 1 Understanding Big Data 9 hours


Introduction to big data, convergence of key trends, unstructured data, industry examples of
big data, web analytics, big data and marketing, fraud and big data, risk and big data, credit
risk management, big data and algorithmic trading, big data and healthcare, big data in
medicine, advertising and big data, big data technologies, introduction to Hadoop, open
source technologies, cloud and big data, mobile business intelligence, crowdsourcing
analytics, inter and trans firewall analytics.

UNIT 2 NoSQL Data Management 9 hours


Introduction to NoSQL, aggregate data models, aggregates, key-value and document data
models, relationships, graph databases, schemaless databases, materialized views,
distribution models, sharding, master-slave replication, peer-to-peer replication, sharding
and replication, consistency, relaxing consistency, version stamps, map-reduce, partitioning
and combining, composing map-reduce calculations.

B Tech. Computer Science and Engineering (DS) w.e.f. 2021-22 admitted batch

UNIT 3 Basics of HADOOP 9 hours


Data format, analyzing data with Hadoop, scaling out, Hadoop streaming, Hadoop pipes,
design of Hadoop distributed file system (HDFS), HDFS concepts, Java interface, data flow,
Hadoop I/O, data integrity, compression, serialization, Avro, file-based data structures.

UNIT 4 Introduction to distributed database 9 hours


HBase, data model and implementations, HBase clients, HBase examples, praxis. Cassandra,
Cassandra data model, Cassandra examples, Cassandra clients, Hadoop integration.

UNIT 5 Tools for HADOOP 9 hours

Hive, data types and file formats, HiveQL data definition, HiveQL data manipulation, HiveQL
queries. Case study on analysing different phases of data analytics.

Textbooks:
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging
Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
2. P. J. Sadalage, M. Fowler, "NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot
Persistence", Addison-Wesley Professional, 2014.
3. Tom White, "Hadoop: The Definitive Guide", 3/e, 4/e, O'Reilly, 2015.
References:
1. Douglas Eadline, "Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in
the Apache Hadoop 2 Ecosystem", 1st Edition, Pearson Education, 2016. ISBN-13: 978-9332570351
Website(s):
https://www.coursera.org/specializations/big-data#courses

Course Outcomes:
After successful completion of the course the student will be able to:
1. Understand data analysis and its importance.
2. Design and analyse unstructured data using NoSQL.
3. Demonstrate the big data concepts using parallel processing.
4. Build a complete business data analytics solution and apply the structure of Hadoop data.
5. Develop real-time applications to study the different stages of data analytics.

CO-PO Mapping:
     PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10 PO11 PO12 PSO1 PSO2 PSO3
CO1  2    1    0    0    0    0    0    0    0    0    0    0    0    0    0
CO2  0    1    2    0    3    0    0    1    1    0    1    0    2    3    0
CO3  1    0    0    0    2    0    0    0    0    0    0    0    0    3    0
CO4  0    0    3    0    3    0    0    0    1    1    2    0    0    0    2
CO5  0    2    1    0    2    0    0    0    1    0    1    0    0    0    1

Note: 1 - Low Correlation, 2 - Medium Correlation, 3 - High Correlation

APPROVED IN:
BOS: 06-09-2021 ACADEMIC COUNCIL: 01-04-2022

SDG No. & Statement:

SDG Justification:
