Big Data Course Overview
Animesh Giri
Department of Computer Science and Engineering
animeshgiri@pes.edu
Overview of the course
• Big Data – why is it becoming important?
• What is data? And what is Big Data?
• The business drivers for Big Data
• Why is it different?
How is this course distinct from other related courses?
Focus is not on analysis
[Figure: Data Set, Model, Obtain answer to query (e.g. "How many liked?", "How many didn't like?")]
What is the designation given generally for this skill set?
Typical Data Flow Cycle
Data Engineering
Overview of the course
• Big Data Introduction
• Big Data Infrastructure
• In Memory Computation
• Streaming Analysis
• Advanced Analytics on Big Data
Overview of the course
• Big Data Systems and Technologies
• Technologies: usage and design
• Storing data : HDFS
• Extracting information: Hive, HBase
• Computing with Big Data: Hadoop, Spark, Spark Streaming
• Hadoop ecosystem (source: https://www.edureka.co/blog/hadoop-ecosystem)
• Machine learning with Big Data – MLlib
• Computation Models
• Batch and Interactive processing
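As a rough illustration of the batch computation model behind Hadoop-style processing, the sketch below imitates the map, shuffle, and reduce phases of a word count in plain Python. It is a single-machine toy, not Hadoop or Spark code; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all counts emitted for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    print(word_count(["big data is big", "data engineering"]))
```

In a real cluster the map and reduce calls would run in parallel on different machines, and the shuffle step would move data across the network; here everything runs in one process to keep the idea visible.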
Overview of the course
• Big Data Algorithms
• Matrix Multiplication
• PageRank (source: https://en.wikipedia.org/wiki/PageRank)
• Streaming algorithms
• ML algorithms
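To give a flavour of the algorithms listed above, here is a minimal sketch of PageRank computed by power iteration on a tiny in-memory graph. It assumes every linked page also appears as a key in the input dictionary; the function name pagerank, the damping factor of 0.85, the iteration count, and the three-page graph are illustrative choices, not taken from the course material.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated rank redistribution.

    links maps each page to the list of pages it links to.
    Assumption: every link target is itself a key of links.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page web, used only for illustration.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

At Big Data scale the same update is usually expressed as a repeated matrix-vector multiplication distributed over many machines, which is why matrix multiplication and PageRank appear side by side in the list.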
What this course is not about
• Analytics techniques – focus of a separate course
• This is meant to complement an analytics course
Course tips
• Please come to class prepared; read the previous lecture notes
• Programming assignments are intensive
• Start early; do not underestimate complexity
• Installing the software is itself a learning exercise
• Plagiarism policy
• Copying is not permitted: both donor and recipient lose full marks for the assignment
Books and references
“Big Data Analytics”, Raj Kamal, Preeti Saxena, 1st Edition, McGraw Hill Education, 2019
“Big Data Simplified”, Sourabh Mukherjee, Amit Kumar Das, Sayan Goswami, 1st Edition, Pearson, 2019
“Mining of Massive Datasets”, Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman, Cambridge University Press, 2014 (for the algorithms)
THANK YOU
Animesh Giri
Department of Computer Science and Engineering
animeshgiri@pes.edu
+91 80 6618 6603