
TechGeest Solutions
(By Real Time Expert)
www.techgeest.com
+91-9620828049, 8095799993
Opp Manyatha Tech Park, Gate No:1 (IBM), 2nd Floor,
Siddhartha Learning Academy, Above Kuttunad Restaurant

• BigData Hadoop
• Fundamentals
  o Data Storage and Analysis
  o Comparison with RDBMS
• Hadoop – A Brief History
• HDFS
  o Blocks
  o NN & DN
  o HDFS Federation & High Availability
• HDFS Clients
  o HDFS Command Line
  o HDFS CLI – File System Operations Lab
  o HDFS Web UI
  o HDFS Java Client
  o HDFS Java Client – File System Operations Lab
  o CRUD Operations using Java Client
• YARN – Cluster Management (Hadoop 2.x)
  o How Yarn Applications run?
  o YARN vs Map Reduce
  o YARN Scheduling
    • Capacity, Fair Scheduler, FIFO
• Map Reduce
  o MR Programming Model
  o Input Formats
  o Output Formats
  o Compression
  o Serialization & Data Types
  o File Based Data structures
  o Sequence file, Map File, ORC, Parquet
  o Tuning Map Reduce Jobs
  o Advanced Map Reduce
    • Joins – Map-side, Reduce-side
    • Distributed Cache
• Hive
  o Comparison with RDBMS
  o HQL
  o Data types
  o Importing and Exporting
  o Partitioning and Bucketing – Advanced
  o Joins and Join Optimization
  o Functions – Built-in & user-defined
  o Advanced Optimization of HQL
  o Storage File Formats – Advanced
  o Loading and Storing Data
  o SerDes – Advanced
• Sqoop
  o Introduction
  o Import – Deep dive
  o Export – Deep dive
  o Sqoop Optimization – Incremental Load
  o Real time scenarios
• Flume
  o Configure Flume and Import data
  o Architecture and LAB
• Oozie
  o Different workflow jobs
  o Oozie scheduler. LAB
• HBase
  o NoSQL databases Introduction
  o CAP theorem
  o HBase Architecture
  o HBase Clients – Java Client
  o Loading Data
  o Hive – HBase Integration
• Monitoring the Cluster
  o Hortonworks Ambari
  o Cloudera Manager
  o MapR MCS
  o HUE, RM UI
• Real Time Project


  o Architecture
  o Terminology used
  o Production implementation
• Cont.…
• SPARK & SCALA
• Scala Basics
  o Lecture, Functional language
  o Scala vs Java
  o Hands-On
  o Strings, Numbers
  o List, Array, Map, Set
  o Control Statements, collections
  o Functions, methods
  o Pattern matching
• Spark Overview
  o Lecture
  o The power of Spark?
  o Spark Ecosystem
  o Spark Components vs Hadoop
  o Hands-On
  o Installation & Eclipse configuration
  o Programs in Command line Interface & Eclipse
  o Process Local, HDFS files
• RDD Fundamentals
  o Lecture, Purpose and Structure of RDDs
  o Transformations, Actions, and DAG
  o Key-Value Pair RDDs
  o Hands-On
  o Creating RDDs from Data Files
  o Reshaping Data to Add Structure
  o Interactive Queries Using RDDs
• SparkSQL and Data Frames
  o Lecture
  o Spark SQL and Data Frame Uses
  o Data Frame / SQL APIs
  o Catalyst Query Optimization
  o Hands-on
  o Creating (CSV, JSON) Data Frames
  o Querying with Data Frame API and SQL
  o Caching and Re-using Data Frames
  o Process Hive data in Spark
• Spark Streaming
  o Lecture, Streaming Sources
  o DStream APIs and Stateful Streams
  o Hands-On
  o Creating DStreams from Sources
  o Operating on DStream Data
  o Structured Streaming
• Kafka
  o Kafka introduction
  o Installation
  o Kafka integration with Spark
  o Integration with Flume
• Labs:
  o Covers All Certification Syllabus
  o Real Time use cases and Data sets covered
  o Word count, Sensors (Weather Sensors) dataset, Social Media data sets like YouTube, Twitter data analysis
  o Unix Basics Lab
  o SparkSQL, Hadoop, Hive, Sqoop, Oozie, HBase, Flume Installations – Pseudo Mode
• Master Projects:
  o Real-time BigData EDW
  o Real-time Streaming Application
  o Real-time concepts covered are:
    • Spark SQL, SCALA
    • Hive – Advanced topics
    • Sqoop import/export
    • Oozie Scheduling
    • How Hadoop MR is used in DW


    • RDBMS concepts, ETL tool concepts, Integration with Reporting tools
