
Integrated Program for Data Science, AI, ML & Big Data

Job-oriented global certification program crafted by experts

Disclaimer: This material is protected under the Copyright Act. AnalytixLabs ©, 2011-2020. Unauthorized use and/or duplication of this material or any part of it,
including data, in any form without explicit written permission from AnalytixLabs is strictly prohibited. Any violation of this copyright will attract legal action.

Learn to Evolve
About AnalytixLabs
AnalytixLabs is a capability building and training solutions firm led by McKinsey, IIM, ISB and IIT alumni with deep industry experience and a
flair for coaching. We are focused on helping our clients develop skills in basic and advanced analytics to enable them to emerge as
“Industry Ready” professionals and enhance career opportunities. AnalytixLabs has also been featured among the top institutes by prestigious
publications like Analytics India Magazine and Higher Education Review, since 2013.

Content
• World class course structure
• Surpasses industry requirements
• Cater to Standard certifications
• High quality course material and real life case studies

Approach
• 80-20 focus on practical & theory
• Personal attention and Individual counselling
• Industry best practices

Faculty
• Seasoned analytics professionals
• Together we have 50+ years of experience with prestigious firms, like McKinsey, KPMG, Deloitte and AOL
• Regular sessions by industry experts

Bottom line
• Job-oriented training
• Lucrative job prospects in high growth domain
• Support for relevant certifications and diplomas
• Career counseling and planning
• Value for money with high return on investment
Candidates trained by us are working in leading companies across
industries…
Integrated Program for Data Science
On average 8 hours of self-study per week
50 Assignments and Projects included

1. Building blocks
• Introduction to Analytics & Data Science
• Fundamentals of Analytics (for Non-programmers)
3 Pre-learning hours + 9 live training hours

2. Data Visualization & Analytics (e-learning**)
• Excel
• Tableau
• SQL
• VBA
47 e-learning hours

3. R for Data Science
• R Fundamentals & Visualization
• Statistical Analysis with R
• Linear Regression with R
• Machine Learning with R
34 Pre-learning hours + 18 live training hours

4. Data Science & ML using Python
• Data Science with Python
• Machine Learning
• Text Mining & NLP
• AI & Cloud Computing
33 Pre-learning hours + 93 live training hours

5. Certified Big Data Engineering
• Building Blocks
• Hadoop Eco-system & Spark
• NoSQL - MongoDB
• Cloud Computing
45 Pre-learning hours + 72 live training hours

6. AI & Cloud Computing
• Introduction to AI & Cloud Computing
• Deep Learning with TensorFlow
• Computer Vision Application
• Build Your Own Chatbot
• AI Project Deployment in Cloud
21 Pre-learning hours + 39 live training hours

7. Industry & Functional Sessions
12 live training hours

*Pre-learning refers to basic foundation concepts; candidates are required to go through these using e-learning, followed by live sessions
**e-learning includes video-based modules, which are non-critical optional topics; candidates may go through them based on their interest or need
Data Visualization & Analytics
1. Building Blocks
Introduction to Bridge Course & Analytics Software

Basic Excel
• Excel Environment
• Key Terminologies
• Short Cuts
• Key Functionalities
• Copy-paste-paste special
• Formatting & conditional Formatting
• Basic Excel Functions - Types of Functions
• Relational operators
• Data Sorting, Filtering and Data Validation
• Understanding of Name Ranges
• Pivot tables - Charts
• Basics of charts

Basic Programming Elements
• Overview of programming languages
• Basics of programming elements
• Variables, data types, data structures, loops, conditional statements, inputs, outputs, functions etc.
• Understanding key terms
• Client/server
• Database
• Hosting/deployment

Introduction to Basic Statistics
• Introduction to Statistics
• Measures of central tendencies
• Measures of variance
• Measures of frequency
• Measures of Rank
• Basics of Probability, distributions
• Conditional Probability (Bayes Theorem)

RDBMS & SQL (Basics)
• Basic RDBMS Concepts
• Introduction to Relational Database management system. Why SQL?
• A glance at the tool and its advantages and disadvantages
• Understanding Schema, ERDs and Metadata
• Introduction to MS SQL Server
• What is SQL – A Quick Introduction
• Installing MS SQL Server for Windows
• Introduction to SQL Server Management Studio
• Understanding basic database concepts
• Getting started

Introduction to Analytics & Data Science
• What is analytics & Data Science?
• Business Analytics vs. Data Analytics vs. Data Science
• Common Terms in Analytics
• Analytics vs. Data warehousing, OLAP, MIS Reporting
• Types of data (Structured vs. Unstructured vs. Semi Structured)
• Relevance of Analytics in industry and need of the hour
• Critical success drivers
• Overview of analytics tools & their popularity
• Analytics Methodology & problem solving framework
• Stages of Analytics
2. Data Visualization & Analytics (Excel) (E-learning) (1/4)
Quick Recap of Basics of Excel

Data manipulation using functions
• Descriptive functions
• Logical functions: IF, AND, OR, NOT
• Date and Time functions
• Text functions
• Array functions
• Use and application of lookup functions
• Limitations of lookup functions
• Using INDEX, MATCH, OFFSET, reverse VLOOKUP

Data analysis and reporting
• Data Analysis using Pivot Tables - use of row and column shelf, values and filters
• Difference between data layering and cross tabulation, summary reports, advantages and limitations
• Change aggregation types and summarization
• Creating groups and bins in pivot data
• Concept of calculated fields, usage and limitations
• Changing report layouts - Outline, compact and tabular forms
• Show and hide grand totals and subtotals
• Creating summary reports using pivot tables

Data Visualization in Excel
• Overview of chart types – column/bar charts, line/area, pie, doughnut charts, scatter plots
• How to select the right chart for your data
• Creating and customizing advanced charts - thermometer charts, waterfall charts, population pyramids

Overview of Dashboards
• What is a dashboard & Excel dashboard
• Adding icons and images to dashboards
• Making dashboards dynamic

Create dashboards in Excel - Using Pivot controls
• Concept of pivot cache and its use in creating interactive dashboards in Excel
• Pivot table design elements - concept of slicers and timelines
• Designing sample dashboard using Pivot Controls
• Design principles for including charts in dashboards - do's and don'ts

Business Dashboard Creation
• Management Dashboard for Sales & Services
• Best practices - Tips and Tricks to enhance dashboard designing
2. Data Visualization & Analytics (SQL) (E-learning) (2/4)
Quick Recap of RDBMS & Basic SQL

Database objects creation (DDL Commands)
• Creating databases and tables. Understanding data types
• Inserting values into the table
• Altering table properties
• Introduction to Keys and constraints
• Creating, Modifying & Deleting Tables
• Create Table & Create Index statements
• Drop & Truncate statements – Uses & Differences
• DDL Statements with constraints
• Import and Export wizard to get the data in SQL Server from Excel files or delimited files

Data manipulation (DML Commands)
• Data Manipulation statements
• Insert, Update & Delete statements
• Select statement – Sub setting, Filters, Sorting, Removing Duplicates, grouping and aggregations etc.
• Operators, predicates and built-in functions (Top, Distinct, Limit)
• Where, Group By, Order By & Having clauses
• SQL Functions – Number, Text, Date, etc.
• SQL Keywords – Top, Distinct, Null, etc.
• SQL Operators - Relational (single valued and multi valued), Logical (and, or, not), Use of wildcard operators and wildcard characters, etc.

Accessing data from Multiple Tables using SELECT
• Append and Joins
• Union and Union All – Use & constraints
• Intersect and Except statements
• Table Joins - inner join, left join, right join, full join
• Cross joins/Cartesian products, self joins, natural joins etc.
• Inline views and sub-queries & their types
• Optimizing your work
• Update operations with and without joins

Advanced SQL
• Creating table copy and database copy
• Views
• Transactions
• Stored Procedures in SQL
• CRUD operations using stored procedures
• Window functions in SQL
• Miscellaneous Topics: Rollup and cube

Apply learnings on Business Case study
2. Data Visualization & Analytics (Tableau) (E-learning) (3/4)
Getting Started
• What is Tableau?
• Tableau product suite
• How Does Tableau Work?
• Tableau Architecture
• Connecting to Data & data source concepts
• Understanding the Tableau workspace
• Dimensions and Measures
• Data Types & Default Properties
• Tour of Shelves & Marks Card
• Using Show Me
• Saving and Sharing your work - overview

Data handling & summaries
• Date Aggregations and Date parts
• Cross tab & Tabular charts
• Totals & Subtotals
• Bar Charts & Stacked Bars
• Line Graphs with Date & Without Date
• Tree maps/Scatter Plots
• Individual Axes, Blended Axes, Dual Axes & Combination chart/Edit axis
• Parts of Views
• Sorting
• Trend/Reference Lines/Forecasting
• Filters/Context filters
• Sets (In/Out Sets/Combined Sets)
• Grouping/Bins/Histograms
• Drilling up/down – drill through
• Hierarchies
• View data
• Actions (across sheets)

Building Advanced Reports/Maps
• Explain latitude and longitude
• Default location/Edit locations
• Building geographical maps
• Using Map layers

Calculated Fields
• Aggregate vs. Disaggregate data
• Explain - #Number of Rows
• Basic Functions (String/Date/Numbers etc.)
• Usage of Logical conditions

Table calculations
• Explain scope and direction
• Percent of Total, Running/Cumulative calculations
• Introduction to LOD (Level of Detail) Expressions
• User applications of Table calculations

Parameters
• Using Parameters in calculated fields
• Bins/Reference Lines
• Filters/Sets
• Display Options (Dynamic Dimension/Measure Selection)
• Create What-If/Scenario analysis

Building Interactive Dashboards
• Combining multiple visualizations into a dashboard (overview)
• Making your worksheet interactive by using actions
• Filter/URL/Highlight
• Complete Interactive Dashboard for Sales & Services

Building Stories
• Story Points
• Options in Formatting your Visualization
• Working with Labels and Annotations
• Effective Use of Titles and Captions

Working with Data
• Multiple Table Join
• Data Blending
• Difference between joining and blending data, and when we should do each
• Toggle between Direct Connection and Extracts

Sharing work with others
• Sharing Workbooks
• Publish to Reader/PDF
• Publish to Tableau Server and sharing on the web
2. Data Visualization & Analytics (VBA) (E-learning) (4/4)
Introducing VBA
• What is Logic?
• What Is VBA?
• Introduction to Macro Recordings, IDE

How VBA Works with Excel
• Working In the Visual Basic Editor
• Introducing the Excel Object Model
• Using the Excel Macro Recorder
• VBA Sub and Function Procedures

Key Components of Programming language
• Essential VBA Language Elements
• Keywords & Syntax
• Programming statements
• Variables & Data types
• Comments
• Operators
• Working with Range Objects

A look at some commonly used code snippets

Programming constructs in VBA
• Control Structures
• Looping Structures
• The With...End With Block

Functions & Procedures in VBA – Modularizing your programs
• Worksheet & workbook functions
• Automatic Procedures and Events
• Arrays

Objects & Memory Management in VBA
• The New and Set Keywords
• Destroying Objects – The Nothing Keyword

Error Handling

Controlling accessibility of your code – Access specifiers

Code Reusability – Adding references and components to your code

Communicating with Your Users
• Simple Dialog Boxes
• User Form Basics
• Using User Form Controls
• Add-ins
• Accessing Your Macros through the User Interface
• Retrieve information through Excel from Access Database using VBA
Data Science using R
1. R For Data Science (1/2)
Data Importing/Exporting
• Introduction to R/RStudio - GUI
• Concept of Packages - Useful Packages (Base & Other packages)
• Data Structure & Data Types (Vectors, Matrices, Factors, Data frames, and Lists)
• Importing Data from various sources
• Exporting Data to various formats
• Viewing Data (Viewing partial data and full data)
• Variable & Value Labels – Date Values

Data Manipulation
• Creating New Variables (calculations & Binning)
• Dummy variable creation
• Applying transformations
• Handling duplicates/missing values
• Sorting and Filtering
• Sub setting (Rows/Columns)
• Appending (Row/column appending)
• Merging/Joining (Left, right, inner, full, outer)
• Data type conversions
• Renaming
• Formatting
• Reshaping data
• Sampling
• Operators
• Control Structures (if, if else)
• Loops (Conditional, iterative loops)
• apply functions
• Arrays
• R Built-in Functions
• Text, Numeric, Date, utility
• R User Defined Functions
• Aggregation/Summarization

Data Analysis
• Introduction to exploratory data analysis
• Descriptive statistics, Frequency Tables and summarization
• Uni-variate Analysis (Distribution of data)
• Bivariate Analysis (Cross Tabs, Distributions & Relationships)

Using R with Databases
• R and Relational Databases
• Connecting to Relational Databases using RJDBC and RODBC
• Database Design and Querying Data
• Modifying Data and Using Stored Procedures
• In-Database Analytics with R

Data Visualization with R
• Basic Visualization Tools
• Bar Charts/Histograms/Pie Charts
• Scatter Plots
• Line Plots and Regression
• Specialized Visualization Tools
• Word Clouds/Radar Charts
• Waffle Charts/Box Plots
• How to create Maps
• Creating Maps in R
• How to build interactive web pages
• Introduction to Shiny
• Creating and Customizing Shiny Apps
• Additional Shiny Features
1. R For Data Science (2/2)
Introduction to Statistics
• Basic Statistics - Measures of Central Tendencies and Variance
• Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
• Inferential Statistics - Sampling - Concept of Hypothesis Testing
• Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlations and Chi-square

Linear Regression: Solving regression problems
• Introduction - Applications
• Assumptions of Linear Regression
• Building Linear Regression Model
• Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
• Assess the overall effectiveness of the model
• Validation of Models (Re-running vs. Scoring)
• Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
• Interpretation of Results - Business Validation - Implementation on new data

Machine Learning vs Statistical Modeling & Supervised vs Unsupervised Learning
• Machine Learning Languages, Types, and Examples
• Machine Learning vs Statistical Modelling
• Supervised vs Unsupervised Learning
• Supervised Learning Classification
• Unsupervised Learning

Supervised Learning I
• K-Nearest Neighbors
• Decision Trees
• Random Forests
• Reliability of Random Forests
• Advantages & Disadvantages of Decision Trees

Supervised Learning II
• Regression Algorithms
• Model Evaluation
• Model Evaluation: Overfitting & Underfitting
• Understanding Different Evaluation Models

Unsupervised Learning
• K-Means Clustering plus Advantages & Disadvantages
• Hierarchical Clustering plus Advantages & Disadvantages
• Measuring the Distances Between Clusters - Single Linkage Clustering
• Measuring the Distances Between Clusters - Algorithms for Hierarchy Clustering
• Density-Based Clustering

Dimensionality Reduction & Collaborative Filtering
• Dimensionality Reduction: Feature Extraction & Selection
• Collaborative Filtering & Its Challenges
Data Science & Machine Learning with Python
1. Data Science & ML with Python: Building Blocks
Introduction to Basic Statistics
• Introduction to Statistics
• Measures of central tendencies
• Measures of variance
• Measures of frequency
• Measures of Rank
• Basics of Probability, distributions
• Conditional Probability (Bayes Theorem)

Introduction to Mathematical foundations
• Introduction to Linear Algebra
• Matrices Operations
• Introduction to Calculus
• Derivatives & Integration
• Maxima, minima
• Area under the curve
• Theory of optimization

Introduction to Analytics & Data Science
• What is analytics & Data Science?
• Business Analytics vs. Data Analytics vs. Data Science
• Common Terms in Analytics
• Analytics vs. Data warehousing, OLAP, MIS Reporting
• Types of data (Structured vs. Unstructured vs. Semi Structured)
• Relevance of Analytics in industry and need of the hour
• Critical success drivers
• Overview of analytics tools & their popularity
• Analytics Methodology & problem solving framework
• Stages of Analytics
2. Python For Data Science (1/2)
Python Essentials (Core)
• Overview of Python - Starting with Python
• Why Python for data science?
• Anaconda vs. Python
• Introduction to installation of Python
• Introduction to Python IDEs (Jupyter/IPython)
• Concept of Packages - Important packages
• NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.
• Installing & loading Packages & Name Spaces
• Data Types & Data objects/structures (Strings, Tuples, Lists, Dictionaries)
• List and Dictionary Comprehensions
• Variable & Value Labels – Date & Time Values
• Basic Operations – Mathematical/string/date
• Control flow & conditional statements
• Debugging & Code profiling
• Python Built-in Functions (Text, numeric, date, utility functions)
• User defined functions – Lambda functions
• Concept of apply functions
• Python – Objects – OOP concepts
• How to create & call classes and modules?

Operations with NumPy (Numerical Python)
• What is NumPy?
• Overview of functions & methods in NumPy
• Data structures in NumPy
• Creating arrays and initializing
• Reading arrays from files
• Special initializing functions
• Slicing and indexing
• Reshaping arrays
• Combining arrays
• NumPy Maths

Overview of Pandas
• What is pandas, its functions & methods
• Pandas Data Structures (Series & Data Frames)
• Creating Data Structures (Data import – reading into pandas)

Cleansing Data with Python
• Understand the data
• Sub Setting / Filtering / Slicing Data
• Using [] brackets
• Using indexing or referring with column names/rows
• Using functions
• Dropping rows & columns
• Mutation of table (Adding/deleting columns)
• Binning data (Binning numerical variables into categorical variables)
• Renaming columns or rows
• Sorting (by data/values, index)
• By one column or multiple columns
• Ascending or Descending
• Type conversions
• Setting index
• Handling duplicates/missing values/outliers
• Creating dummies from categorical data (using get_dummies())
• Applying functions to all the variables in a data frame (broadcasting)
• Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays etc.)

Data Analysis using Python
• Exploratory data analysis
• Descriptive statistics, Frequency Tables and summarization
• Uni-variate Analysis (Distribution of data & Graphical Analysis)
• Bi-variate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
2. Python For Data Science (2/2)
Data Visualization with Python
• Introduction to Data Visualization
• Introduction to Matplotlib
• Basic Plotting with Matplotlib
• Line Plots

Basic Visualization Tools
• Area Plots
• Histograms
• Bar Charts
• Pie Charts
• Box Plots
• Scatter Plots
• Bubble Plots

Advanced Visualization Tools
• Waffle Charts
• Word Clouds
• Seaborn and Regression Plots

Visualizing Geospatial Data
• Introduction to Folium
• Maps with Markers
• Choropleth Maps

Statistical Methods & Hypothesis Testing
• Descriptive vs. Inferential Statistics
• What is probability distribution?
• Important distributions (discrete & continuous distributions)
• Deep dive of normal distributions and properties
• Concept of sampling & types of sampling
• Concept of standard error and central limit theorem
• Concept of Hypothesis Testing
• Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
3. Predictive Modeling & Machine Learning
Introduction to Predictive Modeling
• Concept of model in analytics and how it is used
• Common terminology used in modeling process
• Types of Business problems - Mapping of Algorithms
• Different Phases of Predictive Modeling
• Data Exploration for modeling
• Exploring the data and identifying any problems with the data (Data Audit Report)
• Identify missing values/outliers in the data
• Visualize the data trends and patterns

Introduction to Machine Learning
• Applications of Machine Learning
• Supervised vs Unsupervised Learning
• Overall process of executing the ML project
• Stages of ML Project
• Concept of Overfitting and Underfitting (Bias-Variance Trade-off) & Performance Metrics
• Concept of feature engineering
• Regularization (LASSO, Elastic net and Ridge)
• Types of Cross validation (Train & Test, K-Fold validation etc.)
• Concept of optimization - Gradient descent algorithm
• Cost & optimization functions
• Python libraries suitable for Machine Learning

Supervised Learning: Regression problems
• Linear Regression
• Non-linear Regression
• K-Nearest Neighbor
• Decision Trees
• Ensemble Learning - Bagging, Random Forest, AdaBoost, Gradient Boost, XGBoost
• Support Vector Regressor

Supervised Learning: Classification problems
• Logistic Regression
• K-Nearest Neighbor
• Naïve Bayes Classifier
• Decision Trees
• Ensemble Learning - Bagging, Random Forest, AdaBoost, Gradient Boost, XGBoost
• Support Vector Classifier

Unsupervised Learning
• Principal Component Analysis
• K-Means Clustering
• Hierarchical Clustering
• Density-Based Clustering

Recommender Systems
• Content-based recommender systems
• Collaborative Filtering

Time Series Forecasting
• What is forecasting?
• Applications of forecasting
• Time Series Components and Decomposition
• Types of Seasonality
• Important terminology: lag, lead, stationarity, stationarity tests, autocorrelation & white noise, ACF & PACF plots, autoregression, differencing
• Classification of Time Series Techniques (Uni-variate & Multivariate)
• Time Series Modeling & Forecasting Techniques
• Averages (Moving average, Weighted Moving Average)
• ETS models (Holt-Winters Methods)
• Seasonal Decomposition
• ARIMA/ARIMAX/SARIMA/SARIMAX
• Regression
• Evaluation of Forecasting Models
4. Text Mining using NLP
Introduction to Text Mining
• Text Mining - characteristics, trends
• Text Processing using Base Python & Pandas, Regular Expressions
• Text processing using string functions & methods
• Understanding regular expressions
• Identifying patterns in the text using regular expressions

Text Processing with modules like NLTK, sklearn
• Getting Started with NLTK
• Introduction to NLP & NLTK
• Introduction to NLTK Modules (corpus, tokenize, stem, collocations, tag, classify, cluster, tbl, chunk, parse, ccg, sem, inference, metrics, app, chat, toolbox etc.)

Initial data processing and simple statistical tools
• Reading data from file folder/from text file, from the Internet & web scraping, Data Parsing
• Cleaning and normalization of data
• Sentence Tokenize and Word Tokenize, removing insignificant words (“stop words”), removing special symbols, removing bullet points and digits, changing letters to lowercase, stemming/lemmatization/chunking
• Creating Term-Document matrix
• Tagging text with parts of speech
• Word Sense Disambiguation
• Finding associations
• Measurement of similarity between documents and terms
• Visualization of term significance in the form of word clouds

Advanced data processing and visualization
• Vectorization (Count, TF-IDF, Word Embeddings)
• Sentiment analysis (vocabulary approach, based on Bayesian probability methods)
• Named entity recognition (NER)
• Methods of data visualization
• Word length counts plot
• Word frequency plots
• Word clouds
• Correlation plots
• Letter frequency plot
• Heat map
• Grouping texts using different methods
• Language Models and n-grams - Statistical Models of Unseen Data (Smoothing)

Text Mining – Predictive Modeling
• Semantic similarity between texts
• Text Segmentation
• Topic Mining (LDA)
• Text Classification (spam detection, sentiment analysis, Intent Analysis)
5. Introduction to AI & DL & Cloud Computing
Introduction to Artificial Intelligence (AI)
• Modern era of AI
• Role of Machine learning & Deep Learning in AI
• Hardware for AI (CPU vs. GPU vs. FPGA)
• Software Frameworks for AI & Deep Learning
• Key Industry applications of AI

Introduction to Deep Learning
• What are the Limitations of Machine Learning?
• What is Deep Learning?
• Advantage of Deep Learning over Machine learning
• Reasons to go for Deep Learning
• Real-Life use cases of Deep Learning
• Overview of important Python packages for Deep Learning

Artificial Neural Network
• Overview of Neural Networks
• Activation Functions, hidden layers, hidden units
• Illustrating & Training a Perceptron
• Important Parameters of Perceptron
• Understand limitations of a Single Layer Perceptron
• Illustrate Multi-Layer Perceptron
• Understand Backpropagation – Using Example
• Implementation of ANN in Python - Keras

Introduction to Google Colab

Introduction to Cloud Computing
• What is Cloud Computing? Why does it matter?
• Traditional IT Infrastructure vs. Cloud Infrastructure
• Cloud Companies (Microsoft Azure, GCP, AWS) & their Cloud Services (Compute, storage, networking, apps, cognitive etc.)
• Use Cases of Cloud computing
• Overview of Cloud Segments: IaaS, PaaS, SaaS
• Overview of Cloud Deployment Models
• Overview of Cloud Security
• AWS vs. Azure vs. GCP
• Implementation of ML/DL model in Cloud
Big Data Engineering
1. Big Data Engineering: Building Blocks
Introduction to Big Data & Data Engineering
• What is Big Data & Data engineering?
• Importance of Data engineering in the Big Data world
• Role of RDBMS (SQL Server), Hadoop, Spark, NoSQL and Cloud computing in Data engineering
• What is Big Data Analytics
• Key terminologies (Data Mart, Data warehouse, Data Lake, Data Ocean, ETL, Data Model, Schema, Data pipeline etc…)
• Need of Data Engineering
• Applications of Data Engineering
• Introducing software for this course

Introduction to Linux
• Role of Linux commands in Data Engineering
• Introduction to Linux Environment
• Overview of frequently used commands

Introduction to RDBMS Concepts
• Introduction to Relational Database management system. Why SQL?
• A glance at the tool and its advantages and disadvantages
• Understanding Schema, ERDs and Metadata

Introduction to MS SQL Server
• What is SQL – A Quick Introduction
• Installing MS SQL Server for Windows
• Overview of SQL Server Management Studio
• Understanding basic database concepts

Database objects creation (DDL Commands)
• Creating databases and tables. Understanding data types
• Inserting values into the table
• Altering table properties
• Introduction to Keys and constraints
• Creating, Modifying & Deleting Tables
• Create Table & Create Index statements
• Drop vs. Truncate statements
• DDL Statements with constraints
• Import and Export wizard to get the data in SQL Server from Excel files or delimited files

Data manipulation (DML Commands)
• Data Manipulation statements
• Insert, Update & Delete statements
• Select statement – Sub setting, Filters, Sorting, Removing Duplicates, grouping and aggregations etc.
• Operators, predicates and built-in functions (Top, Distinct, Limit)
• Where, Group By, Order By & Having clauses
• SQL Functions – Number, Text, Date, etc.
• SQL Keywords – Top, Distinct, Null, etc.
• SQL Operators - Relational (single valued and multi valued), Logical (and, or, not), Use of wildcard operators and wildcard characters, etc.

Accessing data from Multiple Tables using SELECT
• Append and Joins
• Union and Union All – Use & constraints
• Intersect and Except statements
• Joins - inner join, left join, right join, full join
• Cross joins/Cartesian products, self joins etc.
• Inline views and sub-queries & their types
• Optimizing your work
• Update operations with and without joins

Advanced SQL
• Creating table copy and database copy
• Views
• Transactions
• Stored Procedures in SQL
• CRUD operations using stored procedures
• Window functions in SQL
2. Hadoop Ecosystem for Big Data Engineering
Hadoop (Big Data) Eco-System
• Motivation for Hadoop
• Comparison of traditional data management systems with Big Data. Evaluate key framework requirements for Big Data analytics
• Hadoop Ecosystem & core components
• Explain different types of cluster setups (Fully distributed/Pseudo etc.)
• Hadoop Cluster Overview & Architecture
• HDFS Overview & Data storage in HDFS
• Get the data into Hadoop from local machine (Data Loading) and vice versa
• Practice complete data loading and managing it using command line & HUE
• MapReduce Overview (Traditional way vs. MapReduce way)

RDBMS & Hadoop integration using Sqoop
• Integrating Hadoop into an Existing Enterprise
• Loading Data from an RDBMS into HDFS, Hive Using Sqoop
• Exporting Data to RDBMS from HDFS, Hive Using Sqoop

Data Analysis using Hive
• Apache Hive - Hive Use Cases
• Discuss the Hive data storage principle
• Explain the File formats and Records formats supported by the Hive environment
• Perform operations with data in Hive
• HiveQL: Joining Tables, Dynamic Partitioning, Custom Map/Reduce Scripts
• Hive Script, Hive UDF
• Join datasets using a variety of techniques, including Map-side joins and Sort-Merge-Bucket joins
• Use advanced Hive features like windowing, views and ORC files
• Loading data in Hive - Methods
• Serialization & Deserialization
• Integrating external BI tools with Hadoop Hive
• Use the Hive analytics functions (rank, dense_rank, cume_dist, row_number)
• Use Hive to compute ngrams on Avro-formatted files

Data Analysis using Impala
• Impala & Architecture
• How Impala executes Queries and its importance
3. Spark for Big Data Engineering (1/2)
Spark: Introduction
• Introduction to Apache Spark
• Streaming Data vs. In Memory Data
• MapReduce vs. Spark
• Modes of Spark
• Spark Installation Demo
• Overview of Spark on a cluster
• Spark Standalone Cluster

Spark: Spark in Practice
• Invoking Spark Shell
• Creating the Spark Context
• Loading a File in Shell
• Performing Some Basic Operations on Files in Spark Shell
• Caching Overview
• Distributed Persistence

Scala: Introduction
• Introduction to Scala
• Creating a Scala Doc
• Creating a Scala Project
• The Scala REPL
• Scala Documentation

Scala: Basic Object Oriented Programming
• Classes
• Immutable and Mutable Fields
• Methods
• Default and Named Arguments
• Objects

Scala: Case Objects and Classes
• Companion Objects
• Case Classes and Case Objects
• Apply and Unapply
• Synthetic Methods
• Immutability and Thread Safety

Scala: Collections
• Collections overview
• Sequences and Sets
• Options
• Tuples and Maps
• Higher Order Functions

Scala: Idiomatic Scala
• For expressions
• Pattern Matching
• Handling Options
• Handling Failures/Futures

Spark Overview for Scala Analytics
• Introduction to DataFrames
• Spark SQL
• Introduction to Spark MLlib

Data Science for Scala
• Basic Statistics and Data Types
• Vectors and Labelled Points
• Local and Distributed Matrices
• Summary Statistics, Correlations, and Sampling
• Hypothesis Testing

Preparing Data
• Statistics, Random data and Sampling on Data Frames
• Handling Missing Data and Imputing Values
• Transformers and Estimators
• Data Normalization
• Identifying Outliers

Feature Engineering
• Feature Vectors
• Categorical Features
• Using Explode, User Defined Functions, and Pivot
• Principal Component Analysis (PCA) in Feature Engineering
• RFormulas
3. Spark for Big Data Engineering (2/2)
Fitting & Evaluating Machine Learning Model
• Decision Trees
• Random Forests
• Gradient-Boosting Trees
• Linear Methods
• Evaluation

Pipeline and Grid Search
• Predicting Grant Applications
• Introduction
• Creating Features
• Building a Pipeline
• Cross Validation and Model Tuning

Simplifying Data Pipelines with Apache Kafka
• What Kafka is and why it was created
• The Kafka Architecture
• The main components of Kafka
• Some of the use cases for Kafka

Kafka Command Line
• The contents of Kafka's /bin directory
• How to start and stop Kafka
• How to create new topics
• How to use Kafka command line tools to produce and consume messages

Kafka Producer Java API
• The Kafka producer client
• Some of the Kafka Producer configuration settings and what they do
• How to create a Kafka producer using the Java API and send messages both synchronously and asynchronously

Kafka Consumer Java API
• The Kafka consumer client
• Some of the Kafka Consumer configuration settings and what they do
• How to create a Kafka consumer using the Java API

Kafka Connect and Spark Streaming
• Kafka Connect and how to use a pre-built connector
• Some of the components of Kafka Connect
• How to use Kafka and Spark Streaming together
4. MongoDB for Big Data Engineering
Introduction to NoSQL
• Limitations of RDBMS & Motivation for NoSQL
• NoSQL Design goals & Advantages
• Types of NoSQL databases (Categories) – Cassandra/MongoDB/HBase
• CAP theorem
• How data is stored in a NoSQL data store
• NoSQL database queries and update languages
• Indexing and searching in NoSQL Databases
• Reducing data via reduce function
• Clustering and scaling of NoSQL Database

MongoDB Architecture & Installation
• Overview & Architecture of MongoDB
• In-depth understanding of Database and Collection
• Documents and Key/Values etc.
• Introduction to JSON and BSON Documents
• Installing MongoDB on Linux
• Usage of various MongoDB Tools available with MongoDB package
• Introduction to MongoDB shell
• MongoDB Data types
• CRUD concepts & operations
• Query behaviors in MongoDB

MongoDB Drivers
• API and drivers for MongoDB, HTTP and REST interface
• Install Node.js, dependencies
• Node.js find & display data, Node.js saving and deleting data

MongoDB - Indexing
• Indexing concepts, Index types, Index properties, aggregation

MongoDB Administration, Scale & Security
• MongoDB monitoring, health check, backups & Recovery options, Performance Tuning
• Data Imports & Exports to & from MongoDB
• Introduction to Scalability & Availability
• MongoDB replication, Concepts around sharding, Types of sharding and Managing shards
• Master – Slave Replication
• Security concepts & Securing MongoDB

MongoDB – Application Development
• Creation of MongoDB app
5. Cloud Computing for Big Data Engineering
Introduction to Cloud Computing
• What is Cloud Computing? Why it matters?
• Traditional IT Infrastructure vs. Cloud Infrastructure
• Cloud Companies & their Cloud Services (Compute, storage, networking, apps, cognitive etc.)
• Use Cases of Cloud computing
• Overview of Cloud Segments: IaaS, PaaS, SaaS
• Overview of Cloud Deployment Models
• Overview of Cloud Security
• Introduction to AWS, Microsoft Azure Cloud and IBM Cloud. Similarities and differences between these Public / Private Cloud offerings

Big Data Analytics using Cloud Computing (IBM)
• Introduction to IBM Cloud & Key offerings
• Creating Virtual machine
• Overview of available Big Data products & Analytics Services in Cloud
• Storage services
• Compute Services
• Database Services
• Analytics Services
• Machine Learning Services
• Manage Hadoop Ecosystem & Spark, NoSQL in the Cloud Services
• Creating Data pipelines
• Scaling Data Analysis & Machine Learning Models

Container & Kubernetes Essentials with IBM Cloud
• Install required software
• Provision a cluster

Virtual machines, containers, and Kubernetes
• Virtual machines
• Containers
• Virtual machines versus containers

Relationship between Kubernetes and containers
• Kubernetes orchestration
• How Kubernetes was created
• Kubernetes architecture
• Kubernetes resource model
• Key resources and pods
• Kubernetes application deployment workflow

Set up and deploy your first application
Scale and update apps: Services, replica sets, and health checks
Deploy an application with IBM Watson services

AI & Cloud Computing
AI & Cloud Computing (1/2)
Introduction to Artificial Intelligence (AI)
• Modern era of AI
• Role of Machine learning & Deep Learning in AI
• Hardware for AI (CPU vs. GPU vs. FPGA)
• Software Frameworks for AI & Deep Learning
• Key Industry applications of AI

Introduction to Deep Learning
• What are the Limitations of Machine Learning?
• What is Deep Learning?
• Advantage of Deep Learning over Machine learning
• Reasons to go for Deep Learning
• Real-Life use cases of Deep Learning
• Overview of important Python packages for Deep Learning

Introduction to Cloud Computing
• Introduction to Google Colab
• What is Cloud Computing? Why it matters?
• Traditional IT Infrastructure vs. Cloud Infrastructure
• Cloud Companies (IBM, Microsoft Azure, GCP, AWS) & their Cloud Services
• Use Cases of Cloud computing
• Overview of Cloud Segments: IaaS, PaaS, SaaS
• Overview of Cloud Deployment Models
• Implementation of ML/DL model in Cloud

Artificial Neural Network
• Overview of Neural Networks
• Hidden layers, hidden units
• Illustrate & Training a Perceptron
• Important Parameters of Perceptron
• Limitations of A Single Layer Perceptron
• Illustrate Multi-Layer Perceptron
• Activation function, Optimizers, Loss Functions
• Understand Backpropagation – Using Example

Deep Learning with Keras
• Define Keras
• How to compose Models in Keras
• Functional Composition
• Predefined Neural Network Layers
• What is Batch Normalization
• Saving and Loading a model with Keras
• Using TensorBoard with Keras
• Use-Case Implementation with Keras
• Intuitively building networks with Keras

Deep Learning with TensorFlow
• Hello World with TensorFlow
• Key concepts of TensorFlow
• Implementing various types of models
• Linear/Non-linear models

Convolutional Neural Networks (CNN)
• CNN History
• Understanding CNNs
• CNN Application

Recurrent Neural Networks (RNN)
• Intro to RNN Model
• Long Short-Term memory (LSTM)
• Recursive Neural Tensor Network Theory
• Recurrent Neural Network Model

Unsupervised Learning
• Restricted Boltzmann Machine
• Collaborative Filtering with RBM

Auto Encoders
• Auto Encoders
• Deep Belief Network

Accelerating Deep Learning with GPU
• Hardware Accelerated Deep Learning
• Distributed Deep Learning
• Deep Learning in the Cloud
AI & Cloud Computing (2/2)
Computer Vision Application

Introduction to Computer Vision

OpenCV
• Introduction to OpenCV
• Core Functionalities
• Image processing using OpenCV
• Video processing using OpenCV
• Feature Detection
• Video Analysis

Computer Vision Applications
• Concept of Transfer Learning
• Popular ImageNet models
• Object Classification
• Object Detection
• Object Tracking
• Object Localization
• Object Segmentation

Generative Adversarial Networks

Text Mining & Language Models with Deep Learning

Text Mining
• NLP vs. NLU vs. NLG
• Vectorization using Word Embeddings
• Word2vec and GloVe

Language Models
• Transfer Learning in Text Mining
• Introduction to Popular Language Models
• ULMFiT
• Transformer
• Google’s BERT
• Transformer-XL
• OpenAI’s GPT-2
• ELMo
• Flair
• StanfordNLP

Language Models Application
• Machine Translation
• Text Classification
• Text Segmentation
• Sentiment Analysis

Build Your Own Chatbot
• Introduction to Chatbots
• What are chatbots?
• Chatbots are trending
• How chatbots work
• Working with Intents
• Understanding Intents
• Working with Entities
• Understanding Entities
• Create Entities
• Import and Export Entities
• Defining the Dialog
• Putting it all together
• Building user-friendly chatbots
• Implement the Dialog
• Define Domain-Specific Intents
• Deploying to a WordPress site
• Add a preview and retrieve your credentials
• Deploy your Chatbot
• Watson Assistant in the Private Cloud
Industrial & Functional Sessions (Domain Understanding)
Pre-requisite: Completion of assignments and projects of earlier modules within this course

Introduction to Data Sources for Various Industries

Introduction to Analytics Project Management

Marketing Analytics
• Introduction to Marketing Function
• Marketing Research - Analytics
• Customer Analytics
• Campaign Analytics
• Pricing Analytics
• Marketing Return on Investment (MROI)
• Market Mix Models

Risk Analytics
• Introduction to Risk Function
• Enterprise Risk Function
• Credit Scoring (Application & Behavioral)
• Fraud Analytics

Operation Analytics
• Overview of Operation Analytics
• Applications of Analytics in different functions
• Service Operations
• Manufacturing
• Logistics
• Business Support Functions (HR Analytics)
• Inventory Management

Digital Analytics (Web Analytics)

Social Network Analytics

Banking & Financial Services, Insurance
• Applications of analytics to various functions
• Retail Banking
• Commercial Banking
• Investment Banking
• Insurance

Retail & E-Commerce
• Customer Analytics
• Demand & Supply Chain
• Merchandizing & planning
• Price & Promotion Analysis
• Trust Analytics
• Customer Service
• A/B Testing
• Recommendation Engines

Pharma & Health Care
• Marketing & Sales Analytics
• Provider-Payer-Patient Analytics
• Claims Analytics
• Fraud Analytics
• MROI

Telecom & Network
• Network Analytics
• Subscriber Analytics
• Loyalty Analytics
• Revenue Leakage Analytics
Course completion and career assistance
Course completion & Certification criteria

• You shall be awarded an AnalytixLabs certificate only after the submission and evaluation of mandatory course project work. These projects will be provided as a part of the training.

• There is no pass/fail for these assignments and projects. Our objective is to ensure that trainees get strong hands-on experience so that they are well prepared for job interviews and for performing well at their jobs.

• In case the assignments and projects are not up to the mark, trainees are welcome to take help and support to improve them.

• While a weekly schedule is shared with trainees for regular assignments, candidates get 3 months after course completion to submit their final assignment and projects.

What is included in career assistance?

• After successful course completion, candidates can seek assistance from AnalytixLabs for profile building. A team of seasoned professionals will help you based on your overall education background and work experience. This will be followed by interview preparation along with mock interviews (if required).

• Job referrals are based on the requirements we get from various organizations, HR consultants and a large pool of AnalytixLabs’ ex-students working in various companies.

• No one can truthfully provide a job guarantee, particularly for good quality job profiles in Analytics. However, most of our students do get multiple interview calls and good career options based on the skills they learn during training. For this there will be continuous support from our side for as long as required.
Time and Investment
Training: 240 hours live training + 214 e-learning hours + Practice
Fee: INR 95,000 + 18% GST (list price INR 150,000, ~37% Combo discount)

Timing: 6 hours per weekend live training (Saturday & Sunday 3 hours each) + Practice

Training mode: Fully interactive live online class / Classroom (In Gurgaon, Bangalore and Noida center only)
(In addition to the above, you will also get access to the recordings for future reference and self study)

Components: Learning Management System access for courseware like class recordings, study material, and industry-relevant project work

Certification: Participants will be awarded a Co-branded Global certificate on successful completion of the stipulated
requirements including an evaluation
We provide trainings both in ‘fully interactive live online’ and classroom* mode

• Fully interactive live online class with personal attention
• Access to quality training and 24x7 practice sessions available at the comfort of your place
• Saves commuting time and resources in today’s chaotic world
• Ensures best use of time and resources
• Delivered lectures are recorded and can be replayed by individuals as per their needs
• Studies prove that online education beats the conventional classroom
• One of the strongest global trends in education, both in developing and developed countries

*Classroom only available at Gurgaon, Bangalore and Noida center


Contact Us

Visit us on: http://www.analytixlabs.in/

For course registration, please visit: http://www.analytixlabs.co.in/course-registration/

For more information, please contact us: http://www.analytixlabs.co.in/contact-us/


Or email: info@analytixlabs.co.in
Call us, we would love to speak with you: (+91) 9555219007

Join us on:
Twitter - http://twitter.com/#!/AnalytixLabs
Facebook - http://www.facebook.com/analytixlabs
LinkedIn - http://www.linkedin.com/in/analytixlabs
Blog - http://www.analytixlabs.co.in/category/blog/
Visit Us

Gurgaon Address:
GF 382, Sector 29,
Adjoining IFFCO Chowk Metro Station (Gate 2),
Next to Magicpin Office,
Gurgaon,
Haryana - 122001

Bengaluru Address:
Bldg 41, First floor,
14th Main Road, Near BDA complex,
(Landmark: Max store)
Sector 7, HSR Layout, Bengaluru
Karnataka - 560102

Noida Address:
First Floor, A-78,
A Block, Sector 2,
(Adjacent to Sector 15 metro station)
Noida,
Uttar Pradesh - 201301
