You are on page 1of 48

edureka!

Discover Learning

Data Science Masters Program

Course Curriculum

masters
programme
About Edureka

Edureka is a leading e-learning platform providing live instructor-led interactive online

training. We cater to professionals and students across the globe in categories like Big

Data & Hadoop, Business Analytics, NoSQL Databases, Java & Mobile Technologies,

System Engineering, Project Management and Programming.

We have an easy and affordable learning solution that is accessible to millions of

learners. With our students spread across countries like the US, India, UK, Canada,

Singapore, Australia, Middle East, Brazil and many others, we have built a community of

over 1 million learners across the globe.

About The Course

Data Science Masters Program makes you proficient in tools and systems used by Data

Science Professionals. It includes training on Statistics, Data Science, Python, Apache

Spark & Scala, Tensorflow and Tableau. The curriculum has been determined by

extensive research on 5000+ job descriptions across the globe.

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka!

Index
1. Python Statistics for Data Science Course 03

2. R Statistics for Data Science Course 07

3. Data Science Certification Training 11

4. Python Certification Training for Data Science 18

5. Apache Spark and Scala Certification Training 27

6. AI & Deep Learning with TensorFlow 33

7. Tableau Training & Certification 40

8. Data Science Master Program Capstone Project 47

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 03

Python Statistics for


Data Science Course
Course Curriculum

Module 1 : Introduction to Python and Scripting Concepts

Learning Objectives

After completing this module, you should be able to: Understand Python basic concepts,

Understand scripting concepts and Distinguish Programming language and scripting

language.

Topics

Get an overview of Python Start Python


The companies using Python Discuss Interpreter PATH
Other applications in which Python can Run a Python Script
be used
Discuss Python Scripts on
Explore Python Frameworks and IDEs UNIX/Windows

Concept of Scripting

Hands-on/Demo

Create “Hello world” code

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 04

Module 2 : Introduction to Data Types and Decision Making

Learning Objectives

After completing this module, you should be able to: Define Reserved Keywords and

Command Line Arguments, Understand data types, Execute decision-making statements

and Use Loops.

Topics

Introduction to Identifiers Decision making statements

What are the different variable types? Loops

Different operators

Hands-on/Demo

Data types - string, numbers Variables

Keywords Demonstrating decision making statements

Demonstrating Loops

Module 3 : Deep Dive into Data Types

Learning Objectives

After completing this module, you should be able to: Explain Numbers , Explain Strings,

Tuples, Lists, Dictionaries, and Sets.

Topics

Numbers Lists and related operations

Strings and related operations Dictionaries and related operations

Tuples and related operations Sets and related operations

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 05

Hands-on/Demo

Tuple - properties, related operations, comparison with list

List - properties, related operations

Dictionary - properties, related operations

Set - properties, related operations

Module 4 : Functions, Modules and Exception Handling

Learning Objectives

After completing this module, you should be able to: Take input from user with the help

of Pycharm and perform operations on it, Create and execute functions, Define

Modules (Boto3), Handle the exceptions and Explain Standard Library.

Topics

Python files I/O Functions Python Boto ec2 module

Function Parameters Errors and Exception Handling

Global variables Handling multiple exceptions

Variable scope and Returning Values The standard exception hierarchy

Modules used in python Using Modules

Hands-on/Demo

Functions - syntax, arguments, keyword arguments, return values

Errors and exceptions - types of issues, remediation

Packages and module - modules, import options, sys path

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 06

Module 5 : Network Programming, Multithreading,


and GUI Programming

Learning Objectives

After completing this module, you should be able to: Understand the concept of

Database, Access MySQL DB , Create socket for sending short messages and Create GUI

Topics

Why Python is called Object oriented language? Network programming

Class and Objects Multithreading

MySQL DB access GUI programming

Hands-on/Demo

Network Creation Create GUI

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 07

R Statistics for Data Science Course

Course Curriculum

Module 1 : Understanding the Data

Learning Objectives

At the end of this Module, you should be able to: Understand various data types, Learn

Various variable types, List the uses of Variable types, Explain Population and Sample,

Discuss Sampling techniques and Understand Data representation

Topics

Introduction to Data Types Sensitivity

Numerical parameters to represent data Information Gain

Mean Entropy

Mode Statistical parameters to represent data

Median

Hands-on/Demo

Estimating mean, median and mode using R

Calculating Information Gain and Entropy

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 08

Module 2 : Probability and its Uses

Learning Objectives

At the end of this Module, you should be able to: Understand rules of probability, Learn

about dependent and independent events, Implement conditional, marginal and joint

probability using Bayes Theorem, Discuss probability distribution and Explain Central

Limit Theorem

Topics

Uses of probability Density Concepts

Need of probability Normal Distribution Curve

Bayesian Inference

Hands-on/Demo

Calculating probability using R

Conditional, Joint and Marginal Probability using R

Plotting a Normal distribution curve

Module 3 : Statistical Inference

Learning Objectives

At the end of this Module, you should be able to: Understand the concept of point

estimation using confidence margin, Demonstrate the use of Level of Confidence and

Confidence Margin, Draw meaningful inferences using margin of error and Explore

hypothesis testing and its different levels

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 09

Topics

Point Estimation Hypothesis Testing

Confidence Margin Levels of Hypothesis Testing

Hands-on/Demo

Calculating and generalizing point estimates using R

Estimation of Confidence Intervals and Margin of Error

Module 4 : Testing the Data

Learning Objectives

At the end of this module, you should be able to: Understand Parametric and

Non-Parametric testing, Learn various types of Parametric testing and Explain A/B testing

Topics

Parametric Test Non- Parametric Test

Parametric Test Types A/B testing

Hands-on/Demo

Perform P test and T tests in R

Module 5 : Data Clustering

Learning Objectives

At the end of this module, you should be able to: Understand the concept of Association

and Dependence, Explain Causation and Correlation, Learn the concept of Covariance,

Discuss Simpson’s paradox and Illustrate Clustering Techniques

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 10

Topics
Association and Dependence Simpson’s Paradox

Causation and Correlation Clustering Techniques

Covariance

Hands-on/Demo

Correlation and Covariance in R

Hierarchical clustering in R

K means clustering in R

Module 6 : Regression Modelling

Learning Objectives

At the end of this module, you should be able to: Understand the concept of Linear

Regression, Explain Logistic Regression, Implement WOE, Differentiate between

heteroscedasticity and homoscedasticity and Learn concept of residual analysis

Topics
Residual Analysis
Logistic and Regression Techniques
Heteroscedasticity
Problem of Collinearity
Homoscedasticity
WOE and IV

Hands-on/Demo

Perform Linear and Logistic Regression in R

Analyze the residuals using R

Calculation of WOE values using R

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 11

Data Science Certification Training


Course Curriculum

Module 1 : Introduction to Data Science

Learning Objectives

Get an introduction to Data Science in this module and see how Data Science helps to

analyze large and unstructured data with different tools.

Topics

What is Data Science? Tools of Data Science

What does Data Science involve? Introduction to Big Data and Hadoop

Era of Data Science Introduction to R

Business Intelligence vs Data Science Introduction to Spark

Life cycle of Data Science Introduction to Machine Learning

Module 2 : Statistical Inference

Learning Objectives

In this module, you will learn about different statistical techniques and terminologies

used in data analysis.

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 12

Topics

What is Statistical Inference? Probability

Terminologies of Statistics Normal Distribution

Measures of Centers Binary Distribution

Measures of Spread

Module 3 : Data Extraction, Wrangling and Exploration

Learning Objectives

Discuss the different sources available to extract data, arrange the data in structured

form, analyze the data, and represent the data in a graphical format.

Topics

Data Analysis Pipeline Data Wrangling

What is Data Extraction Exploratory Data Analysis

Types of Data Visualization of Data

Raw and Processed Data

Hands-on/Demo

Loading different types of dataset in R

Arranging the data

Plotting the graphs

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 13

Module 4 : Introduction to Machine Learning

Learning Objectives

Get an introduction to Machine Learning as part of this module. You will discuss the

various categories of Machine Learning and implement Supervised Learning Algorithms.

Topics

What is Machine Learning? Machine Learning Categories

Machine Learning Use-Cases Supervised Learning algorithm: Linear

Machine Learning Process Flow Regression and Logistic Regression

Hands-on/Demo

Implementing Linear Regression model in R

Implementing Logistic Regression model in R

Module 5 : Classification Techniques

Learning Objectives

In this module, you should learn the Supervised Learning Techniques and the implementation of

various techniques, such as Decision Trees, Random Forest Classifier, etc.

Topics

What are classification and its use cases? Confusion Matrix

What is Decision Tree? What is Random Forest?

Algorithm for Decision Tree Induction What is Navies Bayes?

Creating a Perfect Decision Tree Support Vector Machine: Classification

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 14

Hands-on/Demo

Implementing Decision Tree model in R

Implementing Linear Random Forest in R

Implementing Navies Bayes model in R

Implementing Support Vector Machine in R

Module 6 : Unsupervised Learning

Learning Objectives

Learn about Unsupervised Learning and the various types of clustering that can be used

to analyze the data.

Topics

What is Clustering & its use cases What is Canopy Clustering?

What is K-means Clustering? What is Hierarchical Clustering?

What is C-means Clustering?

Hands-on/Demo

Implementing K-means Clustering in R

Implementing C-means Clustering in R

Implementing Hierarchical Clustering in R

Module 7 : Recommender Engines

Learning Objectives
In this module, you should learn about association rules and different types of

Recommender Engines.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 15

Topics

What is Association Rules & its use cases? Item-Based Recommendation

What is Recommendation Engine & it’s Difference: User-Based and

working? Item-Based Recommendation

Types of Recommendations Recommendation use cases

User-Based Recommendation

Hands-on/Demo

Implementing Association Rules in R

Building a Recommendation Engine in R

Module 8 : Text Mining

Learning Objectives

Discuss Unsupervised Machine Learning Techniques and the implementation of different

algorithms, for example, TF-IDF and Cosine Similarity in this Module.

Topics

The concepts of text-mining Quantifying text

Use cases TF-IDF

Text Mining Algorithms Beyond TF-IDF

Hands-on/Demo

Implementing Bag of Words approach in R

Implementing Sentiment Analysis on Twitter Data using R

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 16

Module 9 : Time Series

Learning Objectives

In this module, you should learn about Time Series data, different component of Time

Series data, Time Series modeling - Exponential Smoothing models and ARIMA model for

Time Series Forecasting.

Topics

What is Time Series data? Exponential smoothing models

Time Series variables Identifying different time series scenario


based on which different Exponential
Different components of Time Series data
Smoothing model can be applied
Visualize the data to identify Time Series
Implement respective ETS model for
Components
forecasting
Implement ARIMA model for forecasting

Hands-on/Demo

Visualizing and formatting Time Series data

Plotting decomposed Time Series data plot

Applying ARIMA and ETS model for Time Series Forecasting

Forecasting for given Time period

Module 10 : Deep Learning

Learning Objectives
Get introduced to the concepts of Reinforcement learning and Deep learning in this module.
These concepts are explained with the help of Use cases. You will get to discuss Artificial Neural
Network, the building blocks for Artificial Neural Networks, and few Artificial Neural Network
terminologies.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 17

Topics

Reinforced Learning Understand Artificial Neural Networks

Reinforcement learning Process Flow Building an Artificial Neural Network

Reinforced Learning Use cases How ANN works

Deep Learning Important Terminologies of ANN’s

Biological Neural Networks

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 18

Python Certification Training


for Data Science
Course Curriculum

Module 1 : Introduction to Python

Learning Objectives

You will get a brief idea of what Python is and touch on the basics

Topics

Overview of Python Operands and Expressions

The Companies using Python Conditional Statements

Different Applications where Python is used Loops

Discuss Python Scripts on UNIX/Windows Command Line Arguments

Values, Types, Variables Writing to the screen

Hands-on/Demo

Creating “Hello World” code

Variables

Demonstrating Conditional Statements

Demonstrating Loops

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 19

Module 2 : Sequences and File Operations

Learning Objectives

Learn different types of sequence structures, related operations and their usage. Also

learn diverse ways of opening, reading, and writing to files

Topics

Python files I/O Functions Lists and related operations

Numbers Dictionaries and related operations

Strings and related operations Sets and related operations

Tuples and related operations

Hands-on/Demo

Tuple - properties, related operations, compared with a list

List - properties, related operations

Dictionary - properties, related operations

Set - properties, related operations

Module 3 : Deep Dive – Functions, OOPs, Modules, Errors


and Exceptions

Learning Objectives

In this Module, you will learn how to create generic python scripts, how to address

errors/exceptions in code and finally how to extract/filter content using regex.

Topics

Functions Function Parameters

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 20

Global Variables The Import Statements

Variable Scope and Returning Values Module Search Path

Lambda Functions Package Installation Ways

Object-Oriented Concepts Errors and Exception Handling

Standard Libraries Handling Multiple Exceptions

Modules Used in Python

Hands-on/Demo

Functions - Syntax, Arguments, Keyword Arguments, Return Values

Lambda - Features, Syntax, Options, Compared with the Functions

Sorting - Sequences, Dictionaries, Limitations of Sorting

Errors and Exceptions - Types of Issues, Remediation

Packages and Module - Modules, Import Options, sys Path

Module 4 : Introduction to NumPy, Pandas and Matplotlib

Learning Objectives

This Module helps you get familiar with basics of statistics, different types of measures

and probability distributions, and the supporting libraries in Python that assist in these

operations. Also, you will learn in detail about data visualization.

Topics

NumPy - arrays Pandas - data structures & index

Operations on arrays operations

Indexing slicing and iterating Reading and Writing data from

Reading and writing arrays on files Excel/CSV formats into Pandas

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 21

matplotlib library Types of plots - bar graphs, pie

Grids, axes, plots charts, histograms

Markers, colours, fonts and styling Contour plots

Hands-on/Demo

NumPy library- Creating NumPy array, operations performed on NumPy array

Pandas library- Creating series and dataframes, Importing and exporting data

Matplotlib - Using Scatterplot, histogram, bar graph, pie chart to show

information, Styling of Plot

Module 5 : Data Manipulation

Learning Objectives

Through this Module, you will understand in detail about Data Manipulation

Topics

Basic Functionalities of a data object Types of Joins on data objects

Merging of Data objects Exploring a Dataset

Concatenation of data objects Analysing a dataset

Hands-on/Demo

Pandas Function- Ndim(), axes(), Aggregation

values(), head(), tail(), sum(), std(), Concatenation

iteritems(), iterrows(), itertuples() Merging

GroupBy operations Joining

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 22

Module 6 : Introduction to Machine Learning with Python

Learning Objectives

In this module, you will learn the concept of Machine Learning and its types.

Topics

Python Revision (numpy, Pandas, Machine Learning Process Flow

scikit learn, matplotlib) Machine Learning Categories

What is Machine Learning? Linear regression

Machine Learning Use-Cases Gradient descent

Hands-on/Demo

Linear Regression – Boston Dataset

Module 7 : Supervised Learning - I

Learning Objectives

In this module, you will learn Supervised Learning Techniques and their implementation, for

example, Decision Trees, Random Forest Classifier etc.

Topics
What are Classification and its use cases? Creating a Perfect Decision Tree

What is Decision Tree? Confusion Matrix

Algorithm for Decision Tree Induction What is Random Forest?

Hands-on/Demo

Implementation of Logistic regression Random forest

Decision tree

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 23

Module 8 : Dimensionality Reduction

Learning Objectives

In this module, you will learn about the impact of dimensions within data. You will be
taught to perform factor analysis using PCA and compress dimensions. Also, you will be
developing LDA model.

Topics

Introduction to Dimensionality Factor Analysis

Why Dimensionality Reduction Scaling dimensional model

PCA LDA

Hands-on/Demo
PCA Scaling

Module 9 : Supervised Learning - II

Learning Objectives
In this module, you will learn Supervised Learning Techniques and their implementation, for
example, Decision Trees, Random Forest Classifier etc.

Topics

What is Naïve Bayes? Hyperparameter Optimization

How Naïve Bayes works? Grid Search vs Random Search

Implementing Naïve Bayes Classifier Implementation of Support Vector

What is Support Vector Machine? Machine for Classification

Illustrate how Support Vector Machine works?

Hands-on/Demo
Implementation of Naïve Bayes, SVM

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 24

Module 10 : Unsupervised Learning

Learning Objectives
In this module, you will learn about Unsupervised Learning and the various types of

clustering that can be used to analyze the data.

Topics

What is Clustering & its Use Cases? What is C-means Clustering?

What is K-means Clustering? What is Hierarchical Clustering?

How does K-means algorithm work? How Hierarchical Clustering works?

How to do optimal clustering

Hands-on/Demo

Implementing K-means Clustering

Implementing Hierarchical Clustering

Module 11 : Association Rules Mining and Recommendation


Systems

Learning Objectives

In this module, you will learn Association rules and their extension towards recommendation

engines with Apriori algorithm.

Topics
What are Association Rules? How does Recommendation Engines work?

Association Rule Parameters Collaborative Filtering

Calculating Association Rule Parameters Content-Based Filtering

Recommendation Engines

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 25

Hands-on/Demo

Apriori Algorithm Market Basket Analysis

Module 12 : Reinforcement Learning

Learning Objectives

In this module, you will learn about developing a smart learning algorithm such that the learning

becomes more and more accurate as time passes by. You will be able to define an optimal

solution for an agent based on agent-environment interaction.

Topics
What is Reinforcement Learning Markov Decision Process (MDP)

Why Reinforcement Learning Q values and V values

Elements of Reinforcement Learning Q – Learning

Exploration vs Exploitation dilemma α values

Epsilon Greedy Algorithm

Hands-on/Demo
Hands-On/Demo: Calculating Optimal quantities

Calculating Reward Implementing Q Learning

Discounted Reward Setting up an Optimal Action

Module 13 : Time Series Analysis

Learning Objectives
In this module, you will learn about Time Series Analysis to forecast dependent variables

based on time. You will be taught different models for time series modeling such that

you analyze a real time-dependent data for forecasting.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 26

Topics

What is Time Series Analysis? MA model

Importance of TSA ARMA model

Components of TSA ARIMA model

White Noise Stationarity

AR model ACF & PACF

Hands-on/Demo
Checking Stationarity Plot ACF and PACF

Converting a non-stationary data to Generating the ARIMA plot

stationary TSA Forecasting

Implementing Dickey-Fuller Test

Module 14 : Model Selection and Boosting

Learning Objectives
In this module, you will learn about selecting one model over another. Also, you will learn about
Boosting and its importance in Machine Learning. You will learn on how to convert weaker
algorithms into stronger ones.

Topics

What is Model Selection? How Boosting Algorithms work?

The need for Model Selection Types of Boosting Algorithms

Cross-Validation Adaptive Boosting

What is Boosting?

Hands-on/Demo
Cross-Validation AdaBoost

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 27

Apache Spark and Scala


Certification Training
Course Curriculum

Module 1 : Introduction to Scala for Apache Spark

Learning Objectives

In this module of Spark training, you will understand the basics of Scala that are required

for programming Spark applications. You can learn about the basic constructs of Scala

such as variable types, control structures, collections, and more.

Topics

What is Scala? Variable Types in Scala

Why Scala for Spark? Control Structures in Scala

Scala in other frameworks Foreach loop, Functions and Procedures

Introduction to Scala REPL Collections in Scala- Array

Basic Scala operations ArrayBuffer, Map, Tuples, Lists, and more

Module 2 : OOPS and Functional Programming in Scala

Learning Objectives

In this module, you will learn about object-oriented programming concepts like classes,

functions, constructors, etc. and functional programming techniques in Scala.

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 28

Class in Scala Extending a Class

Getters and Setters Overriding Methods

Custom Getters and Setters Traits as Interfaces and Layered Traits

Properties with only Getters Programming

Auxiliary Constructor and Primary Higher Order Functions

Constructor Anonymous Functions, and more

Singletons

Module 3 : Introduction to Big Data and Hadoop

Learning Objectives

In this module, you will understand Big Data, the limitations of the existing solutions for
Big Data problem, how Hadoop solves the Big Data problem, Hadoop ecosystem
components, Hadoop Architecture, HDFS, Rack Awareness and Replication. You will learn
about the Hadoop Cluster Architecture, important configuration files in a Hadoop
Cluster. You will get an overview of Apache Sqoop and how it is used in importing and
exporting tables from RDBMS to HDFS & vice versa.

Topics

What is Big Data? Hadoop Core Components

Big Data Customer Scenarios Rack Awareness and Block Replication

Limitations and Solutions of Existing Data HDFS Read/Write Mechanism

Analytics Architecture with Uber Use Case YARN and Its Advantage

How Hadoop Solves the Big Data Problem Hadoop Cluster and Its Architecture

What is Hadoop? Hadoop: Different Cluster Modes

Hadoop’s Key Characteristics Data Loading using Sqoop

Hadoop Ecosystem and HDFS

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 29

Module 4 : Apache Spark Framework

Learning Objectives

In this module, you will understand the difference between batch and real-time

processing. You will get an introduction to Apache Spark, its Ecosystem and Spark web

interface. You will also learn how to build and run Spark application.

Topics

Big Data Analytics with Batch & Spark’s Place in Hadoop Ecosystem

Real-Time Processing Spark Components & it’s Architecture

Why Spark is Needed? Running Programs on Scala IDE & Spark Shell

What is Spark? Spark Web UI

How Spark Differs from Its Competitors? Configuring Spark Properties

Spark at eBay

Module 5 : Playing with RDDs

Learning Objectives

In this Module, you will learn how to create generic python scripts, how to address

errors/exceptions in code and finally how to extract/filter content using regex.

Topics

Challenges in Existing Computing What is RDD, It’s Functions,

Methods Transformations & Actions?

Probable Solution & How RDD Solves Data Loading and Saving Through RDDs

the Problem RDD Partitioning & How It Helps Achieve


Key-Value Pair RDDs and Other Pair
Parallelization
RDDs o RDD Lineage

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 30

RDD Persistence WordCount Program Using RDD Concepts

Module 6 : DataFrames and Spark SQL

Learning Objectives

In this module, you will learn about Spark SQL which is used to process structured data
with SQL queries. You will learn about data-frames and datasets in Spark SQL and
perform SQL operations on data-frames. At the end of the module you will also be
working on Stock Market Analysis using Spark SQL.

Topics

Need for Spark SQL Data Frames & Datasets

What is Spark SQL? Interoperating with RDDs

Spark SQL Architecture JSON and Parquet File Formats

SQL Context in Spark SQL Loading Data through Different Sources

Module 7 : Machine Learning using Spark MLlib

Learning Objectives

In this module of Spark Training, you will learn about what is the need for machine
learning, types of ML concepts, clustering and MLlib (i.e. Spark’s machine learning library),
various algorithms supported by MLlib and implement K-Means Clustering.

Topics

What is Machine Learning? Face Detection: USE CASE

Where is Machine Learning Used? Features of Saprk MLlib and MLlib Tools

Different Types of Machine Learning Various ML algorithms supported by

Techniques Spark MLlib

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 31

Understanding MLlib Analysis on US Election Data: K-Means

K-Means Clustering & How It Works Spark MLlib USE CASE

with MLlib

Module 8 : Understanding Apache Kafka and Kafka Cluster

Learning Objectives

In this module, you will acquire in-depth knowledge of Kafka and Kafka Architecture. You
will go through the details of Kafka Cluster and you will also learn how to configure
different types of Kafka Cluster such as Single Node Single Broker, Single Node Multi
Broker, etc.

Topics

Need for Kafka Understanding the Components of

What is Kafka? Kafka Cluster

Core Concepts of Kafka Configuring Kafka Cluster

Kafka Architecture Producer and Consumer

Where is Kafka Used?

Module 9 : Capturing Data with Apache Flume and Integration


with Kafka

Learning Objectives

In this module of spark online training you will get an introduction to Apache Flume and
its components like source, channel, sink, etc. You will also learn about architecture of
Flume and it is integrated with Apache Kafka for event processing.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 32

Topics

Need of Apache Flume Flume Channels

What is Apache Flume Flume Configuration

Basic Flume Architecture Integrating Apache Flume and

Flume Sources Apache Kafka

Flume Sinks

Module 10 : Apache Spark Streaming

Learning Objectives

In this module of Spark Training you will get an opportunity to work on Spark streaming
which is used to build scalable fault-tolerant streaming applications. You will learn about
DStreams and various Transformations performed on it. You will get to know about main
streaming operators, Sliding Window Operators and Stateful Operators.

Topics

Drawbacks in Existing Computing How Uber Uses Streaming Data

Methods Streaming Context & DStreams

Why Streaming is Necessary? Transformations on DStreams

What is Spark Streaming? WordCount Program using Spark

Spark Streaming Features Streaming

Spark Streaming Workflow

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 33

AI & Deep Learning with TensorFlow


Course Curriculum

Module 1 : Introduction to Deep Learning

Learning Objectives

In this module, you’ll get an introduction to Deep Learning and understand how Deep

Learning solves problems which Machine Learning cannot. Understand fundamentals of

Machine Learning and relevant topics of Linear Algebra and Statistics.

Topics

Deep Learning: A revolution in Artificial 3 Reasons to go for Deep Learning

Intelligence Real-Life use cases of Deep Learning

Limitations of Machine Learning Review of Machine Learning: Regression,

What is Deep Learning? Classification, Clustering, Reinforcement

Advantage of Deep Learning over Machine Learning, Underfitting and Overfitting,

learning Optimization

Hands-On

Implementing a Linear Regression model for predicting house prices from Boston dataset

Implementing a Logistic Regression model for classifying Customers based on a

Automobile purchase dataset

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 34

Module 2 : Understanding Neural Networks with TensorFlow

Learning Objectives

In this module, you’ll get an introduction to Neural Networks and understand it’s working

i.e. how it is trained, what are the various parameters considered for its training and the

activation functions that are applied.

Topics

How Deep Learning Works? TensorFlow code-basics

Activation Functions Graph Visualization

Illustrate Perceptron Constants, Placeholders, Variables

Training a Perceptron Creating a Model

Important Parameters of Perceptron Step by Step - Use-Case Implementation

What is TensorFlow?

Hands-On

Building a single perceptron for classification on SONAR dataset

Module 3 : Deep dive into Neural Networks with TensorFlow

Learning Objectives

In this module, you’ll understand backpropagation algorithm which is used for training

Deep Networks. You will know how Deep Learning uses neural network and

backpropagation to solve the problems that Machine Learning cannot.

Topics

Understand limitations of a Single Understand Neural Networks in Detail

Perceptron Illustrate Multi-Layer Perceptron

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 35

Backpropagation – Learning Algorithm MLP Digit-Classifier using TensorFlow

Understand Backpropagation – Using TensorBoard

Neural Network Example

Hands-On

Building a multi-layered perceptron for classification of Hand-written digits

Module 4 : Master Deep Networks

Learning Objectives

In this module, you’ll get started with the TensorFlow framework. You will understand

how it works, its various data types & functionalities. In addition, you will create an image

classification model.

Topics

Why Deep Networks How Backpropagation Works?

Why Deep Networks give better accuracy? Illustrate Forward pass, Backward pass

Use-Case Implementation on SONAR dataset Different variants of Gradient Descent

Understand How Deep Network Works? Types of Deep Networks

Hands-On

Building a multi-layered perceptron for classification on SONAR dataset

Module 5 : Convolutional Neural Networks (CNN)

Learning Objectives

In this module, you’ll understand convolutional neural networks and its applications. You

will learn the working of CNN, and create a CNN model to solve a problem.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 36

Topics

Introduction to CNNs Convolution and Pooling layers in a CNN

CNNs Application Understanding and Visualizing a CNN

Architecture of a CNN

Hands-On

Building a convolutional neural network for image classification. The model should predict

the difference between 10 categories of images.

Module 6: Recurrent Neural Networks (RNN)

Learning Objectives

In this module, you’ll understand Recurrent Neural Networks and its applications. You will

understand the working of RNN, how LSTM are used in RNN, what is Recursive Neural

Tensor Network Theory, and finally you will learn to create a RNN model.

Topics

Introduction to RNN Model Long Short-Term memory (LSTM)

Application use cases of RNN Recursive Neural Tensor Network Theory

Modelling sequences Recurrent Neural Network Model

Training RNNs with Backpropagation

Hands-On

Building a recurrent neural network for SPAM prediction.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 37

Module 7 : Restricted Boltzmann Machine (RBM) and Autoencoders

Learning Objectives

In this module, you’ll understand RBM & Autoencoders along with their applications. You

will understand the working of RBM & Autoencoders, illustrate Collaborative Filtering

using RBM and understand what are Deep Belief Networks.

Topics

Restricted Boltzmann Machine Introduction to Autoencoders

Applications of RBM Autoencoders applications

Collaborative Filtering with RBM Understanding Autoencoders

Hands-On

Building a Autoencoder model for classification of handwritten images extracted from the

MNIST Dataset

Module 8 : Keras API

Learning Objectives

In this module, you’ll understand how to use Keras API for implementing Neural

Networks. The goal is to understand various functions and features that Keras provides

to make the task of neural network implementation easy.

Topics

Define Keras Predefined Neural Network Layers

How to compose Models in Keras What is Batch Normalization

Sequential Composition Saving and Loading a model with Keras

Functional Composition Customizing the Training Process

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 38

Topics

Using TensorBoard with Keras Use-Case Implementation with Keras

Hands-On

Build a model using Keras to do sentiment analysis on twitter data reactions on GOP

debate in Ohio

Module 9 : TFLearn API

Learning Objectives

In this module, you’ll understand how to use TFLearn API for implementing Neural

Networks. The goal is to understand various functions and features that TFLearn

provides to make the task of neural network implementation easy.

Topics

Define TFLearn What is Batch Normalization

Composing Models in TFLearn Saving and Loading a model with TFLearn

Sequential Composition Customizing the Training Process

Functional Composition Using TensorBoard with TFLearn

Predefined Neural Network Layers Use-Case Implementation with TFLearn

Hands-On

Build a recurrent neural network using TFLearn to do image classification on

hand-written digits

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 39

Module 10 : In-Class Project

Learning Objectives

In this module, you should learn how to approach and implement a project end to end.

The instructor will share his industry experience and related insights that will help you

kickstart your career in this domain. In addition, we will be having a QA and doubt

clearing session for you.

Topics

How to approach a project? Industry insights for the Machine Learning

Hands-On project implementation domain

What Industry expects? QA and Doubt Clearing Session

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 40

Tableau Training & Certification


Course Curriculum

Module 1 : I Introduction to Data Visualization

Learning Objectives

Identify the prerequisites, goal, objectives, methodology, material, and agenda for the

course. Discuss the basic of Data Visualization. Get a brief idea about Tableau, establish

connection with the dataset, perform Joins operation on the data set.

Topics

Data Visualization Joins and Union

Introducing Tableau 10.0 Data Blending

Establishing Connection

Hands-On

Establishing connection with the files, Introducing important UI components (ShowMe,

Fit Axes)

Perform Cross Joins between the dataset

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 41

Module 2 : Visual Analytics

Learning Objectives
Manage extracts and metadata (by creating hierarchy and folders). Describe what is Visual

Analytics, why to use it, and it’s various scopes. Explain aggregating and disaggregating data and

how to implement data granularity using marks card on aggregated data. Describe what is

highlighting, with the help of a use-case. Illustrate basic graphs including bar graph, line graph,

pie chart, dual axis graph, and area graph with dual axis.

Topics

Managing Extracts Data Granularity using Marks Card

Managing Metadata Highlighting

Visual Analytics Introduction to basic graphs

Hands-On

Creating Extracts, Hierarchy, Folders

All the features of Marks Card Shelve with use case provided

Power of Highlighting in the visualization using the Use-case

How to create basic graphs in Tableau10.x

Module 3 : Visual Analytics in depth I

Learning Objectives
Perform sorting techniques including quicksort, using measures, using header and legend, and

sorting using pill with the help of a use case. Master yourself into various filtering techniques

such as Parametrized filtering, Quick Filter, Context Filter. Learn about various filtering option

available with the help of use case and different scenarios. Illustrate grouping using

data-window, visual grouping, and Calculated Grouping (Static and Dynamic). Illustrate some

more graphical visualization including Heat Map, Circle Plot, Scatter Plot, and Tree Maps.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 42

Sorting. Grouping

Filtering. Graphical Visualization

Hands-On

Quick Sort, Sorting using measure, Sorting using header and legends, sorting using

pill(use-case).

Filtering Use cases covering different options (General, Wildcard, Conditional).

Interactive Filter, Quick Filter, Context Filter.

Grouping using Data Window, Visual Grouping, Calculated Grouping (Static and Dynamic).

Module 4 : Visual Analytics in depth II

Learning Objectives

Explain the basic concepts of sets followed by Creating sets using Marks Card, computation sets

and combined sets. Describe the concepts of forecasting with the help of Forecasting problem

as a use-case. Discuss the basic concept of clustering in Tableau. Add Trend lines and reference

line to your visualization.Discuss about Parameter in depth using Sets and Filter

Topics

Sets Trend Lines.

Forecasting Reference Lines.

Clustering Parameters

Hands-On

Create sets using marks card, Computation sets, and Combined sets

Forecasting using Precise Range

Methods of clustering

Adding trend line and reference line (along with various options available for them)

Parameter using sets and filter

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 43

Module 5 : Dashboard and Stories

Learning Objectives

Describe the basic concepts of Dashboard and its UI. Build a dashboard by adding

sheets and object into it. Modify the view and layout. Edit your dashboard, how it should

appear on phones or tablets. Create an interactive dashboard using actions (filter,

highlighting, URL). Create stories for your Visualization and Dashboards.

Topics

Introduction to Dashboard. Dashboard Interaction - Using Action.

Creating a Dashboard Layout. Introduction to Story Point.

Designing Dashboard for Devices.

Hands-On

Creating Dashboard and learning its UI component.

Changing the layout of the dashboard.

Using Device Designer to create dashboard for devices.

Create an interactive dashboard using actions (Filter, Highlight, URL).

Creating story with dashboard.

Module 6 : Mapping

Learning Objectives

Map the coordinates on the map, plot geographic data, and use layered view to get the

street view of the area. Edit the ambiguous and unrecognized location plotted on the

map. Customize territory in a polygon map. Connect to the WMS Server, use a WMS

background map and saving it. Add a background image and generate its coordinate and

plot the points.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 44

Topics

Introduction to Maps. Polygon Maps.

Editing Unrecognized Locations. Web Mapping Services.

Custom Geocoding. Background Images.

Hands-On

Plot the coordinate points on the map, plotting the geographic data, Street View using the

layered view.

Editing Unrecognized and ambiguous location

Custom Geocoding.

Creating a custom territory, building a polygon map.

Establishing connection with the WMS Server a WMS background map and saving it.

Adding a background image and generate coordinates and finally plotting points.

Module 7 : Calculation

Learning Objectives

Perform Calculations using various types of functions such as Number, String, Date,

Logical, and Aggregrate. In addition, you will get to know about Quick Table Calculation.

Cover the following LOD expressions – Fixed, Included, and Excluded.

Topics

Introduction to Calculation : Number Introduction to Table Calculation.

Functions, String Functions , Date Introduction to LOD expression : Fixed LOD ,

Functions, Logical Functions, Aggregate Included LOD, Excluded LOD

Functions.

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 45

Hands-On

All Functions (Number, String, Date, Logical, Aggregate)

Table Calculation.

LOD expressions.

Module 8 : LOD Problem Sets & Hands on

Learning Objectives

Tackle complex scenarios by using LOD expressions.

Hands-On

Use Case I - Count Customer by Order.

Use Case II - Profit per Business Day.

Use Case III - Comparative Sales.

Use Case IV - Profit Vs Target

Use Case V - Finding the second order date.

Use Case VI - Cohort Analysis

Module 9 : Charts

Topics

Box and Whisker's Plots Pareto Charts

Gantt Charts Control Charts

Waterfall Charts Funnel Charts

Hands-On

Extensive hands-on on the above topics

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 46

Module 10 : Integrating Tableau with R and Hadoop

Learning Objectives

You will know the basics of Big Data, Hadoop, and R. You will discuss the integration

between Hadoop and R and will integrate R with Tableau. In addition, you will get to

publish your workbook on Tableau Server.

Topics

Introduction to Big Data Calculating measure using R

Introduction to Hadoop Integrating Tableau with R

Introduction to R Integrated Visualization using Tableau

Integration among R and Hadoop

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 47

Data Science Master Program


Capstone Project
Course Curriculum

Auto Insurance Case Study

Learning Objectives

The capstone project will provide you with a business case. You will need to solve this by

applying all the skills you’ve learned in the courses of the master’s program. This

Capstone project will require you to apply the following skills

Data Exploration

Checking Data Size Note the important features

Data Wrangling

Handling Imbalanced Data Rectify Missing Variable

MetaData Creation One Hot Encoding

Statistics on the Data Scaling: Standard Scaler & Min Max Scaler

Identify Missing Variable

Data Exploration

Data Visualization

www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.
edureka! 48

Machine Learning

PCA Linear SVC Classifier

Logistic Regression XG Boost Classifier

Generating F1 Score Metric AdaBoost Classifier

Deep Learning Learning

MLP Classifier MLP Classifier with Cross Validation

Probability Distributions & Regression Modeling


www.edureka.co © 2017 Brain4ce Education Solutions Pvt. Ltd. All rights Reserved.

You might also like