
PG Program in Data Science


(Program Curriculum)

For Prep Sessions + Batch Start Dates:


Please refer to upgrad.com

Note: This curriculum is subject to change based on inputs from IIITB and Industry

COURSE, MODULE NAME, DESCRIPTION, SESSION, SEGMENT

Course: DATA SCIENCE TOOLKIT

Module: Data Analysis in Excel
Description: Taught by one of the most renowned data scientists in the country (S. Anand, CEO, Gramener), this module takes you from a beginner-level Excel user to an almost professional user.

Session: Introduction to Excel
Introduction
Understanding the Excel Interface
Slicing and Dicing Data - Sort and Filter
Report Making I: Basic Formatting
Report Making II: Conditional Formatting
Report Making III: Advanced Formatting
Printing and Page Layout
Passwords and Naming Files
Ungraded Assignment

Session: Data Analysis in Excel - I
Introduction
Delimited Files
Discovering Shortcuts
Introduction to Formulae
Complex Functions
Cell Referencing and Text Functions
Logical Formulae
Anand's Anecdotes
Creating and Formatting Charts
Types of Charts
Anecdotes - II

Session: Data Analysis in Excel - II
Introduction
Creating a Pivot Table
Analysing Data in a Pivot Table
Filtering Data in a Pivot Table
Anand's Anecdotes - Pivot Tables
VLOOKUP - Linking Data from multiple files & tables
Anand's Anecdotes - VLOOKUP
Common Errors in Excel
Anand's Anecdotes
Ungraded Assignment

Module: Visualisation using Tableau
Description: Learn an important and widely used tool for Data Analysts - Tableau.

Session: Data Exploration in Tableau
Introduction
Data Formats and Tableau Interface
Connecting to the Data
Data Preparation in Tableau
Hierarchies and Drill Down

Session: Visualising and Analysing Data in Tableau
Introduction
Bar Charts
Scatter Plots and Pie Charts
Tree Maps
Dual Axes Charts

Session: Visualising and Analysing Data with Tableau - II
Introduction
Histograms
Box Plots
Area Maps
Calculations in Tableau
Dashboard and Stories

Module: Analytics Problem Solving
Description: This module covers concepts of the CRISP-DM framework for business problem solving.

Session: The CRISP-DM Framework - Business and Data Understanding
Introduction
Define the Business Problem - Business Understanding
Owning an IPL Team - Business Understanding
Understanding Raw Data
Preparing Data for Analysis

Session: CRISP-DM Framework - Data Preparation, Modelling, Evaluation and Deployment
The Heart of Data Analysis: Modelling
Model Evaluation and Deployment

Course: STATISTICS AND EXPLORATORY DATA ANALYSIS

Module: Introduction to Python
Description: Sharpen your Data Analysis skills with Python, which is the choice of language for simplicity, readability and quick deployment.

Session: Understanding UpGrad Coding Console
Introduction
Understanding Primary Actions
Understanding Statuses & Important Pointers

Session: Data Structures in Python
Introduction
Getting Started - Installation
Introduction to Jupyter Notebook
The Basics
Lists
Tuples
Dictionaries
Sets

Session: Control Structures and Functions
Introduction
If-Elif-Else
Loops
Comprehensions
Functions
Map, Filter, and Reduce

Module: Python for Data Science
Description: Learn to clean and manipulate the data using Python's powerful data analysis libraries - NumPy and Pandas.

Session: Introduction to NumPy
Introduction
NumPy Basics
Creating NumPy Arrays
Structure and Content of Arrays
Subset, Slice, Index and Iterate through Arrays
Multidimensional Arrays
Computation Times in NumPy and Standard Python Lists

Session: Operations on NumPy Arrays
Introduction
Basic Operations
Operations on Arrays
Basic Linear Algebra Operations

Session: Introduction to Pandas
Introduction
Pandas Basics
Indexing and Selecting Data
Merge and Append
Grouping and Summarizing Dataframes
Lambda function & Pivot tables

Session: Getting and Cleaning Data
Introduction
Reading Delimited and Relational Databases
Reading Data From Websites
Getting Data From APIs
Reading Data From PDF Files
Cleaning Datasets

Module: Inferential Statistics
Description: Let us learn how to use random sample data to describe and make inferences about the population.

Session: Basics of Probability
Introduction: Inferential Statistics
Introduction: Basics of Probability
Random Variables
Probability Distributions - I
Probability Distributions - II
Expected Value - I
Expected Value - II
Practice Questions

Session: Discrete Probability Distributions
Introduction: Discrete Probability Distributions
Probability Without Experiment - I
Probability Without Experiment - II
Binomial Distribution
Binomial Distribution (Examples)
Cumulative Probability
Practice Questions

Session: Continuous Probability Distributions
Introduction: Continuous Probability Distributions
Probability Density Functions - I
Probability Density Functions - II
Normal Distribution
Standard Normal Distribution
Practice Questions

Session: Central Limit Theorem
Introduction: Central Limit Theorem
Samples
Sampling Distributions
Properties of Sampling Distributions
Sampling Distributions
Central Limit Theorem
Practice Questions - Part I
Estimating Mean Using CLT
Confidence Interval - Example
Practice Questions - Part II

Session: Inferential Statistics - Practice Session
Introduction to Module
Basics of Probability
Joint Probability and Conditional Probability
Bayes' Theorem
Assessments I
Standardized Normal Distribution and Z-Score
Assessments II
Introduction to Sampling Methods
Sampling and Estimation
Assessments III

Module: Hypothesis Testing
Description: You'll learn how to test whether your assumptions about population data are true or not using the sample data.

Session: Concepts of Hypothesis Testing - I
Introduction
Understanding Hypothesis Testing
Null and Alternate Hypotheses
Making a Decision
Critical Value Method
Critical Value Method - Examples

Session: Concepts of Hypothesis Testing - II
Introduction
p-value Method
p-value Method - Examples
Types of Errors

Session: Industry Demonstration of Hypothesis Testing
Introduction
T Distribution
Two-Sample Mean Test
Two-Sample Proportion Test
A/B Testing Demonstration
Industry Relevance
Hypothesis Testing in Python

Session: Hypothesis Testing - Additional Resources
Introduction
Z-test
T-Test
Chi-Square Test
P-Value Approach
F-Test

Module: Assignment: Inferential Statistics and Hypothesis Testing
Description: Statistics assignment based on the concepts learnt in Inferential Statistics and Hypothesis Testing.

Session: Assignment
General Guidelines
Problem Statement
Assignment Rubrics
Submission

Module: Exploratory Data Analysis
Description: In this module, we shall learn a set of techniques to display data in such a way that interesting features will become apparent.

Session: Data Sourcing
Introduction to EDA
Introduction
Public and Private Data
Private Data
Public Data
Public Data Exercise

Session: Data Cleaning
Introduction
Fixing Rows and Columns
Missing Values
Standardising Values
Invalid Values
Filtering Data

Session: Univariate Analysis
Introduction
Data Description
Unordered Categorical Variables - Univariate Analysis
Ordered Categorical Variables - Univariate Analysis
Quantitative Variables - Univariate Analysis
Quantitative Variables - Summary Metrics

Session: Segmented Univariate
Introduction
Introduction to Segmented Univariate Analysis
Basis of Segmentation
Quick Way of Segmentation
Comparison of Averages
Comparison of Other Metrics

Session: Bivariate Analysis
Introduction
Bivariate Analysis on Continuous Variables
Business Problems Involving Correlation
Practice Questions
Bivariate Analysis on Categorical Variables

Session: Derived Metrics
Introduction
What are Derived Metrics?
Types of Derived Metrics: Type Driven Metrics
Types of Derived Metrics: Business Driven Metrics
Practice Questions
Types of Derived Metrics: Data Driven Metrics
Module: Data Visualisation in Python
Description: Visualise distributions and summary statistics of data using Python's visualisation libraries matplotlib and seaborn.

Session: Introduction to Data Visualisation
Course Overview
Introduction: Data Visualisation
Visualisations - Some Examples
Visualisations - The World of Imagery
Understanding Basic Chart Types I
Understanding Basic Chart Types II

Session: Basics of Visualisation
Introduction
Data Visualisation Toolkit
Components of a Plot
Sub-Plots
Functionalities of Plots

Session: Plotting Data Distributions
Introduction
Univariate Distributions
Univariate Distributions - Rug Plots
Bivariate Distributions
Bivariate Distributions - Plotting Pairwise Relationships

Session: Plotting Categorical and Time-Series Data
Introduction
Plotting Distributions Across Categories
Plotting Aggregate Values Across Categories
Time Series Data

Module: Assignment: Uber Supply-Demand Gap
Description: An assignment to study, visualise and solve the Uber supply-demand gap problem.

Session: Uber Supply-Demand Gap
Problem Statement
Evaluation Rubric
Submission

Module: EDA Case Study
Description: Use the concepts of EDA to decipher which types of customers default on a loan.

Session: Gramener Case Study
Problem Statement
Evaluation Rubric
Final Submission

Session: Course Wrap - EDA and Statistics
Course Wrap - EDA and Statistics

Module: Additional Resources
Description: Here, you will find all the additional content for the course as and when it is added to this module.

Session: Basics of Probability
Pre-Reads
Optional Questions

Session: Discrete Probability Distributions
Pre-Reads
Optional Questions

Session: Exploratory Data Analysis
Power Law
Recommended Additional Content
Election Data: Case Study



Course: MACHINE LEARNING I

Module: Linear Regression
Description: Regression helps us to determine the strength of the relationship between one dependent variable and a series of other changing variables.

Session: Simple Linear Regression
Introduction
Introduction to Machine Learning
Regression Line
Best Fit Line
Strength of Simple Linear Regression
Simple Linear Regression in Python
Coding Practice - Simple Linear Regression

Session: Multiple Linear Regression
Introduction
Multiple Linear Regression
Modelling in Python - I
Modelling in Python - II
Housing Case Study
Derived Variables
VIF - Variance Inflation Factor
Housing Case Study Predictions
Variable Selection Using RFE
Assumptions of Linear Regression
Feature Selection
Coding Practice - Building a Multiple Linear Regression Model

Session: Industry Relevance of Linear Regression
Introduction
Linear Regression: Revision
Prediction vs Projection
Media Company Case Study
Exploratory Data Analysis
Model Building I
Model Building II
Model Building III
Assessing the Model
Interpreting the Results

Module: Linear Regression Assignment
Description: Build a model to understand the factors that car prices depend on and help a Chinese company enter the US car market.

Session: Assignment - Linear Regression
Problem Statement - Part I
Problem Statement - Part II
Evaluation Rubric
Final Submission

Module: Logistic Regression
Description: In this module, you will study the theory of logistic regression, a machine learning technique for binary classification.

Session: Univariate Logistic Regression
Module Introduction: Logistic Regression
Introduction: Univariate Logistic Regression
Binary Classification
Sigmoid Curve
Finding the Best Fit Sigmoid Curve - I
Finding the Best Fit Sigmoid Curve - II
Odds and Log Odds

Session: Multivariate Logistic Regression - Model Building
Introduction
Making Predictions
Model Building - Coding Exercise
Model Evaluation
Sensitivity and Specificity - I
Sensitivity and Specificity - II
Model Evaluation Metrics - Comprehension
Model Evaluation Metrics - Coding Exercise
Gain and Lift Charts
KS Statistic

Session: Logistic Regression - Industry Applications - Part I
Introduction
Getting familiar with Logistic Regression
Nuances of Logistic Regression - Sample Selection
Nuances of Logistic Regression - Segmentation
Nuances of Logistic Regression - Variable Transformation - I
Nuances of Logistic Regression - Variable Transformation - II
Nuances of Logistic Regression - Variable Transformation - III
Nuances of Logistic Regression - Variable Transformation - IV (Optional)

Session: Logistic Regression: Industry Applications - Part II
Introduction
Commonly Faced Challenges in Implementation of Logistic Regression
Model Evaluation (A Second Look)
Model Validation and Importance of Stability
Tracking of Model Performance Over Time

Module: Unsupervised Learning: Clustering
Description: Here you will learn how to group elements into different clusters when you don't have any pre-defined labels to classify them.

Session: Introduction to Clustering
Introduction
Understanding Clustering
Practical Example of Clustering - Customer Segmentation

Session: K Means Clustering
Introduction
Steps of the Algorithm
K Means Algorithm
K Means as Coordinate Descent
K Means++ Algorithm
Visualising the K Means Algorithm
Practical Consideration in K Means Algorithm
Cluster Tendency

Session: Executing K Means in Python
Introduction
Data Preparation
Making the Clusters
Let's Have Some Fun
Other Behavioural Segmentation Types

Session: Hierarchical Clustering
Introduction
Hierarchical Clustering Algorithm
Interpreting the Dendrogram
Types of Linkages
Cutting the Dendrogram & Analyzing the Clusters
Industry Insights
Let's Have Some Fun

Session: Other Forms of Clustering
Introduction
K-Mode Clustering
K-Mode in Python
K-Prototype in Python
DB Scan Clustering
Practice Question
Gaussian Mixture Model

Module: Unsupervised Learning: Principal Component Analysis
Description: This module will cover the concepts of PCA, which is an unsupervised machine learning technique mainly used in dimensionality reduction. It will also cover practical applications of PCA in Python.

Session: Principal Component Analysis
Introduction
The Why And What of PCA
Building Blocks of PCA
Illustration - Finding Principal Components
Comprehension - Calculating the Principal Components
Singular Value Decomposition
SVD Example - Image Compression
Practice Questions

Session: PCA in Python
Introduction
PCA: Python Implementation
Practical Considerations and Alternatives
Optional Assignment (MNIST Dataset)
Comprehension: PCA, SVD and Eigenvectors

Module: HR Analytics Case Study
Description: Use your skills to predict which employee is going to leave the company in the near future.

Session: HR Analytics Case Study
Problem Statement
Evaluation Rubric
Submission

Module: Support Vector Machine (Optional)
Description: Learn the fundamentals of SVMs and use them to detect spam emails, recognise alphabets and more!

Session: SVM - Maximal Margin Classifier
Introduction
Introduction to SVM
Concept of a Hyperplane in 2D
Practice Questions
Concept of a Hyperplane in 3D
Maximal Margin Classifier

Session: SVM - Soft Margin Classifier
Introduction
The Soft Margin Classifier
The Slack Variable
Comprehension-1: Notion of Slack Variables
Cost of Misclassification
SVM R-Lab

Session: Kernels
Introduction
Introduction to Kernels
Mapping Nonlinear Data to Linear Data
Feature Transformation
The Kernel Trick
R Lab - Kernels
Shiny App - Types of kernels
Choosing a Kernel Function
Letter Recognition Using SVM

Course: MACHINE LEARNING II

Module: Tree Model
Description: Tree models represent the way we make decisions. Learn how decisions are made in this powerful classification algorithm.

Session: Introduction to Decision Trees
Introduction
Introduction to Decision Trees
Interpreting a Decision Tree
Comprehension - Decision Tree Classification in Python
Regression with Decision Trees

Session: Algorithms for Decision Tree Construction
Introduction
Concept of Homogeneity
Gini Index
Entropy and Information Gain
Comprehension - Information Gain
Splitting by R-squared

Session: Truncation and Pruning
Introduction
Advantages and Disadvantages
Tree Truncation
Tree Pruning
Building Decision Trees in Python
Choosing Tree Hyperparameters in Python
Coding Practice Questions
Comprehension - Hyperparameters

Session: Random Forests
Introduction
Ensembles
Comprehension - Ensembles
Creating a Random Forest
Comprehension - OOB (Out-of-Bag) Error
Comprehension - Time Taken to Build a Random Forest
Random Forests Lab
Coding Practice Questions

Module: Boosting
Description: This module will cover the concepts of boosting and different boosting algorithms - AdaBoost, GBM and XGBoost.

Session: Introduction to Boosting and AdaBoost
Introduction
Introduction to Boosting
Weak Learners
AdaBoost Algorithm
AdaBoost Distribution and Parameter Calculation
AdaBoost Lab

Session: Gradient Boosting
Introduction
Understanding Gradient Boosting
Gradient in Gradient Boosting
Gradient Boosting Algorithm
XGBoost
Kaggle Practice Exercise

Module: Time Series (Optional)
Description: In this module, you will learn how to analyse and forecast a series that varies with time.

Session: Intro to Time Series
Time Series vs Regression
Components of Time Series

Session: Working with Stationary Time Series
Understanding Stationarity
Understanding White Noise
ACF & PACF Plots
AR & MA Modelling
ARMA Modelling
Model Evaluation

Session: End-to-end Analysis
Time Series Differencing
Differencing vs Classical Decomposition
Additive vs Multiplicative Model
Time Series Smoothing
Making Time Series Forecast

Module: Neural Networks (Optional)
Description: Inspired by the most sophisticated machine in the world - the human brain, NNs help machines learn.

Session: Structure of Neural Networks
Inspiration from Human Brain
Working of a Neuron
Hyperparameters of Neural Networks
Simplifying Neural Networks
Specifying the Hyperparameters
Activation Function
Building a Sample Network on MNIST Data

Session: Information Flow in Neural Networks
Layers in Neural Networks
Information Flow in Neural Networks
Information Flow - Image Recognition

What does Training a Network Mean?
Training a Neural Network
Complexity of the Cost Function
Updating the Weight & Biases
Updating the Weights & Biases

Stochastic Gradient Descent


Training in Batches
Exploration & Exploitation

Dealing with Sequential Data


Recurrent Neural Networks
Regularisation in Neural Networks

Module: Model Selection
Description: You are preparing for a competitive exam. Should you learn some tricks for it or focus on the fundamentals? Model Selection has the answer.

Session: Principles of Model Selection
Introduction
Introduction to Model Selection
Model and Learning Algorithm
Simplicity, Complexity and Overfitting
Bias-Variance Tradeoff
Comprehension - Bias Variance Tradeoff
Regularization

Session: Model Evaluation and Cross Validation
Introduction
Regularization and Hyperparameters
Model Evaluation
Model Evaluation: Python Demonstration-I
Model Evaluation: Python Demonstration-II
Cross-Validation: Motivation
Cross-Validation: Python Demonstration
Cross-Validation: Hyperparameter Tuning

Module: Model Selection - Practical Considerations
Description: Given a business problem, how do you choose the best algorithm? Learn a few practical tips for doing this here.

Session: Model Selection - Best Practices
Introduction
Understanding the Business Problem
Comprehension - Logistic Regression
Comparing Different Machine Learning Models - I
Comparing Different Machine Learning Models - II
Pros and Cons of Different Machine Learning Models
End-to-End Modelling - I
CART and CHAID Trees
Choosing between Trees and Random Forests - I
Choosing between Trees and Random Forests - II
End-to-End Modelling - II

Module: Advanced Regression
Description: This course takes a more advanced look at linear regression models.

Session: Generalized Linear Regression
Introduction
Generalized Regression
Generalized Regression Framework-1
Generalized Regression Framework-2
Systems of Linear Equations
Generalized Regression Framework-3
Generalized Regression in Python

Session: Regularized Regression
Introduction
Regularized Regression
Ridge and Lasso Regression - I
Ridge and Lasso Regression - II
Ridge and Lasso Regression in Python
Model Selection Criteria-I
Model Selection Criteria-II
Feature Selection
Comprehension - Model Selection Parameters
Comprehension: Features' Subset Selection - Best Subset Selection
Comprehension: Features' Subset Selection - Stepwise Selection
Optional Assignment

Module: Telecom Churn Case Study
Description: Solve the most crucial business problem for a leading telecom operator in India and Southeast Asia - predicting customer churn.

Session: Telecom Churn - ML Group Case Study
Problem Statement
Evaluation Rubrics
Submissions

Course: BIG DATA & SQL

Module: Data Analysis using SQL
Description: Learn the basic and advanced concepts of SQL and add another language to your programming toolkit!

Session: Basics of SQL
An introduction to RDBMS and SQL
Basics of SQL
Data Retrieval with SQL
Compound Functions and Relational Operators
Pattern Matching with Wildcards
Basics of Sorting
Session Summary

Session: Advanced SQL
Order by Clause
Aggregate Functions
Group by Clause
Having Clause
Nested Queries
Inner Join
Multi Join
Outer Join
Summary

Module: Advanced SQL
Description: Learn the advanced concepts of SQL and gain mastery over this programming language.

Session: Database Design
Introduction
Defining Data Warehouse
Structure of Data Warehouse
OLAP vs. OLTP
Star Schema
How to Use a Star Schema - A Demonstration
Data Warehouse Schema - Industry Example

Introduction
Adding and Deleting Columns
Changing Column Name and Data Type
Creating Table from existing table
Updating Table
Changing Constraints (Primary key)
Changing Constraints (Foreign key)
String Manipulation
Date Manipulation

Session: Window Functions
Introduction
Introduction to Windowing Functions
Frames
Named Windows
Window Functions' Restrictions

Session: User Defined Functions and Stored Procedures
Introduction
Introduction to User defined Functions
User defined functions (Application)
Introduction to Stored Procedures
Stored Procedures (Application)

Session: Query Optimisation
Introduction
Optimisation in Select Clause
Optimisation in Where Clause
Optimisation in Group by and Order by
Optimisation in Joins
Optimisation in Window Function

Module: Assignment
Description: Apply the basics of investing and your knowledge of Data Science to determine when to buy and sell a stock.

Session: SQL Assignment - Stock Market Analysis
Problem Introduction
Data Set
Grading Criteria
Submission

Module: Introduction to Big Data
Description: Understand the big data ecosystem and the various types of job roles in the industry.

Session: Understanding Big Data
Course Introduction
Fundamentals of Big Data
Identifying Big Data
Conventional Data Processing Systems and Big Data

Module: Big Data Storage and Processing Framework - Hadoop
Description: Learn the basics of Hadoop and its architecture - a distributed computing platform.

Session: Big Data Storage in Hadoop
Introduction
History of Hadoop
Distributed Computing
Hadoop Terminologies
Master and Slave
Hadoop Distributed File System
Interaction Between Nodes in a Hadoop Cluster
Advantages of Distributed File Systems
Comprehension - Hadoop Distributed File System (HDFS)

Session: Big Data Processing In Hadoop
Introduction
YARN - Yet Another Resource Negotiator
MapReduce
Comprehension - MapReduce
Hadoop Vendors
Hadoop Ecosystem
Comprehension - Yet Another Resource Negotiator (YARN)

Module: Big Data Ingestion and Processing
Description: In big data ingestion and processing, learn to use various tools for getting and processing data.

Session: Introduction to Apache Sqoop
Introduction
Data Ingestion with Apache Sqoop
Advantages and Industry Use Cases of Sqoop
How Sqoop Performs Import
Comprehension: How Sqoop Import Works
Creating an RDS
Migrating Databases to the RDS
Running Sqoop in AWS
Adding a MySQL Connector
Sqoop Commands: Listing Databases and Tables
Sqoop Commands: Import and Import-All-Tables
Sqoop Commands: Job and Eval

Session: Introduction to Apache Hive
Introduction
Introduction to Apache Hive
Key Features of Apache Hive
Use Cases of Apache Hive
The Hive Metastore
Hive Data Models
Creating Tables in Hive
Understanding and Analysing the Data Stored in Hive Tables
Solution - Movies Graded Questions

Session: Hive Data Models - Partitions and Buckets
Introduction
Partitions
Creating and Querying Partitioned Tables
Buckets
Comprehension: Data Models (Graded Assessment)

Session: File Formats in Apache Hive
Introduction
File Formats in Apache Hive
ORC and Compression Algorithms

Session: Advanced Data Analysis in Hive
Introduction
EDA and UDFs in Hive
Advanced Data Analysis using Hive
Basic Text Analysis using Hive
Handling Complex Data Types using Hive

Module: Big Data Processing using Apache Spark
Description: Learn Apache Spark, the newest big data framework with unprecedented performance and ease of use.

Session: Concepts and Fundamentals of Spark
Introduction
Overview of Spark
Spark vs MapReduce
Resilient Distributed Datasets (RDDs)
In-memory Processing
RDD Operations
Programming & Debugging in PySpark

Session: Working with Spark
Introduction: Setting Up
Schema-on-Read vs Schema-on-Write
Comparing Spark With Hive
Analysis with Spark - I: Reading & Summarising Data
Analysis with Spark - II: Plotting Data
Analysis with Spark - III: Filtering & Grouping
Analysis with Spark - IV: Model-building
Practice Analysis: Airlines Data
MLlib - I: An Overview
MLlib - II: Preparation for Model Building
MLlib - III: Building ML models
PySpark: An Alternative Library to PySpark
Solution to PySpark Practice Questions
Hive LLAP

Module: Case Study: Apache Spark
Description: Apply machine learning algorithms to Big Data using Spark.

Session: NYC Parking Tickets: An Exploratory Analysis
NYC Parking Problem Statement
Rubric
Submission

Course: ELECTIVE - NATURAL LANGUAGE PROCESSING

Module: Lexical Processing
Description: Get started with NLP by knowing about all the essential text preprocessing and text cleaning techniques.

Session: Introduction to NLP
NLP: Areas of Application
Understanding Text
Text Encoding
Regular Expressions: Quantifiers - I
Regular Expressions: Quantifiers - II
Comprehension: Regular Expressions
Regular Expressions: Anchors and Wildcard
Regular Expressions: Character Sets
Greedy versus Non-greedy Search
Commonly Used RE Functions
Regular Expressions: Grouping
Regular Expressions: Use Cases

Session: Basic Lexical Processing
Word Frequencies and Stop Words
Tokenisation
Bag-of-Words Representation
Stemming and Lemmatization
Final Bag-of-Words Representation
TF-IDF Representation
Building a Spam Detector - I
Building a Spam Detector - II

Session: Advanced Lexical Processing
Canonicalisation
Phonetic Hashing
Edit Distance
Spell Corrector - I
Spell Corrector - II
Pointwise Mutual Information - I
Pointwise Mutual Information - II

Module: Syntactic Processing
Description: Learn algorithms to parse grammar of sentences - HMMs, CFGs, PCFGs and build a smart flight-booking NLU system using techniques such as NER.

Session: Introduction to Syntactic Processing
The What and Why of Syntactic Processing
Parsing
Parts-of-Speech
Different Approaches to POS Tagging
Lexicon and Rule-based POS Tagging
Stochastic Parsing
The Viterbi Heuristic
Markov Chain and HMM
Explanation Problem
Learning HMM Model Parameters
HMM and the Viterbi Algorithm: Pseudocode
HMM & the Viterbi Algorithm: Python Implementation
Deep Learning Based POS Taggers

Session: Parsing
Why Shallow Parsing is Not Sufficient
Constituency Grammars
Top-Down Parsing
Bottom-Up Parsing
Probabilistic CFG
Chomsky Normal Form
Dependency Parsing

Session: Information Extraction
Understanding the ATIS data
Information Extraction
POS Tagging
Rule-Based Models
Probabilistic Models for Entity Recognition
Naive Bayes Classifier for NER
Decision Tree Classifiers for NER
HMM and IOB labelling

Session: Conditional Random Fields
CRFs - Another Probabilistic Approach
CRF Model Architecture - I
CRF Model Architecture - II
Training a CRF model
Predicting using CRF
Python Implementation of CRF

Module: Syntactic Processing - Assignment
Description: POS tagging is a crucial part of Syntactic Analysis. Build a POS tagger using a CRF classifier and by modifying Viterbi.

Session: Assignment - Syntactic Analysis
Problem Statement
Evaluation Rubric
Final Submission

Module: Semantic Processing
Description: Extract meaning from the text.

Session: Introduction to Semantic Processing
Concepts and Terms
Entity and Entity Types
Arity and Reification
Schema
Semantic Associations
Databases - WordNet and ConceptNet
Word Sense Disambiguation - Naive Bayes
Word Sense Disambiguation - Lesk Algorithm
Lesk Algorithm Implementation

Session: Distributional Semantics
Occurrence Matrix
Co-occurrence Matrix
Word Vectors
Word Embeddings
Latent Semantic Analysis (LSA)
Comprehension - Latent Semantic Analysis
Skipgram Model
Comprehension - Word2Vec
Generate Vectors using LSA
Word2vec in Python - I
Word2vec and GloVe in Python - II
Word2vec and GloVe in Python - III

Session: Topic Modelling
Basics of Topic Modelling with ESA
Introduction to Probabilistic Latent Semantic Analysis (PLSA)
The Output of a Topic Model
Defining a Topic
Matrix Factorisation Based Topic Modelling
Probabilistic Model
Probabilistic Latent Semantic Analysis (PLSA)
Expectation Maximization in PLSA
Comprehension - Multinomial Distribution in Topic Modelling
Latent Dirichlet Allocation (LDA)
LDA - An extension of PLSA
Use LDA to Generate a Corpus
Parameter Estimation using Gibbs Sampling
LDA in Python - I
LDA in Python - II
LDA in Python - III

Session: Social Media Opinion Mining - Semantic Processing Case Study
The Problem Statement
Project Pipeline
Python code - I
Python code - II

Module: Building Chatbots With Rasa
Description: Learn the fundamentals of building chatbots using the open-source chatbot-building framework Rasa.

Session: Building Chatbots With Rasa
Building Chatbots with Rasa
Installation Guide - Rasa
Natural Language Understanding (NLU)
Training the NLU Model
Dialogue-Flow Management
Creating Conversational Stories & Defining Actions
Training the Dialogue Management Model
Interactive Learning
Chatbot Deployment
ML and AI in Business

Session: NLP Course Project - Building a Chatbot
Problem Statement
Evaluation Rubric
Final Submission

Neural Networks - Inspiration from the Human Brain

Introduction to Perceptron

Binary Classification using Perceptron

Perceptrons - Training

Multiclass Classification using Perceptrons

Structure of Neural Networks Working of a Neuron

Inputs and Outputs of a Neural Network I

Inputs and Outputs of a Neural Network II

Assumptions made to Simplify Neural Networks

Parameters and Hyperparameters of Neural Networks

Activation Functions

Flow of Information in Neural Networks -


Between 2 Layers

Information Flow - Image Recognition

Comprehension - Count of Pixels


Feed Forward in Neural Networks
Learning the Dimensions Weight Matrices

Feedforward Algorithm

Vectorized Feedforward Implementation

Understanding Vectorized Feedforward Implementation

What Does Training a Network Mean?

Complexity of the Loss Function

Comprehension - Training a Neural Network

Updating the Weights and Biases - I


In this module, you'll be introduced to
the basics of Neural Networks and Updating the Weights and Biases - II
Introduction to
various concepts related to Deep
Neural Networks Updating the Weights and Biases - III
Neural Network which will be used Backpropagation in Neural Networks
in the future modules. Sigmoid Backpropagation

Updating the Weights and Biases - IV

Updating the Weights and Biases - V

Updating the Weights and Biases - VI

Batch in Backpropagation

Training in Batches

Regularization

Dropouts
Modifications to Neural Networks
Batch Normalization

Introduction to Keras

Loss Function I

Loss Function II

Minibatch Gradient Descent

Gradient Descent I

Gradient Descent II

Gradient Descent III


Hyperparameter Tuning in
Gradient Descent IV
Neural Networks
Momentum based methods I

Momentum based methods II

Momentum based methods III

Dropouts -The Bayesian Approach

Vanishing and Exploding Gradients

Initializations

Neural Networks - Assignment: Implement multiclass classification on the MNIST dataset using a raw neural network model in NumPy.

Introduction to Neural Networks - Assignment

Problem Statement

Evaluation Rubric

Final Submission

A Specialised Architecture for Visual Data

Applications of CNNs
ELECTIVE - DEEP LEARNING AND NEURAL NETWORKS

Understanding the Visual System of Mammals - I

Understanding the Visual System of Mammals - II

Introduction to CNNs

Reading Digital Images

Video Analysis
Introduction to Convolutional Neural Networks

Understanding Convolutions - I

Understanding Convolutions - II

Stride and Padding

Important Formulas

Weights of a CNN

Feature Maps

Pooling

Putting the Components Together

Building CNNs in Keras - MNIST

Comprehension - VGG16 Architecture


Convolutional Neural Networks: Learn how to solve state-of-the-art computer vision problems using CNNs.

Building CNNs with Python and Keras

CIFAR-10 Classification with Python - I

CIFAR-10 Classification with Python - II

CIFAR-10 Classification with Python - III

Overview of CNN Architectures

AlexNet and VGGNet

GoogleNet

Residual Net

Introduction to Transfer Learning


CNN Architectures and Transfer Learning

Use Cases of Transfer Learning

Transfer Learning With Pre-Trained CNNs

Practical Implementation of Transfer Learning

Transfer Learning in Python

An Analysis of Deep Learning Models - I

An Analysis of Deep Learning Models - II

Introduction to Style Transfer

Style Loss and the Gram Matrix

Loss Function
Style Transfer and Object Detection
Style Transfer Notebook

Object Detection - I

Object Detection - II

Examining the Flowers Dataset

Data Preprocessing: Shape, Size and Form

Data Preprocessing: Normalisation

Data Preprocessing: Augmentation

Industry Demo: Using CNNs with Flowers Images

Data Preprocessing: Practice Exercise Solutions

ResNet: Original Architecture and Improvements

Building the Network

Convolutional Neural Networks - Industry Applications: Learn how CNNs are used in industry.

Ablation Experiments

Hyperparameter Tuning

Training and Evaluating the Model

Examining X-ray images

CXR Data Preprocessing - Augmentation


Industry Demo: Using CNNs with X-ray Images

CXR Data Preprocessing - Normalisation

CXR: Network Building

CXR: Final Run

What are Sequences?

What Makes the Network Recurrent

Architecture of an RNN

Feeding Sequences to RNNs


What Makes a Neural Network Recurrent?

Comprehension: RNN Architecture

Types of RNNs - I

Training RNNs

Types of RNNs - II

Vanishing and Exploding Gradients in RNNs

Bidirectional RNNs

Long Short-Term Memory Networks


Recurrent Neural Networks: Learn how to apply recurrent neural networks to sequence problems.

Variants of RNNs

Characteristics of an LSTM Cell

Structure of an LSTM Cell

LSTM Network: Feedforward Equations

GRUs and Other Variants

POS Tagging Using RNN - I

POS Tagging Using RNN - II

POS Tagging Using RNN - III

POS Tagging Using RNN - IV

Building RNNs in Python

POS Tagging Using RNN - V

Generating C Code - I

Generating C Code - II

Generating C Code - III

RNNs in Python

Problem Statement

Two Architectures: 3D Convs and CNN-RNN Stack


Neural Networks Project - Gesture Recognition: In this module, you'll experiment and create a model that identifies gestures with considerable accuracy.

Deep Learning Course Project - Gesture Recognition

Understanding Generators

Starter Code Walkthrough

Evaluation Rubric

Final Submission

What to expect
What to expect from Healthcare Domain Elective

Introduction
Understanding the Healthcare Market

Understanding the Healthcare Domain: With all the necessary DA knowledge, it is time to get into the domain details. Learn about the healthcare landscape in the US.
Introduction to the Healthcare Space
Stakeholders of the Primary Healthcare Ecosystem: Process
Stakeholders of the Primary Healthcare Ecosystem: Drivers and Metrics
Stakeholders of the Secondary Healthcare Ecosystem: Process
Stakeholders of the Secondary Healthcare Ecosystem: Drivers and Metrics
Other Stakeholders of the Healthcare Ecosystem

Introduction
Analytics Related to Patient-Physician Interactions
Clinical Decision Support Systems
Analytics Related to Patient-Hospital Interactions
Provider Analytics: In this module, you will explore the different analytics opportunities that exist in the healthcare provider space.
Provider Analytics
Management of Patient Traffic - I
Management of Patient Traffic - II (Comprehension)
Management of Patient Traffic - III
Hospital Performance Analysis - I
Hospital Performance Analysis - II (Comprehension)
Hospital Performance Analysis - III
Hospital Compare

Introduction
Payers in the US
Types of Health Insurance
Types of Insurance Plans
Benefits
Getting Familiar with the US Payer Market
Analytics Opportunities in Benefits
Coordination of Benefits
ELECTIVE - HEALTHCARE

Provider Management - I
Provider Management - II
Pay for Performance (P4P)
Analytics Opportunities in Provider Management
Payer Analytics: In this module, you will explore the different analytics opportunities that exist in the healthcare payer space.
Introduction
Life Cycle of a Health Insurance Claim
Healthcare Coding
Claims Adjudication
Analytics Opportunities in Claims Management
Analytics to Detect Fraudulent Claims
Claims and Care Management
Care Management
Care Management Framework
Risk Stratification
Evaluating a Care Management Program
Accountable Care Organisations (ACOs)
Analytics Opportunities in Care Management

Assignment - Payer Analytics: Stratify patients according to the cost risk they pose to the healthcare payer.
Assignment - Risk Stratification
Problem Statement
Submission

Introduction
Pharmaceutical Market Overview
Drug Development Life Cycle
Areas of Analytics in Pharma
Drug Development and Sales Analytics
Pharmaceutical-Selling Process
Field Activity
Analytics in Sales
Analytics in the Pharmaceutical Industries: Learn how pharmaceutical companies harness the power of data analytics.
Sales Data
Customer Segmentation

Introduction
Structure of a Marketing Organisation
Multichannel Marketing (MCM) Management
Marketing Analytics
Patient Journey Analytics
Analytics Opportunities in Commercial Operations
Market Forecasting

Course Wrap for Healthcare: Get a brief overview of how everything you have studied in the healthcare domain finds application in the real world.
Course Wrap
Healthcare Course Wrap by Prof. RC
Interview tips by Rohit

Capstone Project: Decipher the CMS hospital star rating system using supervised and unsupervised models.
Capstone - Healthcare
Problem Statement
Mid Submission
Final Submission

Introduction to ecommerce domain
Introducing Analytics in E-Commerce

Introduction
Business of ecommerce
Introduction to E-commerce: Get acquainted with the various applications of data analytics in the e-commerce business.
Data Analytics in ecommerce
Inventory Management
Marketing in ecommerce
Improving User Experience
Fraud Detection
Shipment Delivery
Customer Feedback

Introduction
Understanding Recommendation Systems
Content Based Filtering
Recommendation Systems: Learn about the algorithms that power the recommendation engines of e-commerce sites.
Recommendation Systems
User Based Collaborative Filtering
Item Based Collaborative Filtering
Issues in Recommendation Systems
Recommender System in Python

Assignment - Recommendation Systems: Build a recommendation engine based on the beer preferences of users.
Assignment - Recommendation System
Problem Statement
Submission

Introduction
Understanding Price Markup & Markdown

Price Markup & Markdown
Why are Markdowns done?


ELECTIVE - E-COMMERCE

Why are Markups done?


Effect of Price Markup or Markdown

Price Optimization: Learn how prices are dynamically optimised on an e-commerce platform.
The Four-Force Model
Introduction
The Four Forces of Price Optimisation
Demand elasticity
Competitive benchmark
Internal economics
Category dynamics
Goal of Price Optimisation

Introduction
What is Market Mix Modelling (MMM)?

Factors that Impact Sales
How Does Advertising Impact Revenue?


How Do Pricing & Promotions Impact Revenue?
How Does Product Assortment Impact Revenue?

Introduction
Market Mix Modelling: Learn how to optimise your marketing spend in order to maximise ROI.
Modelling the Advertising Effects - Part I
Modelling the Advertising Effects - Part II (Optional)
Modelling the Advertising Effects - Part III

Modelling the Impact of KPIs
Creating AdStocks


Modelling Different Pricing Effects
Overview of KPIs
Presenting the Results
Other Topics

A/B Testing (Optional): Understand the concept behind A/B testing and learn how to execute an A/B test in Optimizely.
A/B Testing
Introduction
Understanding A/B testing
Steps in A/B testing
Setting up an A/B Test in Optimizely

Course Wrap for E-commerce: Get a brief overview of how everything you have studied in the e-commerce domain finds application in the real world.
Course Wrap
Ecommerce Course Wrap by Ujjyaini

Capstone Project: Model the impact of different marketing levers on the sales figures of ElecKart.
Capstone - E-com
Problem Statement
Mid Submission
Final Submission

What to Expect in BFS Domain Elective
Introducing Analytics in Banking and Financial Services

Introduction to Banking and Financial Services: Learn how banks make money through various banking products, and understand the customer lifecycle.
How Banks Make Money
Introduction
Banking Products - Deposits & Lending
Profitability of Credit Cards
P&L of Banks and Financial Institutions
Introduction
Customer Lifecycle
Customer Lifecycle Customer Lifecycle - Acquisition Analytics
Customer Lifecycle - Engagement Analytics
Customer Lifecycle - Risk Analytics

Acquisition Analytics: Understand the components of acquisition strategies and practise hands-on data analytics exercises for acquiring potential customers.
Introduction to Acquisition Analytics
Introduction
Acquisition - Types of Datasets
Components of Acquisition Strategy
Market Segmentation
Segment Prioritisation
Channel Preferences

Assignment - Acquisition Analytics: Build a response model based on the client, campaign and economic information provided by the Portuguese bank.
Assignment - Acquisition Analytics
Problem Statement
Submission

Introduction
Engagement Analytics Framework
Cross-Selling Strategies
Engagement Strategies
Types of Cross-Selling
ELECTIVE - BFSI

Cross-Selling Opportunities

Customer Lifetime Value (CLV)
Engagement Analytics: Now that you have learnt how to acquire customers, learn how to engage them and prevent their attrition.
Cross-Selling Lab
Introduction
Cross-Selling - Business Objectives
Cross-Selling Analysis

Introduction
Types of Attrition
Retention and Loyalty Management
Attrition - Credit Card
Interpreting a Credit Card Attrition Model

Introduction to Risk Analytics


Types of Risk Analytics Types of Risk Analytics
Types of Credit Risk - Operational and Regulatory

Introduction to Operational Risk Analytics


Operational Risk Analytics Framework - Customer Lifecycle

Operational Risk Analytics - Acquisition
Introduction to Acquisition Risk Analytics


Metrics to Measure Acquisition Risk

Risk Analytics: Learn about the risk associated with customers who default on their loan or credit, and the analytics related to it.
Implementing Acquisition Risk Models
Validating Acquisition Risk Models
Introduction

Operational Risk Analytics - Existing Customer Management
Roll Rate Matrix
Data and Models in ECM
Validation of Behaviour Models

Operational Risk Analytics - Collection & Recovery
Collection and Recovery Management
Validation of Collection and Recovery Models

Regulatory Risk Analytics (Optional)
Introduction
Regulatory Risk Analytics - A Brief Introduction

Course Wrap for BFS: Get a brief overview of how the customer lifecycle that you have studied in the BFS domain finds its application in the real world.
Course Wrap
BFS Course Wrap by Kalpana

Capstone Project: Help CredX identify the ideal applicants for credit cards by building an application scorecard.
Capstone - BFS
Problem Statement
Mid Submission
Final Submission

Introduction to Kaggle

Introduction to Kaggle: An introduction to the world of Kaggle and how it can be used to enhance visibility.
Introduction to Kaggle
Creating an account
Datasets
MINI-CAPSTONE

Kernels
Competitions

Basic Feature Engineering Basic Feature Engineering


Feature Engineering: Build general features for a text analytics model.
Advanced Feature Engineering
Advanced Feature Engineering
Model Building

Mini Capstone: Solve a problem based on one of the competitions held on Kaggle or on an industry dataset as a final test of what you have learned so far.
Problem statement
Problem Statement
Evaluation Rubric
Final Submission
Solution Solution
