
Training Details:

Course Outline:

Data Science Training Using Python and R

# Basic Introduction

Introduction to Python
Intro to Jupyter Notebook & Colab
GitHub Introduction

# Python

What is Python
Identifiers and Keywords in Python
Data Types & Type casting
Basic Operations and Operators in Python
Indentation, Statements and Comment
Data Structures in Python: Arrays, Strings, Lists, Tuples, Sets and Dictionaries
Conditional statements in Python
Loops in Python
Functions in Python
Lambda functions
Classes and OOP concepts
Regular expressions
File handling
Exception handling

# Advanced Python

NumPy
Pandas
Matplotlib, Seaborn and Plotly
Command line Arguments
Handling Date and Time
Image Processing using OpenCV
Advanced Image Processing

# Statistics

Introduction to statistics
Types of statistics: Descriptive and Inferential
Descriptive Statistics
Variables and Types of Variables
Measure of Center and Measure of Spread
Measures of Central Tendency
Measures of Dispersion
Mean, Mode, Median
Range, Standard Deviation, Variance, Quartile, IQR
Covariance and Correlation between data

# Advanced Statistics

Inferential Statistics
Sample vs Population
Hypothesis Testing
Null and Alternative hypotheses
Type I error vs Type II error
Establishing a rejection region and a significance level
What is the p-value and why is it one of the most useful tools for statisticians
Learning about T-test
One-Sample and Two-Sample T-tests
ANOVA: One-Way ANOVA and Two-Way ANOVA
Chi-square Analysis
Parametric and Non-parametric tests

# Probability

Introduction to probability
Bayes Theorem
Bernoulli’s Theorem
Independent & Dependent Events
Conditional Probability
Distribution and its Types
What is Central Limit Theorem
Skewness & Kurtosis
Sampling and different sampling techniques
What is an Outlier and its Significance

# Knowledge of Databases and their Types

Introduction to Database
Types of Databases
 Relational Database
 Object-Oriented Database
 Distributed Database
 NoSQL Database
 Graph Database
 Cloud Database
 Centralized Database
 Operational Database
Components of Database

# SQL

Introduction to Structured Query Language
What is an RDBMS (Relational Database Management System)
Introduction and Types of SQL Operators
Creating Databases and Tables
Explore Entities and Relationships
DDL & DML Statements
Select Statement, Aggregate Functions
Insert Into, Where, Order By, Distinct, Group By, Like, In, Between Operators, Limit, Aliases, And & Or Clauses
Update & Delete Queries
SQL Joins: What are Joins, Inner Join, Left Join, Right Join, Full Join
Multiple Joins: Joining more than two tables

# Advanced SQL

Subqueries
How to write Subqueries in SQL
Views, Functions, and Stored Procedures
Transactions
String transformation and Regex
Date-time manipulation

# Introduction to R programming

What is R?
Installing R and RStudio
RStudio Overview

R packages and scripts


Installing and loading packages
Setting up your working directory
Downloading and importing data
Working with missing data
Extracting a subset of a data frame
Writing R scripts
Adding comments and documentation
Creating reports

Conditional statements
If / else
Boolean logical operators

Iteration
while loops
for loops
repeat loops
Next and Break statements

Data exploration and visualization


Using the ggplot2 package to visualize data

EDA
Exploratory Data Analysis to gather trends and insights from the data
Helping businesses make better business decisions

Data Preparation
Data Manipulation using dplyr (filter, select, arrange, group_by/summarize)
Data Reshaping with tidyr
Data Cleaning (Handling Missing Values)

# Machine learning

Introduction to Machine learning


Types of Machine learning
Application of Machine learning

Linear models
Introduction to linear regression
Mathematics behind linear regression
Least squares method
How to evaluate a regression model
Python and R implementations
Interpretation of Model coefficient

Assumptions of linear regression


Statistical tests in linear regression
Python and R implementations
Interpretation of Model diagnostic plots

Introduction to Multiple linear regression


Feature Selection and its types
Python and R implementations

Regularization algorithms
What is regularization
Lasso regression and its implementation in Python and R
Ridge regression and its implementation in Python and R
ElasticNet regression and its implementation in Python and R

Generalized Linear models


What is a GLM
Introduction to logistic regression
Mathematics behind logistic regression
Cost Function
Odds & Odds ratio
Model Evaluation
Python Implementation and Interpretation

Decision Tree algorithm


Decision Tree Algorithms
Attribute Selection Measures
Entropy & Information Gain
Steps to Estimate Entropy & Information Gain
Issues with Decision Trees
Bias-Variance Trade-off
Decision Tree Applications
Python and R Implementation

Ensemble model
Introduction to ensemble
Bagging and boosting

Bagging model
Introduction to bagging
Random forest
Reason to use Random Forest
Random Forest Types
Random Forest Applications
Python and R Implementation
Hyperparameter Tuning

Stacking
Introduction to stacking
Implementation

Blending
Introduction to blending
Implementation

Support Vector Machines


Types of SVM
Hyperplane in the SVM algorithm
Large Margin Intuition
SVM Implementation in Python and R

Boosting algorithm
Introduction to Boosting
Gradient Boosting
AdaBoost
CatBoost
LightGBM

K-Nearest Neighbors
Introduction to KNN
Mathematics behind KNN
Implementation in Python and R

Naïve Bayes classifier

Introduction to Naïve Bayes
Mathematics behind Naïve Bayes
Curse of Dimensionality
Implementation in Python and R

Dimensionality reduction algorithm


What is dimensionality reduction
Introduction to Principal component analysis
Math behind it
Implementation in Python and R
Exploratory Factor Analysis
Introduction to singular value decomposition
Implementation

Introduction to linear discriminant analysis


Implementation

Segmentation techniques
What is Clustering?
K-Means Clustering
When to use K-Means Clustering?
What is K?
Euclidean Distance
K-Means Clustering Example
Implementation in Python and R

Agglomerative clustering
Implementation in Python and R

Association rule algorithm


Introduction to association rule
Apriori algorithm
Market basket optimization
Implementation

Feature engineering
Feature Encoding
Factor Analysis
Feature Scaling
Feature Selection
Outlier Treatment

# Tableau tutorial

Introduction to Tableau
Introduction
Installing Tableau
Data Preparation

Connecting to various data sources such as text, Excel and databases

Working with Metadata


Introduction
Data types
Rename, Hide, Unhide and Sort Columns
Dealing with NULL values

Data Blending
What are Blends
Steps for Blending
Understand Primary and Secondary Data Sources
Work Across Blended Data Sources
Define Blend Relationships for Blending
Establish a Link
Multiple Links
Blending Limitations

Filters
Introduction to filters
Types of filters

Charts and Graphs


Various types of charts
Bar charts and the stacked bar chart
Line chart
Scatter plot
Histogram
Dual axis chart
Combined-Axis Chart
Funnel Chart
Cross Tabs
Highlight Tables
Maps

Advanced graphs and charts

Box-and-Whisker Plot
Bullet Chart
Bar in Bar Chart
Gantt Chart
Control Chart
Pareto Chart
Waterfall Chart
Funnel Chart
Bump Chart
Step and Jump Lines
Donut Chart
Word Cloud

Level of Detail (LOD)
Introduction
Syntax
Aggregation and replication with LOD expressions
Nested LOD expression

How to create a Dashboard and tell a story

Introduction to Dashboards
Building a Dashboard
How to tell a story with data

# Deep learning
Introduction
Real Life Applications of Deep Learning
Difference between Machine learning and deep learning
Challenges of Deep learning
Architecture of Deep learning projects
Various frameworks in deep learning

Deep learning with TensorFlow


Introduction to TensorFlow
Machine learning algorithms in TensorFlow

Artificial Neural network


Introduction
Backpropagation
Weight and Bias
Activation function
Deep Neural Networks

Convolutional Neural Network


Introduction
Mathematics behind CNN
Famous CNN Architectures
Transfer Learning

Recurrent Neural Network
Introduction
Application of RNN
Problems with RNN

Introduction to Long short-term memory


Implementation

Introduction to GRU
Implementation

Introduction to Seq2seq
Implementation

Introduction to encoder and decoder


Implementation

Introduction to transformer and BERT


Implementation

Introduction to teacher forcing algorithm


Implementation

Restricted Boltzmann Machine


Introduction
Math behind RBM
Applications
Implementation

Autoencoder
Introduction
Architecture explanation
Application of autoencoder
Implementation

Self-Organizing maps
Introduction
SOM explanation and applications
Implementation

Advanced deep learning algorithms


Introduction
Neural style transfer
Hybrid learning
Algorithms in hybrid learning along with their implementations

# Time series analysis and forecasting
Introduction
Stationarity
Checking for stationarity
How to make a dataset stationary

Various time series algorithms with their implementations

Deep learning algorithms for time series forecasting

# Natural language processing

Web scraping
What is Web Scraping?
Application
Components of a Web Scraper
Python Libraries for web scraping
Extract data using Beautiful Soup
Full Table Scraping
Row Scraping
Header Scraping

Natural language processing


Introduction to NLP
What is NLP?
Typical NLP Tasks
Understanding text data
Text preprocessing in Python and R
Extracting Features from Text
Bag-of-Words
TF-IDF
Similarity score
Cosine similarity
Naïve Bayes Classifier

# Cloud platforms

Introduction
Various cloud platforms

Azure Machine learning


Introduction
Set up AML workspace
Loading the data
How to create a machine learning pipeline in Azure
Data processing in Azure ML designer
Regression algorithm
Classification algorithm
Azure ML with the Azure ML SDK
Run experiments and train machine learning models
Use Automated Machine Learning to create an optimal model
Hyperparameter tuning with Azure HyperDrive
Model explainability
Model registration and deployment
Databricks with Azure machine learning

AWS SageMaker

Introduction
Understanding the SageMaker environment
SageMaker in ML development
Setting up SageMaker for a project
Creating an S3 bucket
Project #1: Sales forecasting using Linear Learner
Project #2: Skin lesion classification using deep learning
Project #3: Water demand forecasting
Project #4: Electricity bill prediction

Free
Docker tutorial
Kubernetes tutorial

Number of Participants: 15 max

Course duration: 6 months

More than 70 projects
