
## 9.1 Introduction to IRIS dataset and 2D scatter plot

26 min
9.2 3D scatter plot
06 min
9.3 Pair plots
14 min
9.4 Limitations of Pair Plots
02 min
9.5 Histogram and Introduction to PDF(Probability Density Function)
17 min
9.6 Univariate Analysis using PDF
06 min
9.7 CDF(Cumulative Distribution Function)
15 min
9.8 Mean, Variance and Standard Deviation
17 min
9.9 Median
10 min
9.10 Percentiles and Quantiles
09 min
9.11 IQR(Inter Quartile Range) and MAD(Median Absolute Deviation)
06 min
9.12 Box-plot with Whiskers
09 min
9.13 Violin Plots
04 min
9.14 Summarizing Plots, Univariate, Bivariate and Multivariate analysis
06 min
9.15 Multivariate Probability Density, Contour Plot
09 min
9.16 Exercise: Perform EDA on Haberman dataset
04 min
LINEAR ALGEBRA 0/1
Linear algebra gives you the tools for the other areas of mathematics required
to understand machine learning, and helps you build better intuition for ML algorithms.
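As an illustrative sketch of the core operations covered in topics 10.3 and 10.4 (dot product, angle between two vectors, projection, and unit vector) — variable names here are my own, not from the course:

```python
import numpy as np

# Two 2-D vectors (toy example)
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

dot = a @ b                                                # 3*4 + 4*3 = 24
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))  # cosine of the angle between a and b
proj_a_on_b = (dot / np.linalg.norm(b) ** 2) * b           # vector projection of a onto b
unit_b = b / np.linalg.norm(b)                             # unit vector in the direction of b

print(dot, cos_theta)  # 24.0 0.96
```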

## 10.1 Why learn it?

04 min
10.2 Introduction to Vectors (2-D, 3-D, n-D), Row Vector and Column Vector
14 min
10.3 Dot Product and Angle between 2 Vectors
14 min
10.4 Projection and Unit Vector
05 min
10.5 Equation of a line (2-D), Plane(3-D) and Hyperplane (n-D), Plane Passing
through origin, Normal to a Plane
23 min
10.6 Distance of a point from a Plane/Hyperplane, Half-Spaces
10 min
10.7 Equation of a Circle (2-D), Sphere (3-D) and Hypersphere (n-D)
07 min
10.8 Equation of an Ellipse (2-D), Ellipsoid (3-D) and Hyperellipsoid (n-D)
06 min
10.9 Square, Rectangle
06 min
10.10 Hyper Cube, Hyper Cuboid
03 min
10.11 Revision Questions
30 min
PROBABILITY AND STATISTICS 0/33
11.1 Introduction to Probability and Statistics
17 min
11.2 Population and Sample
07 min
11.3 Gaussian/Normal Distribution and its PDF(Probability Density Function)
27 min
11.4 CDF(Cumulative Distribution function) of Gaussian/Normal distribution
11 min
11.5 Symmetric distribution, Skewness and Kurtosis
24 min
11.6 Standard normal variate (Z) and standardization
05 min
11.7 Kernel density estimation
07 min
11.8 Sampling distribution & Central Limit theorem
19 min
11.9 Q-Q plot:How to test if a random variable is normally distributed or not?
23 min
11.10 How distributions are used?
17 min
11.11 Chebyshev's inequality
20 min
11.12 Discrete and Continuous Uniform distributions
13 min
11.13 How to randomly sample data points (Uniform Distribution)
10 min
11.14 Bernoulli and Binomial Distribution
11 min
11.15 Log Normal Distribution
12 min
11.16 Power law distribution
12 min
11.17 Box-Cox transform
12 min
11.18 Applications of non-Gaussian distributions
26 min
11.19 Co-variance
14 min
11.20 Pearson Correlation Coefficient
13 min
11.21 Spearman Rank Correlation Coefficient
07 min
11.22 Correlation vs Causation
03 min
11.23 How to use correlations?
13 min
11.24 Confidence interval (C.I) Introduction
08 min
11.25 Computing confidence interval given the underlying distribution
11 min
11.26 C.I for mean of a normal random variable
14 min
11.27 Confidence interval using bootstrapping
17 min
11.28 Hypothesis testing methodology, Null-hypothesis, p-value
16 min
11.29 Hypothesis Testing Intuition with coin toss example
27 min
11.30 Resampling and permutation test
15 min
11.31 K-S Test for similarity of two distributions
15 min
11.32 Code Snippet K-S Test
06 min
11.33 Hypothesis testing: another example
18 min
11.34 Resampling and Permutation test: another example
19 min
11.35 How to use hypothesis testing?
23 min
11.36 Proportional Sampling
18 min
11.37 Revision Questions
30 min
INTERVIEW QUESTIONS ON PROBABILITY AND STATISTICS 0/1
30 min
DIMENSIONALITY REDUCTION AND VISUALIZATION 0/0
In machine learning and statistics, dimensionality reduction or dimension reduction
is the process of reducing the number of random variables under consideration, via
obtaining a set of principal variables. It can be divided into feature selection
and feature extraction.
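A minimal feature-extraction sketch of the idea described above (this example is illustrative and assumes only NumPy; it mirrors the PCA pipeline covered later — column standardization, covariance of the data matrix, and eigenvectors):

```python
import numpy as np

# Reduce 3-D data to 2 principal components (toy data, illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # 100 samples, 3 features
X = X - X.mean(axis=0)                  # center each column (mean = 0)
cov = (X.T @ X) / (len(X) - 1)          # covariance matrix of the data matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigen-decomposition (ascending eigenvalues)
top2 = eigvecs[:, ::-1][:, :2]          # eigenvectors of the 2 largest eigenvalues
X_reduced = X @ top2                    # project: dimensionality reduced from 3 to 2

print(X_reduced.shape)  # (100, 2)
```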

## 13.1 What is Dimensionality reduction?

03 min
13.2 Row Vector and Column Vector
05 min
13.3 How to represent a data set?
04 min
13.4 How to represent a dataset as a Matrix.
07 min
13.5 Data Preprocessing: Feature Normalisation
20 min
13.6 Mean of a data matrix
06 min
13.7 Data Preprocessing: Column Standardization
16 min
13.8 Co-variance of a Data Matrix
24 min
13.9 MNIST dataset (784 dimensional)
20 min
13.10 Code to Load MNIST Data Set
12 min
PCA(PRINCIPAL COMPONENT ANALYSIS) 0/0
14.1 Why learn PCA?
04 min
14.2 Geometric intuition of PCA
14 min
14.3 Mathematical objective function of PCA
13 min
14.4 Alternative formulation of PCA: Distance minimization
10 min
14.5 Eigen values and Eigen vectors (PCA): Dimensionality reduction
23 min
14.6 PCA for Dimensionality Reduction and Visualization
10 min
14.7 Visualize MNIST dataset
05 min
14.8 Limitations of PCA
05 min
14.9 PCA Code example
19 min
14.10 PCA for dimensionality reduction (not-visualization)
15 min
T-SNE (T-DISTRIBUTED STOCHASTIC NEIGHBOURHOOD EMBEDDING) 0/1
15.1 What is t-SNE?
07 min
15.2 Neighborhood of a point, Embedding
07 min
15.3 Geometric intuition of t-SNE
09 min
15.4 Crowding Problem
08 min
15.5 How to apply t-SNE and interpret its output
38 min
15.6 t-SNE on MNIST
07 min
15.7 Code example of t-SNE
09 min
15.8 Revision Questions
30 min
INTERVIEW QUESTIONS ON DIMENSIONALITY REDUCTION 0/1
30 min
REAL WORLD PROBLEM: PREDICT RATING GIVEN PRODUCT REVIEWS ON AMAZON 0/17
17.1 Dataset overview: Amazon Fine Food reviews(EDA)
23 min
17.2 Data Cleaning: Deduplication
15 min
17.3 Why convert text to a vector?
14 min
17.4 Bag of Words (BoW)
18 min
17.5 Text Preprocessing: Stemming, Stop-word removal, Tokenization, Lemmatization.
15 min
17.6 uni-gram, bi-gram, n-grams.
09 min
17.7 tf-idf (term frequency- inverse document frequency)
22 min
14 min
17.9 Word2Vec.
16 min
17.10 Avg-Word2Vec, tf-idf weighted Word2Vec
09 min
17.11 Bag of Words( Code Sample)
19 min
17.12 Text Preprocessing( Code Sample)
11 min
17.13 Bi-Grams and n-grams (Code Sample)
05 min
17.14 TF-IDF (Code Sample)
06 min
17.15 Word2Vec (Code Sample)
12 min
17.16 Avg-Word2Vec and TFIDF-Word2Vec (Code Sample)
02 min
17.17 Assignment-2: Apply t-SNE
30 min
CLASSIFICATION AND REGRESSION MODELS: K-NEAREST NEIGHBORS 0/32
18.1 How 'Classification' works?
10 min
18.2 Data matrix notation
07 min
18.3 Classification vs Regression (examples)
06 min
18.4 K-Nearest Neighbours Geometric intuition with a toy example
11 min
18.5 Failure cases of KNN
07 min
18.6 Distance measures: Euclidean(L2) , Manhattan(L1), Minkowski, Hamming
20 min
18.7 Cosine Distance & Cosine Similarity
19 min
18.8 How to measure the effectiveness of k-NN?
16 min
18.9 Test/Evaluation time and space complexity
12 min
18.10 KNN Limitations
09 min
18.11 Decision surface for K-NN as K changes
23 min
18.12 Overfitting and Underfitting
12 min
18.13 Need for Cross validation
22 min
18.14 K-fold cross validation
17 min
18.15 Visualizing train, validation and test datasets
13 min
18.16 How to determine overfitting and underfitting?
19 min
18.17 Time based splitting
19 min
18.18 k-NN for regression
05 min
18.19 Weighted k-NN
08 min
18.20 Voronoi diagram
04 min
18.21 Binary search tree
16 min
18.22 How to build a kd-tree
17 min
18.23 Find nearest neighbours using kd-tree
13 min
18.24 Limitations of Kd tree
09 min
18.25 Extensions
03 min
18.26 Hashing vs LSH
10 min
18.27 LSH for cosine similarity
40 min
18.28 LSH for euclidean distance
13 min
18.29 Probabilistic class label
08 min
18.30 Code Sample:Decision boundary .
23 min
18.31 Code Sample:Cross Validation
13 min
18.32 Revision Questions
30 min
INTERVIEW QUESTIONS ON K-NN(K NEAREST NEIGHBOUR) 0/1
30 min
CLASSIFICATION ALGORITHMS IN VARIOUS SITUATIONS 0/21
20.1 Introduction
05 min
20.2 Imbalanced vs balanced dataset
23 min
20.3 Multi-class classification
12 min
20.4 k-NN, given a distance or similarity matrix
09 min
20.5 Train and test set differences
22 min
20.6 Impact of outliers
07 min
20.7 Local outlier Factor (Simple solution :Mean distance to Knn)
13 min
20.8 k distance
04 min
20.9 Reachability-Distance(A,B)
08 min
20.10 Local reachability-density(A)
09 min
20.11 Local outlier Factor(A)
21 min
20.12 Impact of Scale & Column standardization
12 min
20.13 Interpretability
12 min
20.14 Feature Importance and Forward Feature selection
22 min
20.15 Handling categorical and numerical features
24 min
20.16 Handling missing values by imputation
21 min
20.17 curse of dimensionality
27 min
24 min
20.19 Intuitive understanding of bias-variance.
06 min
20.20 Revision Questions
30 min
20.21 Best and worst case of algorithm
06 min
PERFORMANCE MEASUREMENT OF MODELS 0/10
21.1 Accuracy
15 min
21.2 Confusion matrix, TPR, FPR, FNR, TNR
25 min
21.3 Precision and recall, F1-score
10 min
21.4 Receiver Operating Characteristic Curve (ROC) curve and AUC
19 min
21.5 Log-loss
12 min
21.6 R-Squared/Coefficient of determination
14 min
05 min
21.8 Distribution of errors
07 min
21.9 Assignment-3: Apply k-Nearest Neighbor
05 min
21.10 Revision Questions
30 min
INTERVIEW QUESTIONS ON PERFORMANCE MEASUREMENT MODELS 0/1
30 min
NAIVE BAYES 0/22
23.1 Conditional probability
13 min
23.2 Independent vs Mutually exclusive events
06 min
23.3 Bayes Theorem with examples
18 min
23.4 Exercise problems on Bayes Theorem
30 min
23.5 Naive Bayes algorithm
26 min
23.6 Toy example: Train and test stages
26 min
23.7 Naive Bayes on Text data
16 min
24 min
23.9 Log-probabilities for numerical stability
11 min
14 min
23.11 Feature importance and interpretability
10 min
23.12 Imbalanced data
14 min
23.13 Outliers
06 min
23.14 Missing values
03 min
23.15 Handling Numerical features (Gaussian NB)
13 min
23.16 Multiclass classification
02 min
23.17 Similarity or Distance matrix
03 min
23.18 Large dimensionality
02 min
23.19 Best and worst cases
08 min
23.20 Code example
07 min
23.21 Assignment-4: Apply Naive Bayes
06 min
23.22 Revision Questions
30 min
LOGISTIC REGRESSION 0/18
24.1 Geometric intuition of Logistic Regression
31 min
24.2 Sigmoid function: Squashing
37 min
24.3 Mathematical formulation of Objective function
24 min
24.4 Weight vector
11 min
24.5 L2 Regularization: Overfitting and Underfitting
26 min
24.6 L1 regularization and sparsity
11 min
24.7 Probabilistic Interpretation: Gaussian Naive Bayes
19 min
24.8 Loss minimization interpretation
24 min
24.9 hyperparameters and random search
16 min
24.10 Column Standardization
05 min
24.11 Feature importance and Model interpretability
14 min
24.12 Collinearity of features
14 min
24.13 Test/Run time space and time complexity
10 min
24.14 Real world cases
11 min
24.15 Non-linearly separable data & feature engineering
28 min
24.16 Code sample: Logistic regression, GridSearchCV, RandomSearchCV
23 min
24.17 Assignment-5: Apply Logistic Regression
06 min
24.18 Extensions to Generalized linear models
09 min
LINEAR REGRESSION 0/4
25.1 Geometric intuition of Linear Regression
13 min
25.2 Mathematical formulation
14 min
25.3 Real world Cases
08 min
25.4 Code sample for Linear Regression
13 min
SOLVING OPTIMIZATION PROBLEMS 0/13
26.1 Differentiation
29 min
26.2 Online differentiation tools
08 min
26.3 Maxima and Minima
12 min
10 min
19 min
26.6 Learning rate
08 min
26.7 Gradient descent for linear regression
08 min
26.8 SGD algorithm
09 min
26.9 Constrained Optimization & PCA
14 min
26.10 Logistic regression formulation revisited
06 min
26.11 Why L1 regularization creates sparsity?
17 min
26.12 Assignment 6: Implement SGD for linear regression
06 min
26.13 Revision questions
30 min
INTERVIEW QUESTIONS ON LOGISTIC REGRESSION AND LINEAR REGRESSION 0/1
30 min
SUPPORT VECTOR MACHINES (SVM) 0/16
28.1 Geometric Intuition
20 min
28.2 Mathematical derivation
32 min
28.3 Why we take values +1 and -1 for Support vector planes
09 min
28.4 Loss function (Hinge Loss) based interpretation
18 min
28.5 Dual form of SVM formulation
16 min
28.6 kernel trick
10 min
28.7 Polynomial Kernel
11 min
28.8 RBF-Kernel
21 min
28.9 Domain specific Kernels
06 min
28.10 Train and run time complexities
08 min
28.11 nu-SVM: control errors and support vectors
06 min
28.12 SVM Regression
08 min
28.13 Cases
09 min
28.14 Code Sample
14 min
28.15 Assignment-7: Apply SVM
04 min
28.16 Revision Questions
30 min
INTERVIEW QUESTIONS ON SUPPORT VECTOR MACHINE 0/1
30 min
DECISION TREES 0/16
30.1 Geometric Intuition of decision tree: Axis parallel hyperplanes
17 min
30.2 Sample Decision tree
08 min
30.3 Building a decision Tree:Entropy
19 min
30.4 Building a decision Tree:Information Gain
10 min
30.5 Building a decision Tree: Gini Impurity
07 min
30.6 Building a decision Tree: Constructing a DT
21 min
30.7 Building a decision Tree: Splitting numerical features
08 min
30.8 Feature standardization
04 min
30.9 Building a decision Tree:Categorical features with many possible values
07 min
30.10 Overfitting and Underfitting
08 min
30.11 Train and Run time complexity
07 min
30.12 Regression using Decision Trees
09 min
30.13 Cases
12 min
30.14 Code Samples
09 min
30.15 Assignment-8: Apply Decision Trees
03 min
30.16 Revision Questions
30 min
INTERVIEW QUESTIONS ON DECISION TREES 0/1
30 min
ENSEMBLE MODELS 0/20
32.1 What are ensembles?
06 min
32.2 Bootstrapped Aggregation (Bagging) Intuition
17 min
32.3 Random Forest and their construction
15 min
07 min
32.5 Train and run time complexity
09 min
32.6 Bagging:Code Sample
04 min
32.7 Extremely randomized trees
08 min
32.8 Random Forest :Cases
06 min
32.9 Boosting Intuition
17 min
32.10 Residuals, Loss functions and gradients
13 min
10 min
32.12 Regularization by Shrinkage
08 min
32.13 Train and Run time complexity
06 min
32.14 XGBoost: Boosting + Randomization
14 min
07 min
32.16 Stacking models
22 min
15 min
32.18 Kaggle competitions vs Real world
09 min
32.19 Assignment-9: Apply Random Forests & GBDT
04 min
32.20 Revision Questions
30 min
FEATURIZATION AND FEATURE ENGINEERING 0/18
33.1 Introduction
17 min
33.2 Moving window for Time Series Data
25 min
33.3 Fourier decomposition
22 min
33.4 Deep learning features: LSTM
08 min
33.5 Image histogram
23 min
33.6 Keypoints: SIFT.
10 min
33.7 Deep learning features: CNN
04 min
33.8 Relational data
10 min
33.9 Graph data
12 min
33.10 Indicator variables
07 min
33.11 Feature binning
14 min
33.12 Interaction variables
08 min
33.13 Mathematical transforms
04 min
33.14 Model specific featurizations
09 min
33.15 Feature orthogonality
11 min
33.16 Domain specific featurizations
04 min
33.17 Feature slicing
10 min
33.18 Kaggle Winners solutions
07 min
MISCELLANEOUS TOPICS 0/12
34.1 Calibration of Models:Need for calibration
08 min
34.2 Productionization and deployment of Machine Learning Models
30 min
34.3 Calibration Plots.
17 min
34.4 Platt's Calibration/Scaling.
08 min
34.5 Isotonic Regression
11 min
34.6 Code Samples
04 min
34.7 Modeling in the presence of outliers: RANSAC
13 min
34.8 Productionizing models
17 min
34.9 Retraining models periodically.
08 min
34.10 A/B testing.
22 min
34.11 Data Science Life cycle
17 min
34.12 VC dimension
22 min
CASE STUDY 1: QUORA QUESTION PAIR SIMILARITY PROBLEM 0/16
35.1 Business/Real world problem : Problem definition
06 min
05 min
35.3 Mapping to an ML problem : Data overview
05 min
35.4 Mapping to an ML problem : ML problem and performance metric.
04 min
35.5 Mapping to an ML problem : Train-test split
05 min
35.6 EDA: Basic Statistics.
07 min
35.7 EDA: Basic Feature Extraction
06 min
35.8 EDA: Text Preprocessing
10 min
31 min
35.10 EDA: Feature analysis.
09 min
35.11 EDA: Data Visualization: T-SNE.
03 min
35.12 EDA: TF-IDF weighted Word2Vec featurization.
06 min
06 min
35.14 ML Models: Random Model
07 min
35.15 ML Models : Logistic Regression and Linear SVM
11 min
35.16 ML Models : XGBoost
06 min
35.17 Assignments
04 min
CASE STUDY 2: PERSONALIZED CANCER DIAGNOSIS 0/21
36.1 Business/Real world problem : Overview
13 min
11 min
36.3 ML problem formulation :Data
05 min
36.4 ML problem formulation: Mapping real world to ML problem.
19 min
36.5 ML problem formulation :Train, CV and Test data construction
04 min
36.6 Exploratory Data Analysis:Reading data & preprocessing
07 min
36.7 Exploratory Data Analysis:Distribution of Class-labels
07 min
36.8 Exploratory Data Analysis: 'Random' Model
19 min
36.9 Univariate Analysis:Gene feature
34 min
36.10 Univariate Analysis:Variation Feature
19 min
36.11 Univariate Analysis:Text feature
15 min
36.12 Machine Learning Models:Data preparation
08 min
36.13 Baseline Model: Naive Bayes
23 min
36.14 K-Nearest Neighbors Classification
09 min
36.15 Logistic Regression with class balancing
10 min
36.16 Logistic Regression without class balancing
04 min
36.17 Linear-SVM.
06 min
36.18 Random-Forest with one-hot encoded features
07 min
36.19 Random-Forest with response-coded features
06 min
36.20 Stacking Classifier
08 min
36.21 Majority Voting classifier
05 min
36.22 Assignments.
05 min
CASE STUDY 3: FACEBOOK FRIEND RECOMMENDATION USING GRAPH MINING 0/19
37.1 Problem definition.
06 min
37.2 Overview of Graphs: node/vertex, edge/link, directed-edge, path.
11 min
37.3 Data format & Limitations.
09 min
37.4 Mapping to a supervised classification problem.
09 min
07 min
37.6 EDA:Basic Stats
01 hour 01 min
37.7 EDA:Follower and following stats.
12 min
16 min
37.9 EDA:Train and test split.
39 min
37.10 Feature engineering on Graphs:Jaccard & Cosine Similarities
15 min
37.11 PageRank
14 min
37.12 Shortest Path
04 min
37.13 Connected-components
12 min
12 min
37.15 Katz Centrality
06 min
37.16 HITS Score
10 min
37.17 SVD
11 min
37.18 Weight features
06 min
37.19 Modeling
10 min
CASE STUDY 4: TAXI DEMAND PREDICTION IN NEW YORK CITY 0/28
09 min
38.2 Objectives and Constraints
11 min
38.3 Mapping to ML problem :Data
08 min
38.4 Mapping to ML problem :dask dataframes
11 min
38.5 Mapping to ML problem :Fields/Features.
06 min
38.6 Mapping to ML problem :Time series forecasting/Regression
08 min
38.7 Mapping to ML problem :Performance metrics
06 min
38.8 Data Cleaning :Latitude and Longitude data
04 min
38.9 Data Cleaning :Trip Duration.
07 min
38.10 Data Cleaning :Speed.
05 min
38.11 Data Cleaning :Distance.
02 min
38.12 Data Cleaning :Fare
06 min
38.13 Data Cleaning :Remove all outliers/erroneous points
03 min
38.14 Data Preparation:Clustering/Segmentation
19 min
38.15 Data Preparation:Time binning
05 min
38.16 Data Preparation:Smoothing time-series data.
05 min
38.17 Data Preparation:Smoothing time-series data cont..
02 min
38.18 Data Preparation: Time series and Fourier transforms.
13 min
38.19 Ratios and previous-time-bin values
09 min
38.20 Simple moving average
08 min
38.21 Weighted Moving average.
05 min
38.22 Exponential weighted moving average
06 min
38.23 Results.
04 min
38.24 Regression models :Train-Test split & Features
08 min
38.25 Linear regression.
03 min
38.26 Random Forest regression
04 min
38.27 Xgboost Regression
02 min
38.28 Model comparison
06 min
38.29 Assignment.
06 min
CASE STUDY 5: STACKOVERFLOW TAG PREDICTOR 0/17
10 min
05 min
39.3 Mapping to an ML problem: Data overview
04 min
39.4 Mapping to an ML problem:ML problem formulation.
05 min
39.5 Mapping to an ML problem:Performance metrics.
21 min
39.6 Hamming loss
07 min
13 min
39.8 EDA:Analysis of tags
11 min
39.9 EDA:Data Preprocessing
11 min
39.10 Data Modeling : Multi label Classification
18 min
39.11 Data preparation.
08 min
39.12 Train-Test Split
02 min
39.13 Featurization
06 min
39.14 Logistic regression: One VS Rest
07 min
39.15 Sampling data and tags+Weighted models.
04 min
39.16 Logistic regression revisited
04 min
39.17 Why not use advanced techniques
03 min
39.18 Assignments.
05 min
CASE STUDY 6: MICROSOFT MALWARE DETECTION 0/20
40.1 Business/real world problem :Problem definition
06 min
40.2 Business/real world problem :Objectives and constraints
07 min
40.3 Machine Learning problem mapping :Data overview.
13 min
40.4 Machine Learning problem mapping :ML problem
12 min
40.5 Machine Learning problem mapping :Train and test splitting
04 min
40.6 Exploratory Data Analysis :Class distribution.
03 min
40.7 Exploratory Data Analysis :Feature extraction from byte files
08 min
40.8 Exploratory Data Analysis :Multivariate analysis of features from byte files
03 min
40.9 Exploratory Data Analysis :Train-Test class distribution
03 min
40.10 ML models using byte files only: Random Model
11 min
40.11 k-NN
07 min
40.12 Logistic regression
05 min
40.13 Random Forest and Xgboost
07 min
40.14 ASM Files :Feature extraction & Multiprocessing.
11 min
40.15 File-size feature
02 min
40.16 Univariate analysis
03 min
40.17 t-SNE analysis.
02 min
40.18 ML models on ASM file features
07 min
40.19 Models on all features :t-SNE
02 min
40.20 Models on all features :RandomForest and Xgboost
04 min
40.21 Assignments.
04 min
UNSUPERVISED LEARNING/CLUSTERING 0/14
41.1 What is Clustering?
10 min
41.2 Unsupervised learning
04 min
41.3 Applications
16 min
41.4 Metrics for Clustering
13 min
41.5 K-Means: Geometric intuition, Centroids
08 min
41.6 K-Means: Mathematical formulation: Objective function
11 min
41.7 K-Means Algorithm.
11 min
41.8 How to initialize: K-Means++
24 min
41.9 Failure cases/Limitations
11 min
41.10 K-Medoids
19 min
41.11 Determining the right K
05 min
41.12 Code Samples
07 min
41.13 Time and space complexity
04 min
41.14 Assignment-10: Apply K-means, Agglomerative, DBSCAN clustering algorithms
05 min
HIERARCHICAL CLUSTERING TECHNIQUE 0/7
42.1 Agglomerative & Divisive, Dendrograms
13 min
42.2 Agglomerative Clustering
09 min
42.3 Proximity methods: Advantages and Limitations.
24 min
42.4 Time and Space Complexity
04 min
42.5 Limitations of Hierarchical Clustering
05 min
42.6 Code sample
03 min
42.7 Assignment-10: Apply K-means, Agglomerative, DBSCAN clustering algorithms
03 min
DBSCAN (DENSITY BASED CLUSTERING) TECHNIQUE 0/11
43.1 Density based clustering
05 min
43.2 MinPts and Eps: Density
06 min
43.3 Core, Border and Noise points
07 min
43.4 Density edge and Density connected points.
06 min
43.5 DBSCAN Algorithm
11 min
43.6 Hyper Parameters: MinPts and Eps
10 min
43.7 Advantages and Limitations of DBSCAN
10 min
43.8 Time and Space Complexity
03 min
43.9 Code samples.
03 min
43.10 Assignment-10: Apply K-means, Agglomerative, DBSCAN clustering algorithms
03 min
43.11 Revision Questions
30 min
RECOMMENDER SYSTEMS AND MATRIX FACTORIZATION 0/16
44.1 Problem formulation: Movie reviews
23 min
44.2 Content based vs Collaborative Filtering
11 min
44.3 Similarity based Algorithms
16 min
44.4 Matrix Factorization: PCA, SVD
23 min
44.5 Matrix Factorization: NMF
03 min
44.6 Matrix Factorization for Collaborative filtering
23 min
44.7 Matrix Factorization for feature engineering
09 min
44.8 Clustering as MF
21 min
44.9 Hyperparameter tuning
10 min
44.10 Matrix Factorization for recommender systems: Netflix Prize Solution
30 min
44.11 Cold Start problem
06 min
44.12 Word vectors as MF
20 min
44.13 Eigen-Faces
15 min
44.14 Code example.
11 min
44.15 Assignment-11: Apply Truncated SVD
07 min
44.16 Revision Questions
30 min
INTERVIEW QUESTIONS ON RECOMMENDER SYSTEMS AND MATRIX FACTORIZATION. 0/1
30 min
CASE STUDY 7: AMAZON FASHION DISCOVERY ENGINE (CONTENT BASED RECOMMENDATION) 0/28
46.1 Problem Statement: Recommend similar apparel products in e-commerce using
product descriptions and Images
12 min
46.2 Plan of action
07 min
10 min
46.4 Data folders and paths
06 min
46.5 Overview of the data and Terminology
12 min
46.6 Data cleaning and understanding:Missing data in various features
22 min
46.7 Understand duplicate rows
09 min
46.8 Remove duplicates : Part 1
12 min
46.9 Remove duplicates: Part 2
15 min
46.10 Text Pre-Processing: Tokenization and Stop-word removal
10 min
46.11 Stemming
04 min
46.12 Text based product similarity :Converting text to an n-D vector: bag of words
14 min
46.13 Code for bag of words based product similarity
26 min
46.14 TF-IDF: featurizing text based on word-importance
17 min
46.15 Code for TF-IDF based product similarity
10 min
46.16 Code for IDF based product similarity
09 min
46.17 Text Semantics based product similarity: Word2Vec(featurizing text based on
semantic similarity)
19 min
46.18 Code for Average Word2Vec product similarity
15 min
46.19 TF-IDF weighted Word2Vec
09 min
46.20 Code for IDF weighted Word2Vec product similarity
06 min
46.21 Weighted similarity using brand and color
09 min
46.22 Code for weighted similarity
07 min
46.23 Building a real world solution
05 min
46.24 Deep learning based visual product similarity:ConvNets: How to featurize an
image: edges, shapes, parts
11 min
46.25 Using Keras + Tensorflow to extract features
08 min
46.26 Visual similarity based product similarity
06 min
46.27 Measuring goodness of our solution :A/B testing
07 min
46.28 Exercise :Build a weighted Nearest neighbor model using Visual, Text, Brand
and Color
09 min
CASE STUDY 8: NETFLIX MOVIE RECOMMENDATION SYSTEM (COLLABORATIVE BASED
RECOMMENDATION) 0/27
06 min
47.2 Objectives and constraints
07 min
47.3 Mapping to an ML problem:Data overview.
04 min
47.4 Mapping to an ML problem:ML problem formulation
05 min
47.5 Exploratory Data Analysis:Data preprocessing
07 min
47.6 Exploratory Data Analysis:Temporal Train-Test split.
06 min
47.7 Exploratory Data Analysis:Preliminary data analysis.
15 min
47.8 Exploratory Data Analysis:Sparse matrix representation
08 min
47.9 Exploratory Data Analysis:Average ratings for various slices
08 min
47.10 Exploratory Data Analysis:Cold start problem
05 min
47.11 Computing Similarity matrices:User-User similarity matrix
20 min
47.12 Computing Similarity matrices:Movie-Movie similarity
06 min
47.13 Computing Similarity matrices:Does movie-movie similarity work?
06 min
47.14 ML Models:Surprise library
06 min
47.15 Overview of the modelling strategy.
08 min
47.16 Data Sampling.
05 min
47.17 Google drive with intermediate files
02 min
47.18 Featurizations for regression.
11 min
47.19 Data transformation for Surprise.
02 min
47.20 Xgboost with 13 features
06 min
47.21 Surprise Baseline model.
09 min
47.22 Xgboost + 13 features +Surprise baseline model
04 min
47.23 Surprise KNN predictors
15 min
47.24 Matrix Factorization models using Surprise
05 min
47.25 SVD ++ with implicit feedback
11 min
47.26 Final models with all features and predictors.
04 min
47.27 Comparison between various models.
04 min
47.28 Assignments.
04 min
DEEP LEARNING: NEURAL NETWORKS 0/14
48.1 History of Neural networks and Deep Learning.
25 min
48.2 How Biological Neurons work?
10 min
48.3 Growth of biological neural networks
16 min
48.4 Diagrammatic representation: Logistic Regression and Perceptron
17 min
48.5 Multi-Layered Perceptron (MLP).
23 min
48.6 Notation
18 min
48.7 Training a single-neuron model.
28 min
48.8 Training an MLP: Chain Rule
40 min
48.9 Training an MLP:Memoization
14 min
48.10 Backpropagation.
26 min
48.11 Activation functions
17 min
23 min
10 min
48.14 Decision surfaces: Playground
15 min
DEEP LEARNING: DEEP MULTI-LAYER PERCEPTRONS 0/21
49.1 Deep Multi-layer perceptrons:1980s to 2010s
16 min
49.2 Dropout layers & Regularization.
21 min
49.3 Rectified Linear Units (ReLU).
28 min
49.4 Weight initialization.
24 min
49.5 Batch Normalization.
21 min
49.6 Optimizers:Hill-descent analogy in 2D
19 min
49.7 Optimizers:Hill descent in 3D and contours.
13 min
49.8 SGD Recap
18 min
49.9 Batch SGD with momentum.
25 min
08 min
15 min
10 min
11 min
49.14 Which algorithm to choose when?
05 min
10 min
49.16 Softmax and Cross-entropy for multi-class classification.
25 min
49.17 How to train a Deep MLP?
08 min
49.18 Auto Encoders.
27 min
49.19 Word2Vec :CBOW
19 min
49.20 Word2Vec: Skip-gram
14 min
49.21 Word2Vec :Algorithmic Optimizations.
12 min
DEEP LEARNING: TENSORFLOW AND KERAS. 0/14
50.1 Tensorflow and Keras overview
23 min
50.2 GPU vs CPU for Deep Learning.
23 min
05 min
50.4 Install TensorFlow
06 min
50.5 Online documentation and tutorials
06 min
50.6 Softmax Classifier on MNIST dataset.
32 min
50.7 MLP: Initialization
11 min
50.8 Model 1: Sigmoid activation
22 min
50.9 Model 2: ReLU activation.
06 min
50.10 Model 3: Batch Normalization.
08 min
50.11 Model 4 : Dropout.
05 min
50.12 MNIST classification in Keras.
18 min
50.13 Hyperparameter tuning in Keras.
11 min
50.14 Exercise: Try different MLP architectures on MNIST dataset.
05 min
DEEP LEARNING: CONVOLUTIONAL NEURAL NETS. 0/19
51.1 Biological inspiration: Visual Cortex
17 min
51.2 Convolution:Edge Detection on images.
28 min
19 min
51.4 Convolution over RGB images.
11 min
51.5 Convolutional layer.
23 min
51.6 Max-pooling.
12 min
51.7 CNN Training: Optimization
09 min
51.8 Example CNN: LeNet [1998]
11 min
51.9 ImageNet dataset.
06 min
51.10 Data Augmentation.
07 min
51.11 Convolution Layers in Keras
17 min
51.12 AlexNet
13 min
51.13 VGGNet
11 min
51.14 Residual Network.
22 min
51.15 Inception Network.
19 min
51.16 What is Transfer learning.
23 min
51.17 Code example: Cats vs Dogs.
15 min
51.18 Code Example: MNIST dataset.
06 min
51.19 Assignment: Try various CNN networks on MNIST dataset.
04 min
DEEP LEARNING: LONG SHORT-TERM MEMORY (LSTMS) 0/11
52.1 Why RNNs?
23 min
52.2 Recurrent Neural Network.
29 min
52.3 Training RNNs: Backprop.
16 min
52.4 Types of RNNs.
14 min
52.5 Need for LSTM/GRU.
10 min
52.6 LSTM.
34 min
52.7 GRUs.
07 min
52.8 Deep RNN.
07 min
52.9 Bidirectional RNN.
12 min
52.10 Code example : IMDB Sentiment classification
33 min
52.11 Exercise: Amazon Fine Food reviews LSTM model.
04 min
INTERVIEW QUESTIONS ON DEEP LEARNING 0/1
30 min
CASE STUDY 9: HUMAN ACTIVITY RECOGNITION 0/8
54.1 Human Activity Recognition Problem definition
09 min
54.2 Dataset understanding
22 min
54.3 Data cleaning & preprocessing
04 min
54.4 EDA:Univariate analysis.
05 min
54.5 EDA:Data visualization using t-SNE
05 min
54.6 Classical ML models.
13 min
54.7 Deep-learning Model.
15 min
54.8 Exercise: Build deeper LSTM models and hyper-param tune them
03 min
CASE STUDY 10: SELF DRIVING CAR 0/13
55.1 Self Driving Car :Problem definition.
14 min
55.2 Datasets.
09 min
55.3 Data understanding & Analysis :Files and folders.
04 min
55.4 Dash-cam images and steering angles.
05 min
55.5 Split the dataset: Train vs Test
03 min
55.6 EDA: Steering angles
06 min
55.7 Mean Baseline model: simple
05 min
55.8 Deep-learning model:Deep Learning for regression: CNN, CNN+RNN
10 min
06 min
55.10 NVIDIA's end to end CNN model.
18 min
55.11 Train the model.
13 min
55.12 Test and visualize the output.
11 min
55.13 Extensions.
05 min
55.14 Assignment.
03 min
CASE STUDY 11: MUSIC GENERATION USING DEEP-LEARNING 0/11
56.1 Real-world problem
15 min
56.2 Music representation
17 min
56.3 Char-RNN with abc-notation :Char-RNN model
23 min
56.4 Char-RNN with abc-notation :Data preparation.
40 min
56.5 Char-RNN with abc-notation:Many to Many RNN ,TimeDistributed-Dense layer
18 min
56.6 Char-RNN with abc-notation: Stateful RNN
11 min
56.7 Char-RNN with abc-notation :Model architecture,Model training.
13 min
56.8 Char-RNN with abc-notation :Music generation.
11 min
56.9 Char-RNN with abc-notation :Generate tabla music
03 min
56.10 MIDI music generation.
04 min
56.11 Survey blog:
05 min
SQL 0/27
57.1 Introduction to Databases
21 min
57.2 Why SQL?
30 min
57.3 Execution of an SQL statement.
07 min
57.4 IMDB dataset
12 min
57.5 Installing MySQL
11 min
04 min
57.7 USE, DESCRIBE, SHOW TABLES
15 min
57.8 SELECT
20 min
57.9 LIMIT, OFFSET
10 min
57.10 ORDER BY
06 min
57.11 DISTINCT
10 min
57.12 WHERE, Comparison operators, NULL
13 min
57.13 Logical Operators
27 min
57.14 Aggregate Functions: COUNT, MIN, MAX, AVG, SUM
08 min
57.15 GROUP BY
13 min
57.16 HAVING
12 min
57.17 Order of keywords.
04 min
57.18 Join and Natural Join
12 min
57.19 Inner, Left, Right and Outer joins.
23 min
57.20 Sub Queries/Nested Queries/Inner Queries
24 min
57.21 DML:INSERT
07 min
57.22 DML:UPDATE , DELETE
06 min
57.23 DDL:CREATE TABLE
12 min
04 min
57.25 DDL:DROP TABLE, TRUNCATE, DELETE
03 min
57.26 Data Control Language: GRANT, REVOKE
10 min
57.27 Learning resources
03 min
INTERVIEW QUESTIONS 0/3
58.1 Revision Questions
30 min
58.2 Questions
30 min
58.3 External resources for Interview Questions
30 min
LIVE SESSION VIDEOS 0/2
59.1 Q&A: Probability and Statistics
02 hour 14 min
59.2 How to build a chatbot?