
Machine Learning - Coursera

About this course: Machine learning is the science of getting computers to act without being
explicitly programmed. In the past decade, machine learning has given us self-driving cars,
practical speech recognition, effective web search, and a vastly improved understanding of the
human genome. Machine learning is so pervasive today that you probably use it dozens of times
a day without knowing it. Many researchers also think it is the best way to make progress
towards human-level AI. In this class, you will learn about the most effective machine learning
techniques, and gain practice implementing them and getting them to work for yourself. More
importantly, you'll learn about not only the theoretical underpinnings of learning, but also gain
the practical know-how needed to quickly and powerfully apply these techniques to new
problems. Finally, you'll learn about some of Silicon Valley's best practices in innovation as it
pertains to machine learning and AI. This course provides a broad introduction to machine
learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning
(parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii)
Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep
learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in
machine learning and AI). The course will also draw from numerous case studies and
applications, so that you'll also learn how to apply learning algorithms to building smart robots
(perception, control), text understanding (web search, anti-spam), computer vision, medical
informatics, audio, database mining, and other areas.

Created by: Stanford University


Welcome to Machine Learning! In this module, we introduce the core idea of teaching a
computer to learn concepts using data without being explicitly programmed. The Course Wiki
is under construction. Please visit the resources tab for the most complete and up-...
4 videos, 10 readings
Graded: Introduction
Linear Regression with One Variable
Linear regression predicts a real-valued output based on an input value. We discuss the
application of linear regression to housing price prediction, present the notion of a cost function,
and introduce the gradient descent method for learning.
7 videos, 8 readings
Graded: Linear Regression with One Variable
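
As a rough illustration of the cost function and gradient descent method named in this module, the sketch below fits a one-variable linear model in plain Python. The function names, learning rate, iteration count, and toy data are assumptions chosen for illustration; they are not taken from the course materials.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum((h(x) - y)^2) for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Both parameters are updated from gradients computed at the same point.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1, cost(theta0, theta1, xs, ys))
```

On this noise-free data the fitted parameters should approach the intercept 1 and slope 2 used to generate it, and the cost should approach zero.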
Linear Algebra Review
This optional module provides a refresher on linear algebra concepts. Basic understanding of
linear algebra is necessary for the rest of the course, especially as we begin to cover models with
multiple variables.
6 videos, 1 reading
Linear Regression with Multiple Variables
What if your input has more than one value? In this module, we show how linear regression can
be extended to accommodate multiple input features. We also discuss best practices for
implementing linear regression.
8 videos, 16 readings
Graded: Linear Regression with Multiple Variables
Octave/Matlab Tutorial
This course includes programming assignments designed to help you understand how to
implement the learning algorithms in practice. To complete the programming assignments, you
will need to use Octave or MATLAB. This module introduces Octave/Matlab and shows yo...
6 videos, 1 reading
Graded: Octave/Matlab Tutorial
Logistic Regression
Logistic regression is a method for classifying data into discrete outcomes. For example, we
might use logistic regression to classify an email as spam or not spam. In this module, we
introduce the notion of classification, the cost function for logistic regr...
7 videos, 8 readings
Graded: Logistic Regression
Regularization
Machine learning models need to generalize well to new examples that the model has not seen in
practice. In this module, we introduce regularization, which helps prevent models from
overfitting the training data.
4 videos, 5 readings
Graded: Regularization
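
One common way to express regularization is to add a penalty on parameter size to the cost function. The sketch below shows an L2-penalized squared-error cost for a linear model; the function name, the choice to leave the intercept unpenalized, and the toy data are assumptions for illustration rather than the course's own code.

```python
def regularized_cost(theta, xs, ys, lam):
    """J = 1/(2m) * [sum((h(x) - y)^2) + lam * sum(theta_j^2 for j >= 1)].

    Each row of xs starts with a constant 1.0 so that theta[0] acts as the intercept.
    """
    m = len(xs)
    predictions = [sum(t * f for t, f in zip(theta, row)) for row in xs]
    squared_error = sum((p - y) ** 2 for p, y in zip(predictions, ys))
    # Assumption: the intercept theta[0] is not penalized.
    penalty = lam * sum(t ** 2 for t in theta[1:])
    return (squared_error + penalty) / (2 * m)


# Hypothetical data that the model y = x fits exactly.
xs = [[1.0, 1.0], [1.0, 2.0]]
ys = [1.0, 2.0]
theta = [0.0, 1.0]
print(regularized_cost(theta, xs, ys, lam=0.0))
print(regularized_cost(theta, xs, ys, lam=2.0))
```

With a larger `lam`, the same fit is charged a higher cost, which pushes an optimizer toward smaller parameter values.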
Neural Networks: Representation
Neural networks are models inspired by how the brain works. They are widely used today in many
applications: when your phone interprets and understands your voice commands, it is likely that a
neural network is helping to understand your speech; when you cash a ch...
7 videos, 6 readings
Graded: Neural Networks: Representation
Neural Networks: Learning
In this module, we introduce the backpropagation algorithm that is used to help learn parameters
for a neural network. At the end of this module, you will be implementing your own neural
network for digit recognition.
8 videos, 8 readings
Graded: Neural Networks: Learning
Advice for Applying Machine Learning
Applying machine learning in practice is not always straightforward. In this module, we share
best practices for applying machine learning in practice, and discuss the best ways to evaluate
performance of the learned models.
7 videos, 2 readings
Graded: Advice for Applying Machine Learning
Machine Learning System Design
To optimize a machine learning algorithm, you'll need to first understand where the biggest
improvements can be made. In this module, we discuss how to understand the performance of a
machine learning system with multiple parts, and also how to deal with skewe...
5 videos, 1 reading
Graded: Machine Learning System Design
Support Vector Machines
Support vector machines, or SVMs, are a machine learning algorithm for classification. We
introduce the idea and intuitions behind SVMs and discuss how to use them in practice.
6 videos, 1 reading
Graded: Support Vector Machines
Unsupervised Learning
We use unsupervised learning to build models that help us understand our data better. We discuss
the k-Means algorithm for clustering that enables us to learn groupings of unlabeled data points.
5 videos, 1 reading
Graded: Unsupervised Learning
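
To make the k-Means idea concrete, here is a minimal sketch of the usual assign-then-update loop on 2-D points. The function name, the fixed starting centroids, and the toy points are assumptions for illustration; a practical implementation would also handle random initialization and a convergence check.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate assignment and update steps on 2-D points, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append((sum(p[0] for p in cluster) / len(cluster),
                                      sum(p[1] for p in cluster) / len(cluster)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its old centroid
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```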
Dimensionality Reduction
In this module, we introduce Principal Components Analysis, and show how it can be used for
data compression to speed up learning algorithms as well as for visualizations of complex
7 videos, 1 reading
Graded: Principal Component Analysis
Anomaly Detection
Given a large number of data points, we may sometimes want to figure out which ones vary
significantly from the average. For example, in manufacturing, we may want to detect defects or
anomalies. We show how a dataset can be modeled using a Gaussian distributi...
8 videos, 1 reading
Graded: Anomaly Detection
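
A minimal sketch of the Gaussian approach described above: estimate a mean and variance from the data, then flag a point as anomalous when its density falls below a threshold. The readings, the threshold `epsilon`, and the function names are hypothetical values chosen for illustration.

```python
import math


def fit_gaussian(values):
    """Estimate the mean and (maximum likelihood) variance of one feature."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def density(x, mean, variance):
    """Gaussian probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


# Hypothetical measurements from a manufacturing process.
readings = [9.8, 10.1, 10.0, 9.9, 10.2]
mean, variance = fit_gaussian(readings)

epsilon = 0.01  # hypothetical threshold; in practice it would be tuned on labeled examples
is_anomaly = density(14.0, mean, variance) < epsilon
print(mean, variance, is_anomaly)
```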
Recommender Systems
When you buy a product online, most websites automatically recommend other products that you
may like. Recommender systems look at patterns of activities between different users and
different products to produce these recommendations. In this module, we introd...
6 videos, 1 reading
Graded: Recommender Systems
Large Scale Machine Learning
Machine learning works best when there is an abundance of data to leverage for training. In this
module, we discuss how to apply the machine learning algorithms with large datasets.
6 videos, 1 reading
Graded: Large Scale Machine Learning
Application Example: Photo OCR
Identifying and recognizing objects, words, and digits in an image is a challenging task. We
discuss how a pipeline can be built to tackle this problem and how to analyze and improve the
performance of such a system.
5 videos, 1 reading
Graded: Application: Photo OCR

Intro to Machine Learning (Udacity)

Lesson 1
Welcome to Machine Learning

Learn what Machine Learning is and meet Sebastian Thrun!

Find out where Machine Learning is applied in Technology and Science.

Lesson 2
Naive Bayes

Use Naive Bayes with scikit-learn in Python.

Split data between training sets and testing sets with scikit-learn.

Calculate the posterior probability and the prior probability of simple
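
The relationship between a prior and a posterior can be sketched with Bayes' rule for a two-outcome hypothesis. The function name and the probabilities below are hypothetical and chosen only to illustrate the calculation.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' rule for a binary hypothesis H given evidence E.

    prior: P(H)
    likelihood: P(E | H)
    likelihood_given_not: P(E | not H)
    """
    joint_h = prior * likelihood
    joint_not_h = (1.0 - prior) * likelihood_given_not
    # Normalize by the total probability of the evidence.
    return joint_h / (joint_h + joint_not_h)


# Hypothetical numbers: a condition with a 1% prior and a test that is
# positive 90% of the time when the condition is present and 10% otherwise.
p = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.1)
print(p)
```

Even with a fairly accurate test, the posterior stays well below one half here because the prior is so small.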


Lesson 3
Support Vector Machines

Learn the simple intuition behind Support Vector Machines.

Implement an SVM classifier in SKLearn/scikit-learn.

Identify how to choose the right kernel for your SVM and learn about RBF and
Linear Kernels.

Lesson 4
Decision Trees

Code your own decision tree in Python.

Learn the formulas for entropy and information gain and how to calculate

Implement a mini project where you identify the authors in a body of emails
using a decision tree in Python.
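
The entropy and information gain formulas mentioned in this lesson can be sketched as follows. The helper names and the toy labels are assumptions for illustration; entropy is measured in bits.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)) over the class proportions."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child splits."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels: a split that separates the classes perfectly,
# and one that leaves each child as mixed as the parent.
parent = ["a", "a", "b", "b"]
gain_perfect = information_gain(parent, [["a", "a"], ["b", "b"]])
gain_useless = information_gain(parent, [["a", "b"], ["a", "b"]])
print(gain_perfect, gain_useless)
```

A decision tree learner would typically choose, at each node, the split with the highest information gain.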

Lesson 5
Choose your own Algorithm

Decide how to pick the right Machine Learning Algorithm among K-Means,
AdaBoost, and Decision Trees.

Lesson 6
Datasets and Questions

Apply your Machine Learning knowledge by looking for patterns in the Enron
Email Dataset.

You'll be investigating one of the biggest frauds in American history!

Lesson 7

Understand how continuous supervised learning is different from discrete


Code a Linear Regression in Python with scikit-learn.

Understand different error metrics such as SSE, and R Squared in the context
of Linear Regressions.
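
The two error metrics named above can be computed directly from true and predicted values. This is a minimal sketch with hypothetical data; the function names are illustrative.

```python
def sse(y_true, y_pred):
    """Sum of squared errors between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """R squared = 1 - SSE / SST, where SST is the squared deviation from the mean."""
    mean = sum(y_true) / len(y_true)
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse(y_true, y_pred) / sst


# Hypothetical true values and regression predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(sse(y_true, y_pred), r_squared(y_true, y_pred))
```

SSE grows with the number of data points, while R squared is normalized by the total variation, which makes it easier to compare across datasets of different sizes.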

Lesson 8

Remove outliers to improve the quality of your linear regression predictions.

Apply your learning in a mini project where you remove the residuals on a
real dataset and reimplement your regressor.

Apply your same understanding of outliers and residuals on the Enron Email

Lesson 9

Identify the difference between Unsupervised Learning and Supervised


Implement K-Means in Python with scikit-learn to find the centers of clusters.

Apply your knowledge on the Enron Finance Data to find clusters in a real

Lesson 10
Feature Scaling

Understand how to preprocess data with feature scaling to improve your


Use a min-max scaler in sklearn.
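
The rescaling that a min-max scaler performs can be written by hand as x' = (x - min) / (max - min), which maps a feature onto the range [0, 1]. The sketch below uses plain Python rather than sklearn; the function name, the handling of a constant feature, and the sample values are assumptions for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no spread, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature values.
weights = [115.0, 140.0, 175.0]
scaled = min_max_scale(weights)
print(scaled)
```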

Introduction to Machine Learning & Face Detection in Python (Udemy)


Basic python


This course is about the fundamental concepts of machine learning, focusing on neural networks,
SVM and decision trees. These topics are in high demand because these learning
algorithms can be used in fields ranging from software engineering to investment banking.
Learning algorithms can recognize patterns that help detect cancer, for example, or we may
construct algorithms that make good guesses about stock price movements in the market.

In each section we will talk about the theoretical background for these algorithms, then we
are going to implement them together.

The first chapter is about regression: a very easy yet very powerful and widely used machine
learning technique. We will talk about Naive Bayes classification and tree based algorithms such
as decision trees and random forests. These are more sophisticated algorithms that sometimes work
and sometimes do not. The last chapters will be about SVM and Neural Networks: the most important
approaches in machine learning.

Who is the target audience?

This course is meant for newbies who are not familiar with machine learning or students
looking for a quick refresher

Curriculum for This Course

1. Introduction
a. Introduction
b. Introduction to machine learning
2. Regression
a. Linear regression introduction
b. Linear regression example
c. Logistic regression introduction
d. Cross validation
e. Logistic regression example I - sigmoid function
f. Logistic regression example II
g. Logistic regression example III - credit scoring

3. K-Nearest Neighbor Classifier

a. K-nearest neighbor introduction
b. K-nearest neighbor introduction - normalize data
c. K-nearest neighbor example I
d. K-nearest neighbor example II

4. Naive Bayes Classifier

a. Naive Bayes introduction
b. Naive Bayes example I
c. Naive Bayes example II - text clustering

5. Support Vector Machine (SVM)

a. Support vector machine introduction
b. Support vector machine example I
c. Support vector machine example II - character recognition

6. Tree Based Algorithms

a. Decision trees introduction
b. Decision trees example I
c. Decision trees example II - iris data
d. Pruning and bagging
e. Random forests introduction
f. Boosting
g. Random forests example I
h. Random forests example II - enhance decision trees

7. Clustering
a. Principal component analysis introduction
b. Principal component analysis example
c. K-means clustering introduction
d. K-means clustering example
e. DBSCAN introduction
f. Hierarchical clustering introduction
g. Hierarchical clustering example

8. Neural Networks
a. Neural network introduction
b. Feedforward neural networks
c. Training a neural network
d. Error calculation
e. Gradients calculation
f. Backpropagation
g. Applications of neural networks
h. Deep learning
i. Neural network example I - XOR problem
j. Neural network example II - face recognition

9. Face Detection
a. Face detection introduction
b. Installing OpenCV
c. CascadeClassifier
d. CascadeClassifier parameters
e. Tuning the parameters

10. Outro
a. Final words

11. Source Code & Data

a. Source code & CSV files
b. Data
c. Slides
d. Coupon codes - get any of my courses for a discounted price

CS 403/725: Foundations of Machine Learning (IIT-B)

Course Description
CS 403/725 provides a broad introduction to machine learning and various fields of application.
The course is designed to build up from the fundamentals.
Topics include:

Supervised Classification (perceptron, support vector machine, loss functions,
kernels, neural networks and deep learning)

Supervised Regression (least squares regression, Bayesian linear regression)

Unsupervised classification (clustering, expectation maximization)

Introduction to learning theory (bias/variance tradeoffs).

The course will discuss the application of machine learning to Devanagari script
recognition, which is a developing field in the machine learning community.


CS 725: Foundations of machine learning


Remedial co-requisite: Mathematical foundations (Separately proposed by Prof. Saketh Nath)

Recommended parallel courses: CS709 (Convex optimization)

Course Content :

Supervised learning: decision trees, nearest neighbor classifiers, generative classifiers like naive
Bayes, linear discriminant analysis, loss regularization framework for classification, support
vector machines

Regression methods: least-square regression, kernel regression, regression trees

Unsupervised learning: k-means, hierarchical, EM, non-negative matrix factorization, rate
distortion theory.


1. Hastie, Tibshirani, Friedman. The Elements of Statistical Learning. Springer Verlag.

2. Christopher Bishop. Pattern Recognition and Machine Learning.

3. Selected papers.




Introduction to Machine Learning

About The Course

This course provides a concise introduction to the fundamental concepts in machine learning and
popular machine learning algorithms. We will cover the standard and most popular supervised
learning algorithms including linear regression, logistic regression, decision trees, k-nearest
neighbour, an introduction to Bayesian learning and the naive Bayes algorithm, support vector
machines and kernels and neural networks with an introduction to Deep Learning. We will also
cover the basic clustering algorithms. Feature reduction methods will also be discussed. We will
introduce the basics of computational learning theory. In the course we will discuss various
issues related to the application of machine learning algorithms. We will discuss hypothesis
space, overfitting, bias and variance, tradeoffs between representational power and learnability,
evaluation strategies and cross-validation. The course will be accompanied by hands-on problem
solving with programming in Python and some tutorial sessions.

Intended Audience

Elective course
UG or PG


Basic programming skills (in Python), algorithm design, basics of probability & statistics

Industry Support - List of Companies/Industry that will Recognize/value this online


Data science companies and many other industries value machine learning skills.

Course Instructor

Sudeshna Sarkar is a Professor and currently the Head in the Department of Computer Science
and Engineering at IIT Kharagpur. She completed her B.Tech. in 1989 from IIT Kharagpur, MS
from University of California, Berkeley, and PhD from IIT Kharagpur in 1995. She served
briefly in the faculty of IIT Guwahati and at IIT Kanpur before joining IIT Kharagpur in 1998.
Her research interests are in Machine Learning, Natural Language Processing, Data and Text

The Teaching Assistants of this course are Anirban Santara and Ayan Das, both of whom are
PhD students in the Computer Science & Engineering Department, IIT Kharagpur. They will take
an active part in the course, especially in running demonstration and programming classes as well as
tutorial classes.

Course layout
Week 1:
Introduction: Basic definitions, types of learning, hypothesis space and inductive
bias, evaluation, cross-validation
Week 2:
Linear regression, Decision trees, overfitting
Week 3:
Instance based learning, Feature reduction, Collaborative filtering based
Week 4:
Probability and Bayes learning
Week 5:
Logistic Regression, Support Vector Machine, Kernel function and Kernel SVM
Week 6:
Neural network: Perceptron, multilayer network, backpropagation, introduction to
deep neural network
Week 7:
Computational learning theory, PAC learning model, Sample complexity, VC
Dimension, Ensemble learning
Week 8:
Clustering: k-means, adaptive hierarchical clustering, Gaussian mixture model

Suggested reading

1. Machine Learning. Tom Mitchell. First Edition, McGraw-Hill, 1997.

2. Introduction to Machine Learning Edition 2, by Ethem Alpaydin

More details about the course

Course url:
Course duration : 08 weeks
Start date and end date of course: 18 July 2016 - 09 September 2016
Dates of exams : 18 September 2016 & 25 September 2016
Time of exam : 2pm - 5pm
Final List of exam cities will be available in exam registration form.
Exam registration url - Will be announced shortly
Exam Fee: The online registration form has to be filled and the certification exam
fee of approximately Rs 1000 (non-Programming) / 1250 (Programming) needs to
be paid.

E-Certificate will be given to those who register and write the exam. Certificate will have your
name, photograph and the score in the final exam. It will have the logos of NPTEL and IIT
It will be e-verifiable at

Introduction to Machine Learning


With the increased availability of data from varied sources there has been increasing
attention paid to the various data driven disciplines such as analytics and machine
learning. In this course we intend to introduce some of the basic concepts of machine
learning from a mathematically well motivated perspective. We will cover the different
learning paradigms and some of the more popular algorithms and architectures used in
each of these paradigms.

This is an elective course. Intended for senior UG/PG students. BE/ME/MS/PhD

We will assume that the students know programming for some of the assignments. If the
students have done introductory courses on probability theory and linear algebra it
would be helpful. We will review some of the basic topics in the first two weeks as well.


Any company in the data analytics/data science/big data domain would value this


Prof. Ravindran is currently an associate professor in Computer Science at IIT Madras. He
has nearly two decades of research experience in machine learning and specifically
reinforcement learning. Currently his research interests are centered on learning from
and through interactions and span the areas of data mining, social network analysis,
and reinforcement learning.

Week 1: Introductory Topics

Week 2: Linear Regression and Feature Selection
Week 3: Linear Classification
Week 4: Support Vector Machines and Artificial Neural Networks
Week 5: Bayesian Learning and Decision Trees
Week 6: Evaluation Measures
Week 7: Hypothesis Testing
Week 8: Ensemble Methods
Week 9: Clustering
Week 10: Graphical Models
Week 11: Learning Theory and Expectation Maximization
Week 12: Introduction to Reinforcement Learning

Certification Exam
The exam is optional. Exams will be on 24 April 2016 and 30 April 2016.
Time: 2pm-5pm

Tentative list of exam cities:

Registration url: Announcements will be made when the registration form is open for

The online registration form has to be filled and the certification exam fee of approximately Rs
1000 needs to be paid.

Certificate will be given to those who register and write the exam. Certificate will have your
name, photograph and the score in the final exam.

It will have the logos of NPTEL and IIT Madras. It will be e-verifiable at


1. T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning, 2e, 2008.

2. Christopher Bishop. Pattern Recognition and Machine Learning. 2e.

IIT Madras
CS5011: Introduction to Machine Learning

# | Date | Lecture Contents | Reference
1 | Aug 1, 2011 | Introduction to machine learning | Chapter 1 from Machine Learning by Tom Mitchell
2 | Aug 2, 2011 | Introduction to machine learning | Chapter 1 from Machine Learning by Tom Mitchell
3 | Aug 4, 2011 | Overview of target function representations | Chapter 1 from Machine Learning by Tom Mitchell
4 | Aug 5, 2011 | Hypothesis class, version space | Chapter 1 and 2 from Machine Learning by Tom Mitchell
5 | Aug 8, 2011 | Types of ML techniques, hypothesis selection through cross validation | Chapter 2 from Introduction to Machine Learning by Ethem Alpaydin
6 | Aug 9, 2011 | Noise, bias-variance trade-off, under-fitting and over-fitting concepts | Chapter 2 from Introduction to Machine Learning by Ethem Alpaydin
7 | Aug 11, 2011 | Q&A on over and under-fitting, bias-variance, Data: types of features, data normalization | Chapter 2 from Principles of Data Mining by David Hand et al.
8 | Aug 12, 2011 | Bias variance trade-off using regression example |
9 | Aug 16, 2011 | Correlation, covariance, Mahalanobis distance | Chapter 2 from Principles of Data Mining by David Hand et al.
10 | Aug 18, 2011 | Mahalanobis distance, Minkowski distance, distance metric, Jaccard coefficient, missing values, feature | Chapter 2, 3 from Principles of Data Mining by David Hand et al.
11 | Aug 19, 2011 | Geometrical interpretation of Euclidean, Mahalanobis distance, dealing with uncertainty | Chapter 4 from Principles of Data Mining by David Hand et al.
12 | Aug 22, 2011 | Maximum likelihood estimation (MLE) theory and example using binomial distribution | Chapter 4 from Principles of Data Mining by David Hand et al.
13 | Aug 23, 2011 | Maximum likelihood estimation (MLE) of univariate Gaussian, generative vs discriminative models | Chapter 4 from Principles of Data Mining by David Hand et al.
14 | Aug 25, 2011 | Maximum likelihood estimation of bivariate Gaussian distribution, sufficient statistics | Chapter 4 from Principles of Data Mining by David Hand et al.
15 | Aug 26, 2011 | Bayesian Learning | Chapter 2 from Pattern Recognition and Machine Learning by Christopher M. Bishop
CS5011 - Machine Learning

Course Data :
Basic Maths : Probability, Linear Algebra, Convex Optimization
Background: Statistical Decision Theory, Bayesian Learning (ML, MAP, Bayes
estimates, Conjugate priors)
Regression : Linear Regression, Ridge Regression, Lasso
Dimensionality Reduction : Principal Component Analysis, Partial Least Squares
Classification : Linear Classification, Logistic Regression, Linear Discriminant
Analysis, Quadratic Discriminant Analysis, Perceptron, Support Vector Machines +
Kernels, Artificial Neural Networks + BackPropagation, Decision Trees, Bayes
Optimal Classifier, Naive Bayes.
Evaluation measures : Hypothesis testing, Ensemble Methods, Bagging, Adaboost,
Gradient Boosting, Clustering, K-means, K-medoids, Density-based, Hierarchical
Miscellaneous topics: Expectation Maximization, GMMs, Learning theory, Intro to
Reinforcement Learning
Graphical Models: Bayesian Networks.

Machine Learning
Autumn 2016

Instructor: Piyush Rai: (office: KD-319, email: piyush AT cse DOT iitk DOT ac DOT in)
Office Hours: Tuesday 12-1pm (or by appointment)
Q/A Forum: Piazza (please register)
Class Location: L-16 (lecture hall complex)
Timings: WF 6:00-7:30pm
Background and Course Description
Machine Learning is the discipline of designing algorithms that allow machines (e.g.,
a computer) to learn patterns and concepts from data without being explicitly
programmed. This course will be an introduction to the design (and some analysis)
of Machine Learning algorithms, with a modern outlook, focusing on the recent
advances, and examples of real-world applications of Machine Learning algorithms.
This is supposed to be the first ("intro") course in Machine Learning. No prior
exposure to Machine Learning will be assumed. At the same time, please be aware
that this is NOT a course about toolkits/software/APIs used in applications of
Machine Learning, but rather on the principles and foundations of Machine Learning
algorithms, delving deeper to understand what goes on "under the hood", and how
Machine Learning problems are formulated and solved.

MSO201A/equivalent, CS210/ESO211/ESO207A; Ability to program in
MATLAB/Octave. In some cases, pre-requisites may be waived (will need instructor's

There will be 4 homework assignments (total 40%) which may include a
programming component, a mid-term (20%), a final-exam (20%), and a course
project (20%)

Reference materials
There will not be any dedicated textbook for this course. In lieu of that, we will have
lecture slides/notes, monographs, tutorials, and papers for the topics that will be
covered in this course. Some recommended, although not required, reference books
are listed below (in no particular order):

Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical
Learning, Springer, 2009 (freely available online)

Hal Daumé III, A Course in Machine Learning, 2015 (in preparation; most
chapters freely available online)

Kevin Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012

Christopher Bishop, Pattern Recognition and Machine Learning, Springer,

Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning:
From Theory to Algorithms, Cambridge University Press, 2014

Schedule (Tentative)
Date | Topics | Readings/References | Deadlines | Slides/Notes
July 28 | Course Logistics and Introduction to Machine Learning | Linear Algebra review, Probability review, Matrix Cookbook, MATLAB review, [JM15], [LBH15] | | slides

Supervised Learning
Aug 3 | Learning by Computing Distances: Distance from Means and Nearest | Distance from Means, CIML Chapter 2 | |
Aug 5 | Learning by Asking Questions: Decision Tree based Classification and | Book Chapter, Info Theory notes, DT - visual illustration | |
Aug 10 | Learning as Optimization, Linear Regression | Optional: Some notes, Some useful resources on optimization for ML | | slides
Aug 12 | Learning via Probabilistic Modeling: Probabilistic Linear Regression | Murphy (MLAPP): Chapter 7 (sections 7.1-7.5) | | slides
Aug 17 | Learning via Probabilistic Modeling: Logistic and Softmax Regression | Murphy (MLAPP): Chapter 8 (sections 8.1-8.3) | | slides
Aug 19 | Online Learning via Stochastic Optimization, | Murphy (MLAPP): Chapter 8 (section 8.5) | | slides
 | Learning Maximum-Margin Hyperplanes: Support Vector Machines | Intro to SVM, Wikipedia Intro to SVM, Optional: Advanced Intro to SVM, SVM Solvers | | slides
Aug 26 | Nonlinear Learning with Kernels | CIML Chapter 9 (section 9.1 and 9.4), Murphy (MLAPP): Chapter 14 (up to section 14.4.3) | | slides

Unsupervised Learning
Aug 31 | Data Clustering, K-means and Kernel K-means | Bishop (PRML): Section 9.1. Optional reading: Data clustering: 50 years beyond k- | HW 1 Due |
 | Linear Dimensionality Reduction: Principal Component Analysis | Bishop (PRML): Section 12.1. Optional reading: PCA tutorial paper | | slides
 | PCA (Wrap-up) and Nonlinear Dimensionality Reduction via Kernel PCA | Optional reading: Kernel PCA | | slides
Sept 21 | Matrix Factorization and Matrix Completion | Optional Reading: Matrix Factorization for Recommender Systems, Scalable MF | | slides
Sept 23 | Introduction to Generative Models | | |
Sept 26 | Generative Models for Clustering: GMM and Intro to EM | Bishop (PRML): Section 9.2 and 9.3 (up to 9.3.2) | | slides (notes)
Sept 28 | Maximization and Generative Models for Dim. Reduction | Bishop (PRML): Section 9.3 (up to 9.3.2) and 9.4 | |
Oct 5 | Generative Models for Dim. Reduction: Probabilistic PCA and Factor Analysis | Bishop (PRML): Section 12.2 (up to 12.2.2). Optional reading: Mixtures of PPCA | HW 2 Due | slides

Assorted Topics
Oct 19 | Practical Issues: Model/Feature Selection, Evaluating and Debugging ML | On Evaluation and Model Selection | | slides
Oct 24 | Introduction to Learning Theory | Optional (but recommended) Mitchell ML Chapter 7 (sections 7.1-7.3.1, section 7.4 (up to | |
Oct 26 | Ensemble Methods: Bagging and Boosting | CIML Chapter 11, Optional: Brief Intro to Boosting, Explaining AdaBoost | | slides
Oct 28 | Semi-supervised Learning | Reading: Brief SSL Intro, Optional: A (somewhat old but recommended) survey on SSL | | slides
Nov 2 | Deep Learning (1): Feedforward Neural Nets and CNN | Optional Readings: Feedforward Neural Networks, Convolutional Neural Nets | HW 3 Due | slides
Nov 4 | Deep Learning (2): Models for Sequence Data (RNN and LSTM) and Autoencoders | Optional Readings: RNN and LSTM, Understanding LSTMs, RNN and LSTM Review | | slides
Nov 5 | Learning from Imbalanced Data | | |
Nov 9 | Online Learning (Adversarial Model and | Optional Reading: Foundations of ML (Chapter 7) | | slides
Nov 11 | Survey of Other Topics and Conclusions | | |

Useful Links
- Machine Learning Summer Schools
- Scikit-Learn: Machine Learning in Python
- Awesome Machine Learning (a comprehensive list of various Machine Learning
libraries and software)

IISc Bangalore
E0 270 (3:1) Machine Learning
Shivani / Chiranjib Bhattacharyya / Indrajit

Introduction to machine learning. Classification: nearest neighbour, decision trees, perceptron,
support vector machines, VC-dimension. Regression: linear least squares regression, support
vector regression. Additional learning problems: multiclass classification, ordinal regression,
ranking. Ensemble methods: boosting. Probabilistic models: classification, regression, mixture
models (unconditional and conditional), parameter estimation, EM algorithm. Beyond IID,
directed graphical models: hidden Markov models, Bayesian networks. Beyond IID, undirected
graphical models: Markov random fields, conditional random fields. Learning and inference in
Bayesian networks and MRFs: parameter estimation, exact inference (variable elimination, belief
propagation), approximate inference (loopy belief propagation, sampling). Additional topics:
semi-supervised learning, active learning, structured prediction.

Bishop. C M, Pattern Recognition and Machine Learning. Springer, 2006.

Duda, R O, Hart P E and Stork D G. Pattern Classification. Wiley-Interscience, 2nd
Edition, 2000.

Hastie T, Tibshirani R and Friedman J, The Elements of Statistical Learning: Data
Mining, Inference and Prediction. Springer, 2nd Edition, 2009.

Mitchell T, Machine Learning. McGraw Hill, 1997.

Current literature.


Probability and Statistics (or equivalent course elsewhere). Some background in linear
algebra and optimization will be helpful.

IIT Delhi
CSL341: Fundamentals of Machine Learning

General Information
Instructor: Parag Singla (email: parags AT

Class Timings (Slot B):

Monday, 9:30am - 10:55am

Thursday, 9:30am - 10:55am

Venue: WS 101 (Workshop Room 101) Bharti 101

Teaching Assistants
Name Email

Abhinav Kumar cs5090231 AT

Anuj Gupta agupta AT

Arpit Jain cs5090236 AT

Happy Mittal csz138233 AT

Shubham Gupta cs5090252 AT

Sudhanshu Sekhar cs5090255 AT

Yamuna Prasad yprasad AT


[Thu Oct 31]: Assignment 2, New Due Date: Monday Nov 4 (11:50 pm).

[Mon Sep 30]: Assignment 2 is out! Due Date: Thursday Oct 31 (11:50 pm).

[Fri Sep 27]: Assignment submission instructions have been updated (See

[Wed Sep 25]: Assignment 1 has been updated. New Due Date: Sunday Sep
29 (11:50 pm).

[Wed Sep 4]: The venue for the class on Thursday Sep 5 will be Bharti 101
(instead of WS 101).

[Sat Aug 10]: Assignment 1 is out! Due Date: Sunday Sep 15 (11:50 pm).

[Wed Jul 31]: The course website is up, finally!

Course Content
Week | Topic | Book Chapters | Supplementary Notes
1 | Introduction | Chapter 1 |
2,3 | Linear and Logistic Regression, Gaussian Discriminant Analysis | Chapter 3.1 | lin-log-reg.pdf, gda.pdf
4,5 | Support Vector Machines | Chapter 7.1 | svm.pdf
6 | Neural Networks | Chapter 4 | nnets.pdf nnets-hw.pdf
7 | Decision Trees | Chapter 3 | dtrees.pdf
8,9 | Naive Bayes, Bayesian Statistics | Mitchell, Chapter 6 | nb.pdf, bayes.pdf, Conjugate Prior model.pdf
10,11 | K-Means, Gaussian Mixture Models, EM | | kmeans.pdf gmm.pdf em.pdf
12 | PCA | | pca.pdf
13 | Learning Theory, Model Selection | Chapter 7 | theory.pdf model.pdf
14 | Application of ML to CrowdSourcing and NLP | | crowd-ml.pdf nlp-ml.pdf

Additional Reading

Induction of Decision Trees (Original Paper on the ID3 Algorithm by Ross


Beyond Independence: Conditions for the Optimality of the Simple Bayesian


A Unified Bias-Variance Decomposition for Zero-One and Squared Loss

Review Material
Topic Notes

Probability prob.pdf

Linear Algebra linalg.pdf

Gaussian Distribution gaussians.pdf

Convex Optimization (1) convex-1.pdf


1. Pattern Recognition and Machine Learning. Christopher Bishop. First Edition,
Springer, 2006.

2. Pattern Classification. Richard Duda, Peter Hart and David Stork. Second
Edition, Wiley-Interscience, 2000.

3. Machine Learning. Tom Mitchell. First Edition, McGraw-Hill, 1997.

Assignment Submission Instructions

1. You are free to discuss the problems with other students in the class. You
should include the names of the people you had a significant discussion with
in your submission.

2. All your solutions should be produced independently, without referring to any
discussion notes or to code written by someone else.

3. All the programming should be done in MATLAB. Include comments for


4. Code should be submitted using Sakai Page.

5. [Updated October 31, 2013]: Create a separate directory for each of the
questions named by the question number. For instance, for question 1, all
your submission files (code/graphs/write-up) should be put in the directory
named Q1 (and so on for other questions). Put all the Question sub-
directories in a single top level directory. This directory should be named as
"yourentrynumber_firstname_lastname". For example, if your entry number is
"2009anz7535" and your name is "Nilesh Pathak", your submission directory
should be named as "2009anz7535_nilesh_pathak". You should zip your
directory and name the resulting file as
"" e.g. in the above example it will
be "". This single zip file should be submitted

6. Honor Code: Any cases of copying will be awarded a zero on the
assignment. More severe penalties may follow.

7. Late Policy: You will lose 20% for each late day in submission. Submissions
are accepted at most 2 days late.
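
The directory layout described in instruction 5 can be sketched in a few lines. The snippet below is only an illustration in Python (the assignments themselves are to be written in MATLAB). The function names submission_dir_name and create_submission_layout, and the num_questions parameter, are hypothetical helpers that do not come from the course page. Lowercasing the name is inferred from the "Nilesh Pathak" example rather than stated as a rule.

```python
from pathlib import Path


def submission_dir_name(entry_number: str, first_name: str, last_name: str) -> str:
    """Build the top-level name in the form entrynumber_firstname_lastname.

    Lowercasing is an assumption inferred from the example in the
    instructions ("Nilesh Pathak" -> "nilesh_pathak").
    """
    parts = (entry_number, first_name, last_name)
    return "_".join(p.strip().lower() for p in parts)


def create_submission_layout(root: Path, entry_number: str, first_name: str,
                             last_name: str, num_questions: int) -> Path:
    """Create <root>/<top-level>/Q1 ... Q<num_questions>; return the top-level path."""
    top = root / submission_dir_name(entry_number, first_name, last_name)
    for q in range(1, num_questions + 1):
        # One sub-directory per question, named by the question number.
        (top / f"Q{q}").mkdir(parents=True, exist_ok=True)
    return top


if __name__ == "__main__":
    import tempfile

    # Build the layout in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        top = create_submission_layout(Path(tmp), "2009anz7535", "Nilesh", "Pathak", 3)
        print(top.name)
        print(sorted(p.name for p in top.iterdir()))
```

The sketch stops at creating the directories. The name of the zip file is not legible in the instructions as reproduced here, so the archive step is left to the reader.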


1. Assignment 2 New Due Date: 11:50 pm, Monday November 4, 2013.


o Problem 1:

o Problem 2:

o Problem 3:

2. Assignment 1. New Due Date: Sunday September 29, 2013.

o New Updated Version

o Original Version

Problem 1: q1x.dat q1y.dat

Problem 2: q2x.dat q2y.dat

IIT Kharagpur
Machine Learning (CS60050)
Instructor: Sourangshu Bhattacharya

Class Schedule: WED (09:30-10:30), THURS (08:30-09:30), FRI (10:30-11:30), FRI (11:30-


Classroom: CSE-108


First Meeting: Wednesday, 24th July, at 09:30 am in CSE-108.

Basic Principles: Introduction, The concept learning task. General-to-specific ordering of
hypotheses. Version spaces. Inductive bias. Experimental Evaluation: Over-fitting, Cross-
Supervised Learning: Decision Tree Learning. Instance-Based Learning: k-Nearest neighbor
algorithm, Support Vector Machines, Ensemble learning: boosting, bagging. Artificial Neural
Networks: Linear threshold units, Perceptrons, Multilayer networks and back-propagation.
Probabilistic Models: Maximum Likelihood Estimation, MAP, Bayes Classifiers, Naive Bayes.
Bayes optimal classifiers. Minimum description length principle. Bayesian Networks, Inference
in Bayesian Networks, Bayes Net Structure Learning.
Unsupervised Learning: K-means and Hierarchical Clustering, Gaussian Mixture Models, EM
algorithm, Hidden Markov Models.
Computational Learning Theory: probably approximately correct (PAC) learning. Sample
complexity. Computational complexity of training. Vapnik-Chervonenkis dimension.
Reinforcement Learning.


Tom Mitchell. Machine Learning. McGraw Hill, 1997.

Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer

Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification. John
Wiley & Sons, 2006.