
Module 1: Introduction to Machine Learning
Introduction

Google Search Engine

Amazon

E-mail
What is Machine Learning?
Machine Learning
Learn from past experiences
Improve the performance of intelligent programs

Definition (Mitchell 1997)

A computer program is said to learn from
experience E with respect to some class of tasks
T and performance measure P if its
performance at tasks in T, as measured by P,
improves with experience E.
The concept of learning in a ML system

• Learning = Improving performance with experience at some task
– Improve over task T,
– With respect to performance measure P,
– Based on experience E.

Motivating Example: Learning to Filter Spam

Example: Spam Filtering


Spam: all email the user does not want to receive and has not asked to receive
T: Identify spam emails
P:
% of spam emails that were filtered
% of ham (non-spam) emails that were incorrectly filtered out
E: A database of emails that were labelled by users
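
The two percentages in P can be computed directly from labelled emails. The sketch below is only an illustration: the function name, the True/False encoding (True = spam) and the ten example labels are assumptions made here, not part of the slides.

```python
def spam_filter_performance(true_labels, predicted_labels):
    """Return (% of spam filtered, % of ham incorrectly filtered out).

    Labels are booleans: True = spam, False = ham (an assumed encoding).
    """
    pairs = list(zip(true_labels, predicted_labels))
    spam_total = sum(1 for t, _ in pairs if t)
    ham_total = len(pairs) - spam_total
    spam_caught = sum(1 for t, p in pairs if t and p)   # spam that was filtered
    ham_lost = sum(1 for t, p in pairs if not t and p)  # ham wrongly filtered
    pct_spam = 100.0 * spam_caught / spam_total if spam_total else 0.0
    pct_ham = 100.0 * ham_lost / ham_total if ham_total else 0.0
    return pct_spam, pct_ham


# Made-up experience E: 4 spam and 6 ham emails, with the filter's decisions.
true_labels = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

spam_pct, ham_pct = spam_filter_performance(true_labels, predicted)
print(f"spam filtered: {spam_pct:.1f}%, ham wrongly filtered: {ham_pct:.1f}%")
```

A filter that learns would be expected to push the first number up and the second down as more labelled emails become available.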
What is Machine Learning?

Traditional Programming
Data + Program → Computer → Output

Machine Learning
Data + Output → Computer → Program
Real Time Applications

Facebook

Gmail

PayPal

Google Maps

Uber
Machine Learning Algorithms
Supervised Learning
– Classification
– Regression
Unsupervised Learning
– Association Analysis
Other
– Reinforcement Learning
Supervised Learning: Classification

Example: Breast cancer
Differentiating tumors as malignant or benign from the patient's age and tumor size
[Figure: malignant and benign tumors plotted by tumor size (horizontal axis) and age (vertical axis)]

Discriminant: IF age > θ1 AND tumor_size > θ2 THEN malignant ELSE benign
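
The discriminant is a two-threshold rule, so it can be written out in a few lines. This is a minimal sketch: the threshold values and the four patients are hypothetical and chosen only to exercise both branches. In a learning setting, θ1 and θ2 would be fitted from labelled cases rather than typed in by hand.

```python
def classify_tumor(age, tumor_size, theta1, theta2):
    """IF age > theta1 AND tumor_size > theta2 THEN malignant ELSE benign."""
    return "malignant" if age > theta1 and tumor_size > theta2 else "benign"


# Hypothetical thresholds (illustration only, not clinical values).
THETA1, THETA2 = 50, 2.0

# Hypothetical patients as (age, tumor_size) pairs.
patients = [(62, 3.1), (62, 1.2), (35, 3.1), (35, 1.2)]
predictions = [classify_tumor(a, s, THETA1, THETA2) for a, s in patients]
print(predictions)
```

Only the patient who exceeds both thresholds is labelled malignant; exceeding just one is not enough under this rule.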
Classification Applications

Pattern recognition
Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style
Character recognition: Different handwriting styles
Speech recognition: Temporal dependency
Use of a dictionary or the syntax of the language
Sensor fusion: Combine multiple modalities, e.g., visual (lip image) and acoustic for speech
Medical diagnosis: From symptoms to illnesses
Web advertising: Predict whether a user clicks on an ad on the Internet
Supervised Learning: Applications

Prediction of future cases: Use the rule to predict the output for future inputs

Knowledge extraction: Learning a rule from data

Compression: The rule is simpler than the data it explains

Outlier detection: Exceptions that are not covered by the rule, e.g., fraud

Novelty detection: Previously unseen but valid cases


Supervised Learning: Regression
Example: Price of a house
x: house attributes
y: price
y = g(x | θ)
g( ): model
θ: parameters
[Figure: house price ($), 0 to 400, plotted against size in feet², 0 to 2500]
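
One simple choice for g(x | θ) is a straight line, g(x | θ) = θ0 + θ1·x, with house size as the only attribute. The slides do not commit to a particular model, so the linear form, the ordinary least-squares fit and the five data points below are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = theta0 + theta1 * x (one input attribute)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx                 # slope
    theta0 = mean_y - theta1 * mean_x  # intercept
    return theta0, theta1


def g(x, theta):
    """The model g(x | theta): predicted price for a house of size x."""
    theta0, theta1 = theta
    return theta0 + theta1 * x


# Made-up training data: size in square feet and price in arbitrary units.
sizes = [500, 1000, 1500, 2000, 2500]
prices = [100, 175, 250, 325, 400]

theta = fit_line(sizes, prices)
predicted_price = g(1200, theta)
print(theta, predicted_price)
```

Because the made-up points lie exactly on a line, the fit recovers an intercept near 25 and a slope near 0.15; real price data would leave residual error.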
Unsupervised Learning

Learning “What normally happens”


No output
Clustering: Grouping similar instances
Other applications: Summarization, Association Analysis
Example applications
Customer segmentation in CRM
Image compression: Color quantization
Bioinformatics: Learning motifs
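
Clustering can be sketched with k-means, one common algorithm for grouping similar instances. The slides do not name an algorithm, so k-means, the seeding with the first k points and the six 2-D points below are assumptions for illustration. Note that no output labels are supplied: the groups come from the inputs alone.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (px, py) in enumerate(points):
            distances = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centroids]
            assignment[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assignment, centroids


# Made-up instances forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

In color quantization the same procedure would run on pixel colors, with each pixel then replaced by the centroid of its cluster.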
Learning Associations

Basket analysis:
P (Y | X ) probability that somebody who buys X
also buys Y where X and Y are products/services.

Example: P ( Bread | Milk ) = 0.6


Market-Basket transactions
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
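
P(Y | X) can be estimated from the transactions by counting: the number of baskets containing both X and Y, divided by the number containing X. The sketch below applies this to the five transactions listed above; the function names are illustrative. On this table the estimate of P(Bread | Milk) is 3/4, while the fraction of all baskets containing both Bread and Milk is 3/5, so the 0.6 quoted in the example is presumably an illustrative value (or the joint fraction) rather than the conditional estimate from this table.

```python
# The five market-basket transactions from the table.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)


def conditional_probability(y, x, baskets):
    """Estimate P(Y | X) as count(X and Y) / count(X)."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(1 for b in with_x if y in b) / len(with_x)


p_bread_given_milk = conditional_probability("Bread", "Milk", transactions)
joint_bread_milk = support({"Bread", "Milk"}, transactions)
print(p_bread_given_milk, joint_bread_milk)
```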
Supervised vs Unsupervised Learning
[Figure: two scatter plots on axes x1 and x2, one panel titled "Supervised Learning" and the other "Unsupervised Learning"]
Unsupervised Learning: Applications

Document grouping
Clustering genes of individuals
Organizing computing clusters
Social network analysis
Market segmentation
Reinforcement Learning

Topics:
Policies: what actions should an agent take in a particular situation
Utility estimation: how good is a state (used by policy)
No supervised output but delayed reward
Credit assignment problem (what was responsible for the outcome)
Applications:
Game playing
Robot in a maze
Multiple agents,
partial
Terminologies
Examples: Items or instances of data used for
learning or evaluation.
Features: The set of attributes associated with an
example.
Labels: Values or categories assigned to
examples.
Training sample: Examples used to train a
learning algorithm.
Validation sample: Examples used to tune the
parameters of a learning algorithm.
Test sample: Examples used to evaluate the
performance of a learning algorithm.
Loss function: A function that measures the
difference, or loss, between a predicted label
and a true label.
Hypothesis set: A set of functions mapping
features (feature vectors) to the set of labels Y.
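
The terms above fit together as follows, in a minimal sketch. Everything concrete here is an assumption for illustration: the one-feature examples, the 6/2/2 split, the zero-one loss and the hypothesis set of three threshold functions.

```python
def zero_one_loss(predicted, true):
    """Loss function: 1 if the predicted label differs from the true label, else 0."""
    return 0 if predicted == true else 1


def average_loss(h, sample):
    """Mean loss of hypothesis h over a sample of (feature, label) examples."""
    return sum(zero_one_loss(h(x), y) for x, y in sample) / len(sample)


def make_hypothesis(t):
    """One member of the hypothesis set: predict label 1 when the feature exceeds t."""
    return lambda x: 1 if x > t else 0


# Made-up examples: each is (feature, label).
examples = [(0.5, 0), (1.0, 0), (1.5, 0), (2.5, 1), (3.0, 1), (3.5, 1),
            (0.8, 0), (2.8, 1), (1.2, 0), (3.2, 1)]
training, validation, test = examples[:6], examples[6:8], examples[8:]

# Hypothesis set: three candidate threshold functions.
thresholds = [1.0, 2.0, 3.0]
hypothesis_set = {t: make_hypothesis(t) for t in thresholds}

# Training: pick the hypothesis with the lowest loss on the training sample.
best_t = min(thresholds, key=lambda t: average_loss(hypothesis_set[t], training))
best_h = hypothesis_set[best_t]

# Validation loss is what one would compare when tuning algorithm settings;
# test loss is the final estimate of performance on unseen examples.
validation_loss = average_loss(best_h, validation)
test_loss = average_loss(best_h, test)
print(best_t, validation_loss, test_loss)
```

Keeping the test sample out of both training and tuning is what makes its loss a fair estimate of performance on new examples.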
