CS-471
Machine Learning
Dr. Hammad Afzal
hammad.afzal@mcs.edu.pk
Resources
• Assignment: 10%
• Semester Project
• Syndicate Members: 1-3
• Will be announced after the 4th week
10/3/2022
Course Intro
• Pre-requisite
• Introductory knowledge of Probability, Statistics and Linear Algebra
• Course Resources
• Lecture slides, assignments (computer/written), solutions to
problems, research papers, projects, and announcements will be
uploaded on the LMS page.
Machine Learning
Machine Learning
• Learning = Improving with experience over some task
Machine Learning
• Nicolas learns about Apple and Oranges
Machine learning
• But will he recognize others?
Machine Learning
• There is no need to “learn” to calculate payroll
Machine learning
Applications
Credit Scoring
• Differentiating between low-risk and high-risk customers
from their income and savings
Applications
Autonomous driving
• ALVINN* – Drives 70mph on highways
Face recognition
Training examples of a person
Test images
Template matching
• Problem: Recognize letters A to Z
Template Matching
The bitmap is represented by a 12x12 matrix, or equivalently by a
144-dimensional vector, with 0/1 coordinates.
0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 0 1 1 0 0 0
0 0 0 0 1 1 0 1 1 0 0 0
0 0 0 0 1 0 0 0 1 0 0 0
0 0 0 1 1 0 0 0 1 1 0 0
0 0 0 1 1 0 0 0 1 1 0 0
0 0 1 1 1 1 1 1 1 1 1 0
0 0 1 1 0 0 0 0 0 1 1 0
0 1 1 0 0 0 0 0 0 1 1 0
0 1 1 0 0 0 0 0 0 0 1 1
1 1 0 0 0 0 0 0 0 0 1 1
Template matching
Training samples – templates with their corresponding class:
t1 = { (0,0,0,0,1,1,...,0), 'A' }
t2 = { (0,0,0,0,0,1,...,0), 'A' }
...
tk = { (0,0,1,1,1,1,...,0), 'B' }
...
Template of the image to be recognized:
T = { (0,0,0,0,1,1,...,0), 'A?' }
Algorithm:
1. Find ti such that ti = T.
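The matching step can be sketched in Python. This is a toy illustration, not the slide's algorithm verbatim: instead of requiring an exact match ti = T, it picks the template with the smallest Hamming distance, and the 8-bit templates and labels are made up.

```python
# Toy template-matching classifier: choose the stored template closest
# to the input bitmap (Hamming distance), and return its class label.

def hamming(a, b):
    """Number of coordinates where two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def classify(templates, pattern):
    """Return the label of the template nearest to `pattern`."""
    best = min(templates, key=lambda t: hamming(t[0], pattern))
    return best[1]

# Hypothetical miniature templates (8 bits instead of 144)
templates = [
    ((0, 0, 0, 0, 1, 1, 0, 0), "A"),
    ((0, 0, 0, 0, 0, 1, 1, 0), "A"),
    ((0, 0, 1, 1, 1, 1, 1, 0), "B"),
]

print(classify(templates, (0, 0, 0, 0, 1, 1, 1, 0)))  # nearest template is an 'A'
```

Allowing the nearest (rather than an identical) template is the natural first improvement when the input bitmap is noisy.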
Template Matching
Improvements?
Features
• Features are the individual measurable properties of the signal being
observed.
Features
[Scatter plot: Class 1 and Class 2 samples in the (height, weight)
feature space; feature vector x = (x1, x2)]
Feature Extraction
• Feature extraction aims to create discriminative
features good for learning
• Good Features
• Objects from the same class have similar feature
values.
• Objects from different classes have different values.
Features
• Use fewer features if possible
• Use features that differentiate classes well
Contents
• Supervised learning
• Classification
• Regression
• Unsupervised learning
• Reinforcement learning
CLASSIFICATION
Classification
Apples Oranges
Classification
• You had some training examples,
or ‘training data’
What is this???
Classification
Apple
Pear
Tomato
Cow
Dog
Horse
1. Find the template closest to the input pattern.
2. Classify the pattern to the same class as the closest template.
[Figure: Class 1 and Class 2 templates in feature space]
Classifier
[Figure: feature space partitioned into Class 1 and Class 2 regions]
Classifier
A classifier partitions the feature space X into class-labeled
regions such that
X = X1 ∪ X2 ∪ … ∪ X|Y|  and  Xi ∩ Xj = ∅ for i ≠ j.
Classification consists of determining which region a feature
vector x belongs to.
Borders between regions are called decision boundaries.
Classification
• Cancer Diagnosis – Tumor size for prediction
[Plot: tumor size on the x-axis; each patient labeled Malignant or Benign]
Classification
• Cancer Diagnosis – Generally more than one variable
[Plot: tumor size vs. age; each patient labeled Malignant or Benign]
Why supervised – the algorithm is given a number of patients
with the RIGHT ANSWER, and we want the algorithm to learn to
predict for new patients
Classification
• Cancer Diagnosis – Generally more than one variable
[Plot: tumor size vs. age; a new patient to predict for]
Predict for this patient: Malignant or Benign?
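One minimal way to act on such labeled patients is a nearest-class-mean classifier: average the (tumor size, age) vectors of each class, then assign a new patient to the class with the closest mean. This is only an illustrative sketch with invented numbers, not the method the course will develop.

```python
# Nearest-mean classifier on two features (tumor size, age).
# All numbers are made up; the labels are the "right answers"
# supplied in supervised learning.

def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def predict(x, centroids):
    """Assign x to the class whose mean is closest (squared Euclidean)."""
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(centroids, key=lambda label: d2(x, centroids[label]))

training = {
    "benign":    [(1.0, 30), (1.5, 40), (2.0, 35)],
    "malignant": [(4.0, 60), (5.0, 55), (4.5, 65)],
}
centroids = {label: mean(pts) for label, pts in training.items()}

print(predict((4.2, 58), centroids))  # closer to the malignant mean
```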
Contents
• Supervised learning
• Classification
• Regression
• Unsupervised learning
• Reinforcement learning
Course Outline
• Machine Learning: Theory and Applications
• Introduction to probability theory and Linear Algebra
• Bayesian Decision Theory
• Parametric Methods
• Dimensionality Reduction
• Frequent Pattern Analysis
• Clustering
• Decision Trees
• Artificial neural networks
• Advanced topics in Machine Learning: HMMs, Support
Vector Machines (SVM), …
REGRESSION
Regression
CLASSIFICATION
The variable we are trying to predict is
DISCRETE
REGRESSION
The variable we are trying to predict is
CONTINUOUS
Regression
• Dataset giving the living areas and prices of 50
houses
Regression
• We can plot this data
Regression
• The “input” variables – x^(i) (living area in this example)
• The “output” or target variable that we are trying to predict – y^(i) (price)
• A pair (x^(i), y^(i)) is called a training example
• A list of m training examples {(x^(i), y^(i)); i = 1, . . . , m} is called a training set
• X denotes the space of input values, and Y the space of output values
Regression
Given a training set, to learn a function h : X → Y so
that h(x) is a “good” predictor for the corresponding
value of y. For historical reasons, this function h is
called a hypothesis.
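One concrete hypothesis class is the straight line h(x) = w0 + w1·x, fitted by ordinary least squares. The sketch below uses a tiny, made-up set of (living area, price) pairs purely for illustration.

```python
# Fit the hypothesis h(x) = w0 + w1*x by ordinary least squares
# on a tiny, invented (living area, price) training set.

def fit_line(xs, ys):
    """Closed-form least-squares fit of a line to 1-D data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    w0 = my - w1 * mx
    return w0, w1

areas  = [1000, 1500, 2000, 2500]   # x^(i): living area
prices = [200, 300, 400, 500]       # y^(i): price (exactly linear toy data)
w0, w1 = fit_line(areas, prices)
h = lambda x: w0 + w1 * x           # the learned hypothesis
print(h(1800))                      # close to 360 on this linear toy data
```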
Regression
• Example: Price of a
used car
• x: car attributes
• y: price
Contents
• Supervised learning
• Classification
• Regression
• Unsupervised learning
• Reinforcement learning
CLUSTERING
UNSUPERVISED LEARNING
• CLUSTERING
UNSUPERVISED LEARNING
• CLUSTERING
The data was not ‘labeled’: you did
not tell Nicolas which are apples and
which are oranges
Groups - Clusters
Clustering
[Plot: tumor size vs. age; unlabeled patients forming two clusters]
We have the data for patients but NOT the RIGHT ANSWERS.
The objective is to find interesting structures in data (in this
case two clusters)
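A standard way to find such structures is k-means clustering (one common choice; the slides do not name an algorithm). A minimal sketch on invented, unlabeled (tumor size, age) points:

```python
import random

# Minimal k-means sketch (k = 2) on made-up (tumor size, age) points.
# No labels are given: the algorithm finds the two groups by itself.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize at random points
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: (p[0] - centers[i][0]) ** 2
                                + (p[1] - centers[i][1]) ** 2)
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        for i, c in enumerate(clusters):
            if c:
                centers[i] = (sum(p[0] for p in c) / len(c),
                              sum(p[1] for p in c) / len(c))
    return centers

points = [(1.0, 30), (1.2, 32), (0.8, 28),   # one compact group
          (5.0, 60), (5.2, 62), (4.8, 58)]   # another compact group
centers = sorted(kmeans(points, 2))
print(centers)
```

On this toy data the two recovered centers land on the means of the two groups, i.e. roughly (1.0, 30) and (5.0, 60).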
Source: http://cnl.salk.edu/~tewon/Blind/blind_audio.html
Classification vs Clustering
• Challenges
• Intra-class variability
• Inter-class similarity
Intra-class variability: the same face under different expression, pose, and illumination
Inter-class similarity: identical twins
Contents
• Supervised learning
• Classification
• Regression
• Unsupervised learning
• Reinforcement learning
REINFORCEMENT
LEARNING
Reinforcement Learning
• In RL, the computer is simply given a goal to achieve.
• The computer then learns how to achieve that goal by
trial-and-error interactions with its environment
Reinforcement Learning
• Similar to training a pet dog
• The RL system begins riding the bicycle and performs a series of actions that
result in the bicycle being tilted 45 degrees to the right
• At this point two actions are possible: turn the handlebars left or turn them right.
• The RL system turns the handlebars to the left, immediately crashes to the
ground, and receives a negative reinforcement.
• The RL system has just learned not to turn the handlebars left when tilted 45
degrees to the right
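The anecdote can be sketched as tabular Q-learning (an assumed formulation, since the slides do not name an algorithm): a single invented state, two actions, and a negative reward for the crashing action.

```python
import random

# Tiny tabular Q-learning sketch of the bicycle anecdote: one state
# ("tilted 45 degrees right"), two actions, reward -1 for crashing
# after "left" and 0 after "right". All values are invented.

Q = {("tilted_right", "left"): 0.0, ("tilted_right", "right"): 0.0}
alpha = 0.5  # learning rate

def reward(action):
    return -1.0 if action == "left" else 0.0

rng = random.Random(0)
for _ in range(100):                    # trial-and-error episodes
    a = rng.choice(["left", "right"])   # explore both actions
    # One-step Q update (the episode ends immediately, no next state)
    Q[("tilted_right", a)] += alpha * (reward(a) - Q[("tilted_right", a)])

best = max(["left", "right"], key=lambda a: Q[("tilted_right", a)])
print(best)  # "right": the system has learned not to turn left
```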
A fancy PR Example
Test mode:     test pattern → Preprocessing → Feature Measurement → Classification
Training mode: training pattern → Preprocessing → Feature Extraction/Selection → Learning
A Fancy problem
Sorting incoming fish on a conveyor according to
species (salmon or sea bass) using optical sensing
It is a classification
problem. How to
solve it?
Approach
Data Collection: Take some
images using optical sensor
Approach
• Data collection
Approach
• Set up a camera and take some sample images to extract features
• Length
• Lightness
• Width
• Number and shape of fins
• Position of the mouth, etc…
• This is the set of all suggested features to explore for use in our
classifier!
• Test data: used to estimate the classification error of the chosen learner on
unseen data, called the generalization error. The test set must be kept inside a
‘vault’ and brought out only at the end of data analysis
Pre-processing
• If data is an image then apply image processing
• What is an image?
• A gray scale image z = f(x,y) is composed of pixels where x & y
are the location of the pixel and z is its intensity
• Image can be considered just a matrix of certain dimensions
A = [ a11 ... a1n ]
    [ ...      ... ]
    [ am1 ... amn ]
The image is divided into 8x8 blocks.
Feature extraction
• Feature extraction: use domain knowledge
• The sea bass is generally longer than a salmon
• The average lightness of sea bass scales is greater than that of salmon
• Length of fish and average lightness may not be sufficient features i.e.
they may not guarantee 100% classification results
Classification – Option 1
• Select the length of the fish as a possible feature
for discrimination between two classes
Decision
Boundary
Evaluation of a classifier
• How to evaluate a certain classifier?
Classification – option 2
• Select the average lightness of the fish as a
possible feature for discrimination between two
classes
Histograms for the average lightness feature for the two categories
Classification – Option 3
Feature vector: x = (x1, x2)
• Use both length and average lightness features for
classification. Use a simple line to discriminate
Decision
Boundary
The two features of lightness and width for sea bass and salmon. The dark line
might serve as a decision boundary of our classifier
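A linear decision boundary over two features can be sketched as follows. The weights here are hand-picked for illustration, not learned from data, and the feature values are invented.

```python
# Classify a fish by which side of the line
#   0.5*length + 1.0*lightness - 10 = 0
# it falls on. Boundary coefficients are hypothetical, not learned.

def classify_fish(length, lightness):
    score = 0.5 * length + 1.0 * lightness - 10.0
    return "sea bass" if score > 0 else "salmon"

print(classify_fish(14.0, 6.0))  # above the line -> "sea bass"
print(classify_fish(8.0, 3.0))   # below the line -> "salmon"
```

Learning such a classifier amounts to choosing the line's coefficients from the training samples instead of by hand.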
Classification – Option 3
• Use both length and average lightness features for
classification. Use a complex model to discriminate
Overly complex models for the fish will lead to complicated decision
boundaries. While such a boundary may achieve perfect classification
(classification error is zero) on our training samples, it would lead to
poor performance on future patterns (poor generalization): overfitting
Comments
• Model selection
• A complex model does not seem to be the correct one. It is learning the
training data by heart.
• Generalization error
• Minimizing the classification error on the training set does not
guarantee minimizing the classification error on the test set
(generalization error)
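The gap between training and test error can be illustrated with a 1-nearest-neighbour "memorizer" on synthetic noisy data: by construction it scores perfectly on its own training set, yet that says nothing about held-out points. The data-generating rule and noise level are invented for this sketch.

```python
import random

# 1-NN memorizes its training set (zero training error by construction),
# but label noise means unseen points are still misclassified.
# Synthetic data: label = 1 iff x > 0.5, with 20% of labels flipped.

def one_nn(train, x):
    """Predict the label of the training point nearest to x."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(42)

def sample(n):
    data = []
    for _ in range(n):
        x = rng.random()
        y = (x > 0.5) != (rng.random() < 0.2)   # 20% flipped labels
        data.append((x, int(y)))
    return data

train, test = sample(50), sample(50)
train_err = sum(one_nn(train, x) != y for x, y in train) / len(train)
test_err  = sum(one_nn(train, x) != y for x, y in test) / len(test)
print(train_err)   # 0.0 by construction
print(test_err)    # typically well above zero on unseen points
```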
Classification – Option 3
• Decision boundary with good generalization
The decision boundary shown might represent the optimal tradeoff between
performance on the training set and simplicity of the classifier.
RESOURCES
Resources - Journals
• Journal of Machine Learning Research
• Machine Learning
• Pattern Recognition
• Pattern Recognition Letters
• Neural Computation
• Neural Networks
• IEEE Transactions on Neural Networks
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• ...
Resources – Conferences
• International Conference on Machine Learning (ICML)
• ...
Acknowledgements
The material in these slides has been taken from the following sources:
• Dr. Imran Siddiqi: Bahria University, Islamabad
• Machine Intelligence, Dr M. Hanif, UET, Lahore
• Machine Learning, S. Stock, University of Nebraska
• Lecture Slides, Introduction to Machine Learning,
E. Alpaydin, MIT Press.
• Machine Learning, Andrew Ng – Stanford
University
• Fisher kernels for image representation &
generative classification models, Jakob Verbeek
Courtesy
• Slides are prepared using material from Website of
• Jiawei Han, Micheline Kamber, and Jian Pei
• University of Illinois at Urbana-Champaign & Simon Fraser
University.
Thank You