
CS484: Introduction to Computer Vision

Machine Learning for Computer Vision


(Szeliski: 3, 4, 10)

Min H. Kim
KAIST School of Computing



MACHINE LEARNING FOR
COMPUTER VISION



Outline
• Machine Learning for Computer Vision
– Machine learning overview
– Supervised machine learning framework
– Generalization error



Machine Learning
• Learn from and make predictions on data.
• Arguably the greatest export from computing to other scientific fields.
• Statisticians might disagree with Computer Science on the true origins…



ML for Computer Vision
• Face Recognition
• Object Classification
• Scene Segmentation



UNSUPERVISED
MACHINE LEARNING
Acknowledgment: Many slides from James Hays, Derek Hoiem and Grauman & Leibe 2008 AAAI Tutorial
Dimensionality Reduction
• So we have decided we need fewer dimensions
• But which dimensions? There are infinitely many choices
• The dimension of the greatest variability!
• Why?

Slide credit: Shin Yoo
Dimensionality Reduction

• PCA, ICA, LLE, Isomap, etc.
• Principal component analysis (PCA)
– Creates a basis where the axes represent the dimensions of variance, ordered from high to low.
– Finds correlations in the data dimensions to produce the best possible lower-dimensional representation based on linear projections.



Dimension of Greatest Variability
• Example: reduce 2D data to 1D
[Figure: a 2D point cloud with two candidate projection directions, d1 and d2]
• Picking the direction d with the maximum variability reduces cases where points are close to each other in d but far away in the original dimensions
Slide credit: Shin Yoo
Principal Component Analysis (PCA)
• Defines a set of principal components, i.e., directions of different degrees of variance
– 1st: direction of the greatest variance in the data
– 2nd: direction of the second greatest variance in the data, perpendicular to the 1st one
– …
• Find n ≪ d components (dimensions), and project every data point onto these dimensions
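A minimal NumPy sketch of this idea (an illustration, not the lecture's code; the data `X` is a random stand-in):

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigendecomposition of the sample covariance.
    X: (num_samples, num_dims) data matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean                            # center the data
    cov = Xc.T @ Xc / (len(X) - 1)           # (d, d) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort: variance high to low
    W = eigvecs[:, order[:n_components]]     # top-n principal directions
    return mean, W

X = np.random.randn(200, 10)                 # stand-in data
mean, W = pca(X, n_components=2)
Z = (X - mean) @ W                           # project every point to 2D
```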

Eigenfaces
The AT&T face database (formerly the ORL database): 10 pictures each of 40 subjects



Eigenfaces

[Figure: the mean face and the basis of variance (eigenvectors)]

M. Turk and A. Pentland (1991). "Face recognition using eigenfaces". Proc. IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–591.

Slide credit: R.P.W. Duin
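A hedged sketch of the eigenfaces computation (the random `faces` array is a placeholder; real eigenfaces would use the actual AT&T face images):

```python
import numpy as np

faces = np.random.rand(400, 64 * 64)   # placeholder: 400 flattened face images

mean_face = faces.mean(axis=0)
A = faces - mean_face                  # centered faces, one per row
# SVD of the centered stack: rows of Vt are the eigenfaces
# (the basis of variance), ordered from high variance to low.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:50]                   # keep the top 50 directions

# A face is then represented by 50 projection coefficients.
coeffs = (faces[0] - mean_face) @ eigenfaces.T
reconstruction = mean_face + coeffs @ eigenfaces
```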
Clustering: image segmentation
Goal: break up the image into meaningful or perceptually similar regions



Segmentation for feature support or efficiency

[Figure: 50x50 pixel patches vs. irregular superpixels; Felzenszwalb and Huttenlocher 2004, Shi and Malik 2001, Hoiem et al. 2005, Mori 2005]

Superpixels!

Slide credit: Derek Hoiem
Segmentation as a result



GrabCut, Rother et al. 2004
Types of segmentations

• Oversegmentation
• Undersegmentation
• Hierarchical segmentations
Clustering

Group together similar ‘points’ and represent them with a single token.

Key challenges:
1) What makes two points/images/patches similar?
2) How do we compute an overall grouping from pairwise similarities?



Slide credit: Derek Hoiem
Why do we cluster?
• Summarizing data
– Look at large amounts of data
– Patch-based compression or denoising
– Represent a large continuous vector with the cluster
number
• Counting
– Histograms of texture, color, SIFT vectors
• Segmentation
– Separate the image into different regions
• Prediction
– Images in the same cluster may have the same labels

Slide credit: Derek Hoiem
How do we cluster?
• K-means (a sketch follows below)
– Iteratively re-assign points to the nearest cluster center
• Mean-shift clustering
– Estimate the modes of the probability density function
• Agglomerative clustering, spectral clustering, etc.
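A plain k-means sketch (illustrative; initialization and convergence handling are simplified):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Iteratively re-assign points to the nearest center,
    then move each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)            # nearest-center assignment
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # assignments have stabilized
            break
        centers = new_centers
    return labels, centers
```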



SUPERVISED
MACHINE LEARNING
Acknowledgment: Many slides from James Hays, Derek Hoiem and Grauman & Leibe 2008 AAAI Tutorial
Data, data, data!

• Halevy, Norvig & Pereira, “The Unreasonable Effectiveness of Data” (IEEE Intelligent Systems, 2009)
– “... invariably, simple models and a lot of data trump more elaborate models based on less data”
ImageNet

• Images for each category of WordNet
• 1,000 classes
• 1.2 million training images
• 100k test images
• Evaluated by top-5 error (computed as sketched below)
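The top-5 error metric can be computed roughly as below (a sketch; `scores` and `labels` are hypothetical arrays, not part of the slides):

```python
import numpy as np

def top5_error(scores, labels):
    """scores: (N, num_classes) classifier scores; labels: (N,) true class ids.
    A prediction counts as correct if the true class is among
    the five highest-scoring classes for that image."""
    top5 = np.argsort(scores, axis=1)[:, -5:]
    correct = (top5 == labels[:, None]).any(axis=1)
    return 1.0 - correct.mean()
```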
ImageNet Competition

• Krizhevsky et al., 2012
• Google, Microsoft, 2015
– Beat the best human score in the ImageNet challenge.

Slide credit: NVIDIA
Classification vs. Regression
• Function approximation
– Predictive modeling is the problem of developing a model from historical data to make predictions on new data where we do not have the answer.
• Classification predictive modeling
– The task of approximating a mapping function f from input variables X to discrete output variables y.
• Regression predictive modeling
– The task of approximating a mapping function f from input variables X to a continuous output variable y (the two are contrasted in the sketch below).
Slide credit: Jason Brownlee
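A toy contrast of the two settings (a sketch assuming scikit-learn is available; the data is synthetic):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.random.randn(100, 3)                   # synthetic input variables

# Classification: discrete output variable.
y_class = (X[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X, y_class)
class_pred = clf.predict(X[:2])               # discrete labels, e.g. [1, 0]

# Regression: continuous output variable.
y_reg = 2.0 * X[:, 0] + 0.1 * np.random.randn(100)
reg = LinearRegression().fit(X, y_reg)
reg_pred = reg.predict(X[:2])                 # real-valued predictions
```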
SUPERVISED MACHINE LEARNING
FRAMEWORK



Dataset split

• Training images: train the classifier
• Validation images: measure error; tune model hyperparameters
• Testing images: labels kept secret; measure error

Repeated random train/validation splits = cross-validation
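A minimal sketch of such a split (index bookkeeping only; the 60/20/20 proportions are an assumption, not from the slides):

```python
import numpy as np

N = 10_000                       # hypothetical number of labeled images
rng = np.random.default_rng(0)
idx = rng.permutation(N)

train = idx[:6_000]              # train the classifier
val   = idx[6_000:8_000]         # measure error, tune hyperparameters
test  = idx[8_000:]              # touch once, report final error
```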


Supervised Learning Framework
[Diagram]
Training: training images → image features (+ training labels) → training → learned classifier
Testing: test image → image features → apply classifier → prediction
Slide credit: D. Hoiem and L. Lazebnik
Features

• Raw pixels
• Histograms
• Templates
• SIFT and related descriptors
– GIST
– ORB
– HOG, …



Slide credit: L. Lazebnik
General Principles of Representation
• Coverage
– Ensure that all relevant information is captured
• Concision
– Minimize the number of features without sacrificing coverage
• Directness
– Ideal features are independently useful for prediction



Supervised Learning Framework
[Diagram repeated: the supervised training/testing pipeline shown earlier]
Slide credit: D. Hoiem and L. Lazebnik
Recognition task and supervision
• Label: images in the training set must be annotated with the “correct answer” that the model is expected to produce
[Example image labeled “contains a motorbike”]



Slide credit: L. Lazebnik
Spectrum of supervision (label)
From less to more supervision:
unsupervised → “weakly” supervised (e.g., ImageNet) → fully supervised (e.g., MS COCO)

‘Semi-supervised’: a small partial labeling. The boundaries are fuzzy; the definition depends on the task.


Slide credit: L. Lazebnik
Good training example?



Good labels?



http://mscoco.org/explore/?id=134918
Google guesses from the 1st caption
Google search: an elephant standing on top of a basket being held by a woman



Supervised Learning Framework
[Diagram repeated: the supervised training/testing pipeline shown earlier]
Slide credit: D. Hoiem and L. Lazebnik
The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:

f([apple image]) = “apple”
f([tomato image]) = “tomato”
f([cow image]) = “cow”
Slide credit: L. Lazebnik
The machine learning framework

f(x) = y
where f is the prediction function (or classifier), x is the image feature, and y is the output (label).

Training: given a training set of labeled examples {(x_1, y_1), …, (x_N, y_N)}, estimate the prediction function f by minimizing the prediction error on the training set.

Testing: apply f to an unseen test example x_u and output the predicted value y_u = f(x_u) to classify x_u.
Slide credit: L. Lazebnik
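One minimal choice of f is a nearest-neighbor classifier; a sketch with stand-in features (not the lecture's prescribed classifier):

```python
import numpy as np

def train_1nn(X_train, y_train):
    """'Training' for 1-NN just memorizes the labeled examples;
    f(x) returns the label of the closest training feature."""
    def f(x):
        i = np.linalg.norm(X_train - x, axis=1).argmin()
        return y_train[i]
    return f

X_train = np.random.randn(50, 128)      # stand-in image features x_1..x_N
y_train = np.random.randint(0, 3, 50)   # stand-in labels y_1..y_N
f = train_1nn(X_train, y_train)

x_u = np.random.randn(128)              # unseen test example
y_u = f(x_u)                            # predicted label y_u = f(x_u)
```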
Supervised Learning Framework
[Diagram repeated: the supervised training/testing pipeline shown earlier]
Slide credit: D. Hoiem and L. Lazebnik
Supervised Learning Framework
Features and distance measures define visual similarity.

Training labels dictate whether examples are the same or different.

Classifiers learn weights (parameters) over features and distance measures so that visual similarity predicts label similarity.



Least squares line fitting
• Suppose we want to find the relationship between weight and height measurements
• How can we define a linear relationship between these two measurements?
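A least-squares line fit for this weight-vs-height question (the numbers are toy data, assumed for illustration):

```python
import numpy as np

height = np.array([150., 160., 165., 170., 180., 185.])   # cm (toy data)
weight = np.array([50., 56., 61., 66., 75., 80.])          # kg (toy data)

# Model weight ≈ m * height + b; solve min ||A p - weight||^2 over p = [m, b].
A = np.stack([height, np.ones_like(height)], axis=1)
(m, b), *_ = np.linalg.lstsq(A, weight, rcond=None)

predicted = m * 172.0 + b    # predicted weight for a 172 cm person
```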



Linear regression
• Least squares
• Total least squares
• Maximum likelihood
• Random sample consensus (RANSAC); a sketch follows below
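A bare-bones RANSAC line fit (the `thresh` and `n_iters` parameters are illustrative choices, not values from the slides):

```python
import numpy as np

def ransac_line(x, y, n_iters=500, thresh=1.0, seed=0):
    """Repeatedly fit y = m*x + b to a random pair of points and
    keep the model supported by the most inliers."""
    rng = np.random.default_rng(seed)
    best_count, best_model = 0, (0.0, 0.0)
    for _ in range(n_iters):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue                      # degenerate (vertical) pair, skip
        m = (y[j] - y[i]) / (x[j] - x[i])
        b = y[i] - m * x[i]
        inliers = np.abs(y - (m * x + b)) < thresh
        if inliers.sum() > best_count:
            best_count, best_model = int(inliers.sum()), (m, b)
    return best_model
```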



Generalization

[Figure: training set (labels known) vs. test set (labels unknown)]

How well does a learned model generalize from the data it was trained on to a new test set?
Slide credit: L. Lazebnik
