0 Up votes0 Down votes

6 views75 pagestry

Mar 19, 2018

© © All Rights Reserved

PDF, TXT or read online from Scribd

try

© All Rights Reserved

6 views

try

© All Rights Reserved

- Clustering_via_KMeans_and_Kohonen_SOM
- Project Description
- Efficient Classification of Data Using Decision Tree
- kmeans __ Functions (Statistics Toolbox™)
- An Entropy-based Feature in Epileptic Seizure Prediction Algorithm
- Survey Detection of Crop Diseases Using Multiscaling Technique
- IJCSS-Volume13 2014 Edition1 FilipcicClassificationoftopmaletennisplayers
- clustering_kmeans
- 05442732
- 161-549-1-PB
- psyc993_09
- kmeans_em
- Cluster.ppt
- LSD1211 - Semi supervised Biased Maximum Margin Analysis.doc
- Cluster Analysis
- A6
- Implementing and compiling clustering using Mac Queens alias K-means apriori algorithm
- Data Mining Sem 2 Notes
- Ma Mru Dm Chapter11
- Spell Man Result

You are on page 1of 75

Cordelia Schmid

Category recognition

• Image classification: assigning a class label to the image

Car: present

Cow: present

Bike: not present

Horse: not present

…

Category

Tasks

recognition

• Image classification: assigning a class label to the image

Car: present

Cow: present

Bike: not present

Horse: not present

…

• Object localization: define the location and the category

L

Location

ti

Car Cow

Category

Difficulties: within object variations

Within-object variations

Difficulties: within class variations

Image classification

• Given

Positive training images containing an object class

• Classify

A test image as to whether it contains the object class or not

?

Bag-of-features

Bag of features – Origin: texture recognition

or textons

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001;

Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Bag-of-features

Bag of features – Origin: texture recognition

histogram

Julesz, 1981; Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001;

Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003

Bag-of-features

Bag of features – Origin: bag

bag-of-words

of words (text)

• Orderless document representation: frequencies of words

from a dictionary

• Classification to determine document categories

Bag-of-words

Co

Commono 2 0 1 3

People 3 0 0 2

Sculpture 0 1 3 0

… … … … …

Bag-of-features

Bag of features for image classification

SVM

descriptors and frequencies matrix

[Zhang,Marszalek,Lazebnik&Schmid,IJCV’07]

Bag-of-features

Bag of features for image classification

SVM

descriptors and frequencies matrix

Step 1 Step 2 Step 3

Step 1: feature extraction

• Scale-invariant

Scale invariant image regions + SIFT (see previous lecture)

– Affine invariant regions give “too” much invariance

– Rotation invariance for many realistic collections “too”

too much

invariance

• Dense descriptors

– Improve results in the context of categories (for most categories)

– Interest

I t t points

i t do

d nott necessarily

il capture

t “all”

“ ll” features

f t

• Color-based

Color based descriptors

• Shape-based

Shape based descriptors

Dense features

Computation of the SIFT descriptor for each grid cells

-Computation

-Exp.: Horizontal/vertical step size 3 pixel, scaling factor of 1.2 per level

Bag-of-features

Bag of features for image classification

SVM

descriptors and frequencies matrix

Step 1 Step 2 Step 3

Step 2: Quantization

Visual vocabulary

Clustering

Examples

p for visual words

Airplanes

Motorbikes

Faces

Wild Cats

Leaves

People

Bikes

Step 2: Quantization

• Cluster descriptors

– K-means

– Gaussian mixture model

• Assign

g each visual word to a cluster

– Hard or soft assignment

K-means

K means clustering

• Minimizing

g sum of squared

q Euclidean distances

between points xi and their nearest cluster centers

• Algorithm:

– Randomly y initialize K cluster centers

– Iterate until convergence:

• Assign each data point to the nearest center

• R

Recomputet eachh cluster

l t center t as th

the mean off allll points

i t

assigned to it

Gaussian mixture model (GMM)

• Mixture of Gaussians: weighted sum of Gaussians

where

ee

Hard or soft assignment

• K-means

K means hard assignment

– Assign to the closest cluster center

– Count number of descriptors assigned to a center

g

– Estimate distance to all centers

– Sum over number of descriptors

cy

frrequenc Image representation

…..

codewords

normalization with L1/L2 norm

• fine grained – represent model instances

• coarse grained – represent object categories

Bag-of-features

Bag of features for image classification

SVM

descriptors and frequencies matrix

Step 1 Step 2 Step 3

Step 3: Classification

bag of

features representations of images to different classes

Decision Zebra

boundary

Non-zebra

Training data

Vectors are histograms, one from each training image

positive negative

Train classifier,e.g.SVM

Linear classifiers

• Find linear function (hyperplane) to separate positive and

negative

i examples l

x i positive : xi w b 0

x i negative : xi w b 0

Which hyperplane

is best?

Linear classifiers - margin

x2

(color)

• G

Generalization

li ti iis nott

good in this case:

x1 (roundness)

x2

(color)

• Better if a margin

is introduced: b/|w|

x1 (roundness)

Nonlinear SVMs

• Datasets that are linearly separable work out great:

0 x

0 x

higher-dimensional

dimensional space:

x2

0 x

Nonlinear SVMs

• General idea: the original input space can always be

mapped to some higher-dimensional feature space

where the training set is separable:

Φ: x → φ(x)

Nonlinear SVMs

transformation φ(x), define a kernel function K such that

K(xi ,xj ) = φ(xi ) · φ(xj)

j

feature

eatu e space:

space

y K ( x , x) b

i

i i i

Kernels for bags of features

N

• Histogram intersection kernel: I (h1 , h2 ) min(h (i), h (i))

i 1

1 2

1 2

K (h1 , h2 ) exp D(h1 , h2 )

A

• D can be Euclidean distance RBF kernel

• D can be χ2 distance

N

D(h1 , h2 )

h1 (i) h2 (i) 2

i 1 h1 (i ) h2 (i )

Combining features

•SVM with multi-channel chi-square kernel

1 m

Dc ( H1 , H 2 )

2

i 1

[ ( h1i h2i ) 2

(h1i h2i )]

Kernel Learning (MKL)

[J. Zhang, M. Marszalek, S. Lazebnik and C. Schmid. Local features and kernels for

classification of texture and object categories: a comprehensive study, IJCV 2007]

Combining features

• For linear SVMs

– Early fusion: concatenation the descriptors

– Late fusion: learning weights to combine the classification scores

• In p

practice late fusion g

give better results

– In particular if different modalities are combined

Multi-class

Multi class SVMs

• Various direct formulations exist

exist, but they are not widely

used in practice. It is more common to obtain multi-class

SVMs by combining two-class

two class SVMs in various ways

– Training: learn an SVM for each class versus the others

– Testing: apply each SVM to test example and assign to it the

class of the SVM that returns the highest decision value

• One

O versus one:

– Training: learn an SVM for each pair of classes

– Testing: each learned SVM “votes”

votes for a class to assign to the test

example

Why does SVM learning work?

Illustration

Correct − Image: 35 Correct − Image: 37

20 20

40 40

60 60

80 80

100 100

120 120

20 20

40 40

60 60

80 80

100 100

120 120

Illustration

A linear SVM trained from positive and negative window descriptors

+ lie on object boundary (= local shape structures common to many training exemplars)

Bag-of-features

Bag of features for image classification

• Excellent results in the presence of background clutter

Examples for misclassified images

Bag of visual words summary

• Advantages:

– largely unaffected by position and orientation of object in image

– fixed length vector irrespective of number of detections

– veryy successful in classifying

y g images

g according g to the objects

j they

y

contain

• Disadvantages:

– no explicit use of configuration of visual word positions

– no model of the object location

Evaluation of image classification

• PASCAL VOC [05

[05-12]

12] datasets

– Training and test dataset available

– Used to report state

state-of-the-art

of the art results

– Collected January 2007 from Flickr

– 500 000 images downloaded and random subset selected

– 20 classes

– Class labels per image + bounding boxes

– 5011 ttraining

i i iimages, 4952 ttestt iimages

PASCAL 2007 dataset

PASCAL 2007 dataset

Evaluation

Precision/Recall

Results for PASCAL 2007

• Winner of PASCAL 2007 [[Marszalek et al.]] : mAP 59.4

– Combination of several different channels (dense + interest

points, SIFT + color descriptors, spatial grids)

– Non-linear

N li SVM with

ith G

Gaussian

i kkernell

al. 2009] : mAP 62

62.2

2

– Combination of several features

– Group-based

p MKL approach

pp

[Harzallah et al.’09] : mAP 63.5

– Use detection results to improve classification

Spatial pyramid matching

• Add spatial information to the bag

bag-of-features

of features

• Perform

P f matching

t hi ini 2D iimage space

Related work

Similar approaches:

Subblock description [Szummer & Picard, 1997]

SIFT [Lowe, 1999]

GIST [Torralba et al., 2003]

SIFT Gist

(1999, 2004) Torralba et al

al. (2003)

Spatial pyramid representation

Locally orderless

representation

i at

several levels of

spatial resolution

level 0

Spatial pyramid representation

Locally orderless

representation

i at

several levels of

spatial resolution

level 0 level 1

Spatial pyramid representation

Locally orderless

representation

i at

several levels of

spatial resolution

Spatial pyramid matching

• Combination of spatial levels with pyramid match kernel

[Grauman & Darell’05]

• Intersect histograms, more weight to finer grids

Scene dataset [Labzenik et al.’06]

Coast Forest Mountain Open country Highway Inside city Tall building Street

Store Industrial

4385 images

155 categories

c ego es

Scene classification

L Single-level

Single level Pyramid

0(1x1) 72.2±0.6

1(2x2) 77.9±0.6 79.0 ±0.5

2(4x4) 79.4±0.3 81.1 ±0.3

3(8x8) 77.2±0.4 80.7 ±0.3

Retrieval examples

Category classification – CalTech101

L Single-level Pyramid

0(1x1) 41.2±1.2

1(2x2) 55.9±0.9 57.0 ±0.8

2(4x4) 63.6±0.9 64.6 ±0.8

3(8x8) 60 3±0 9

60.3±0.9 64 6 ±0.7

64.6 ±0 7

Evaluation BoF – spatial

Image classification results on PASCAL

PASCAL’07

07 train/val set

spatial layout

1 0.53

2x2

3x1

1,2x2,3x1

Evaluation BoF – spatial

Image classification results on PASCAL

PASCAL’07

07 train/val set

spatial layout

1 0.53

2x2 0.52

3x1 0.52

1,2x2,3x1 0.54

C bi i iimproves average results,

Combination l ii.e., iit iis appropriate

i ffor

some classes

Evaluation BoF - spatial

for individual categories

g

1 3x1

Sheep 0.339 0.256

Bird 0.539 0.484

DiningTable 0.455 0.502

Train 0.724 0.745

g y dependent!

p

Combination helps somewhat

Discussion

• Summary

– Spatial pyramid representation: appearance of local image

patches

t h + coarse global

l b l position

iti iinformation

f ti

– Substantial improvement over bag of features

– Depends on the similarity of image layout

• Recent extensions

– Flexible, object-centered grid

• Shape

p masks [[Marszalek’12]] => additional annotations

– Weakly supervised localization of objects

• [Russakovsky et al.’12]

Recent extensions

[Perronnin et al.

al.’10,

10, Maji and Berg’09,

Berg 09, A. Vedaldi and Zisserman’10]

Zisserman 10]

– Fisher vector [Perronnin & Dance ‘07]

– VLAD descriptor [Jegou, Douze, Schmid, Perez ‘10]

– Supervector [Zhou et al. ‘10]

– Sparse coding [Wang et al. ’10, Boureau et al.’10]

Fisher vector

Statistical measure of the descriptors of the image w.r.t the GMM

D i ti off likelihood

Derivative lik lih d w.r.t.

t GMM parameterst

GMM parameters:

weight

mean

co-variance (diagonal)

Translated cluster →

large derivative on for this

component

Fisher vector

- only

l ddeviation

i ti wrtt mean, di

dim: K*D [K number

b off Gaussians,

G i D di

dim off d

descriptor]

i ]

- variance does not improve for comparable vector length

Image classification with Fisher vector

• Dense SIFT

• Fisher vector (k=32 to 1024, total dimension from approx.

5000 to 160000)

• Normalization

– square-rooting

– L2 normalization

– [Perronnin’10], [Image categorization using Fisher kernels of non-iid

image models, Cinbis, Verbeek, Schmid, CVPR’12]

• Classification approach

– Linear classifiers

– One versus

ers s rest classifier

Image classification with Fisher vector

• Evaluation on PASCAL VOC’07 linear classifiers with

– Fisher vector

– Sqrt transformation of Fisher vector

– Latent GMM of Fisher vector

models lead to improvement

p

• State-of-the-art performance

obtained

bt i d with

ith lilinear classifier

l ifi

Evaluation image description

Fisher versus BOF vector + linear classifier on Pascal Voc’07

•Fisher comparable

p to BOF +

non-linear classifier

•Limited gain due to SPM

on PASCAL

•Sqrt helps for Fisher and BOF

•[Chatfield et al

al. 2011]

Large-scale

Large scale image classification

has 14M images

g from 22k classes

Standard Subsets

– ImageNet

I N t Large

L S

Scale

l Vi

Visuall R

Recognition

iti ChChallenge

ll 2010 (ILSVRC)

• 1000 classes and 1.4M images

– ImageNet10K dataset

• 10184 classes and ~ 9 M images

Large-scale

Large scale image classification

• Classification approach

– One-versus-rest classifiers

– Stochastic gradient descent (SGD)

– At each step choose a sample at random and update the

parameters using a sample-wise estimate of the regularized risk

• Data reweighting

– Wh

When some classes

l are significantly

i ifi tl more populated

l t d than

th others,

th

rebalancing positive and negative examples

– Empirical

p risk with reweighting

g g

Importance of re-weighting

re weighting

d h d one tto u-OVR

dashed OVR

for each positive, β=1 natural

rebalancing

• For very high dimensions little impact

Impact of the image signature size

• Fisher vector (no SP) for varying number of Gaussians +

different classification methods, ILSVRC 2010

• Performance

P f improves

i f higher

for hi h di

dimensional

i l vectors

t

Experimental results

• Features: dense SIFT,

SIFT reduced to 64 dim with PCA

• Fisher vectors

– 256 Gaussians, using mean and variance

– Spatial pyramid with 4 regions

– Approx. 130K dimensions (4x [2x64x256])

– Normalization: square-rooting and L2 norm

– 4960 dimensions

– Normalization: square-rooting and L2 norm

Experimental results for ILSVRC 2010

• Features

F t : dense

d SIFT,

SIFT reduced

d d to

t 64 dim

di with

ith PCA

• 256 Gaussian Fisher vector using mean and variance + SP

(3x1) (4x [2x64x256] ~ 130k dim), square-root + L2 norm

• BOF dim=1024 + SP (3x1) (dim 4000), square-root + L2 norm

• Different classification methods

Large-scale

Large scale experiment on ImageNet10k

16.7

Top-1 accuracy

dimensional Fisher vectors

• w-OVR > u-OVR

• Improves

Impro es oover

er state of the art

art: 6

6.4%

4% [Deng et

et. al] and

WAR [Weston et al.]

Large-scale

Large scale experiment on ImageNet10k

• Illustration of results obtained with w

w-OVR

OVR and 130K

130K-dim

dim

Fisher vectors, ImageNet10K top-1 accuracy

Conclusion

well-suited

suited for

large-scale datasets

classification

one-versus-rest strategy is a must for competitive

p

performance

Conclusion

classification

• Code on

on-line

line available at http://lear

http://lear.inrialpes.fr/software

inrialpes fr/software

• Future work

– Beyond a single representation of the entire image

– Take into account the hierarchical structure

- Clustering_via_KMeans_and_Kohonen_SOMUploaded byFunes Maga
- Project DescriptionUploaded byWeslley Caldas
- Efficient Classification of Data Using Decision TreeUploaded byBONFRING
- kmeans __ Functions (Statistics Toolbox™)Uploaded byCahyo Aji Nugroho
- An Entropy-based Feature in Epileptic Seizure Prediction AlgorithmUploaded byIOSRjournal
- Survey Detection of Crop Diseases Using Multiscaling TechniqueUploaded byEditor IJRITCC
- IJCSS-Volume13 2014 Edition1 FilipcicClassificationoftopmaletennisplayersUploaded byLu Irusta
- clustering_kmeansUploaded bykarkarkuzhali
- 05442732Uploaded byashikhmd4467
- 161-549-1-PBUploaded byprabhumalu
- psyc993_09Uploaded bySharvin Balakrishnan
- kmeans_emUploaded byanggawijayakusuma
- Cluster.pptUploaded byShashank Gangadharabhatla
- LSD1211 - Semi supervised Biased Maximum Margin Analysis.docUploaded bySujith Kattappana
- Cluster AnalysisUploaded byArpan Kumar
- A6Uploaded byPallavi Vetal
- Implementing and compiling clustering using Mac Queens alias K-means apriori algorithmUploaded byMaurice Lee
- Data Mining Sem 2 NotesUploaded byVishnu
- Ma Mru Dm Chapter11Uploaded byPop Roxana
- Spell Man ResultUploaded bysridharegsp
- 2014+A+general+framework+for+constructing+possibilistic+membership+functions+in+alternating+cluster+estimationUploaded byIrina Alexandra
- 0013_12_FUZZY_RENDSZEREK_SZAMITASI_INTELLIGENCIA_B.pdfUploaded bynoorahmed1991
- IJSET_2014_312Uploaded byInnovative Research Publications
- Annexure IIUploaded bysravan.oem
- IJCNCUploaded byAIRCC - IJCNC
- Summary Phd David Martens (1)Uploaded byHitesh Jogani
- machine-learning-basics-infographic-with-algorithm-examples.pdfUploaded byFlorencia Ruiz
- index-classifierUploaded bypostscript
- getPDF21Uploaded byKhairul
- Semisupervised Biased Maximum Margin Analysis for Interactive Image RetrievalUploaded byieeexploreprojects

- Recent Image Processing Techniques in Forensic OdontologyUploaded byBudi Purnomo
- 03.H-minima transform for segmentation of structured surface.pdfUploaded byBudi Purnomo
- 01.Improved Watershed Algorithm for Cell Image SegmentationUploaded byBudi Purnomo
- [IEEE] Human Identification Using Dental Biometric Analysis_RUBAH2Uploaded byBudi Purnomo
- [17] Dental Intraoral System Supporting Tooth SegmentationUploaded byBudi Purnomo
- Lecture 9 1 SegmentationUploaded byBudi Purnomo
- Watershed Segmentation Using Otsu Based ThresholdingUploaded byBudi Purnomo
- Teeth Feature Extraction and Matching for HumanUploaded byBudi Purnomo
- OtsuUploaded byfamtalu
- Snake-Based Segmentation of Teeth From Virtual Dental CastsUploaded byBudi Purnomo
- Tutorial Bow Iciap13 3Uploaded byBudi Purnomo
- Face Detection using ANN.pdfUploaded byBudi Purnomo
- Tutorial Bow Iciap13 1Uploaded byBudi Purnomo
- Dental Radiographs and Photographs in Human Forensic IdentificationUploaded byBudi Purnomo
- Color Image to Grayscale Image ConversionUploaded byBudi Purnomo
- 06701708Uploaded byBudi Purnomo
- Statistical Gaussian Model of Image Regions in Stochastic Watershed SegmentationUploaded byBudi Purnomo
- Lung Detection and Segmentation Using Marker Watershed and laplacian filtering.pdfUploaded byBudi Purnomo
- Morphological PCBUploaded byBudi Purnomo
- 1-Introduction 16x9 MatlabUploaded byBudi Purnomo
- entrophy_histogram.pdfUploaded byBudi Purnomo
- Feature Extraction Morplogy GradienUploaded byBudi Purnomo
- Centroid With K-MEANUploaded byBudi Purnomo
- adaptive_histogram.pdfUploaded byBudi Purnomo
- Bergen Boin Sanjani Blood Vessel SegmentationUploaded byBudi Purnomo
- cell-counting.pdfUploaded byBudi Purnomo
- cbir_ieeeUploaded byBudi Purnomo
- Caries Detection2Uploaded byBudi Purnomo

- Arraylist,Hashtable,DictionaryUploaded byManoj Kavedia
- Diamond WireUploaded bySantiago Rosero
- 2.2 Classification MATLAB DT Revised10SepUploaded byhorlandovragas
- A-401Uploaded byguitar13
- Calotropis Procera: The Miracle Shrub in the Arabian PeninsulaUploaded byInternational Journal of Science and Engineering Investigations
- Nova PHOx - Service ManualUploaded byArmando Aguilar
- Alup OIS 30-90kW leaflet English LR.pdfUploaded byIvan Letona
- 9781418837051pptch07-110808182305-phpapp02.pptUploaded byNaveed Khan Abbu
- 1-Syringe-Siliconisation-Trends-Methods-Analysis-Procedures.pdfUploaded bySofiaProtopsalti
- Vehicle Black Box Final ReportUploaded bypraveen_joseph_4
- 1967456882_Storage Equipment Information Bulletin No 1 Sep 09Uploaded bySebastien Cabot
- Analytical Techniques 4Uploaded byvamsikrishna14
- ADuC7019_7020_7021_7022_7024_7025_7026_7027_7028_7029Uploaded bydinaero
- EasyPipe ManualUploaded bykatarinakarac
- t3s3Uploaded byAshim Kumar
- 1071S13_Exam2ReviewUploaded byHoàng Long
- Ant Colony Optimization Algorithm Routing for MANET in Urban EnvironmentsUploaded bySharique Afridi Khan
- Paper 3 Answering TechniquesUploaded byChithiran Cullen
- 12A Dispatching Method for Trucks at Container Terminal by Using Fuzzy-CNP Concept.pdfUploaded byEmanuel Almeida Pastl
- Opnet TutorialUploaded byabdessalem_t
- HP CP2025.pdfUploaded byhatchett927
- 770 -- 50 Q, 46 v -- Content First, Then SpeedUploaded bybrokensoul706
- Exp 6 Free & Forced Convection Heat ExchangerUploaded byGriezmann Haziq
- MD213-2016-17-RKUUploaded bysimalaravi
- FMDS0409Uploaded byPedro Rezende Coelho
- Anilisis AnovaUploaded byReza Novianda
- aristotle-prior-analytics-hackett-1989.pdfUploaded byHumpt Vivacqua
- 1. Two Tank Non-Interacting System.pdfUploaded byRiki Biswas
- Lecture 21Uploaded bycadbury007010
- Brochure Precast WallsUploaded by蓉蓉

## Much more than documents.

Discover everything Scribd has to offer, including books and audiobooks from major publishers.

Cancel anytime.