
19/11/2017

Segmentation and
Classification

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Similarity



Similarity

Similarity:
Is the mathematical counterpart of the concept of analogy. We use analogy constantly in everyday life for pattern recognition, i.e. to recognize, to distinguish, to classify.

Distances:
Are the starting point for evaluating similarity: close samples are considered similar, distant samples are considered dissimilar.


Similarity

Distances between two samples


There are many ways to calculate the distance between two points:

Manhattan/city block, Minkowski, Chebyshev, Canberra, …

[Figure: two points, the distance between them, and the angle θ between their vectors]
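As an illustration (not part of the original slides, which use the MATLAB/hypertools environment), the metrics listed above can be sketched in Python/NumPy; the function name `distances` is my own:

```python
import numpy as np

def distances(a, b, p=3):
    """A few common point-to-point distances (illustrative sketch)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = np.abs(a - b)
    return {
        "euclidean": float(np.sqrt(np.sum(d ** 2))),
        "manhattan": float(np.sum(d)),                    # city block
        "chebyshev": float(np.max(d)),
        "minkowski": float(np.sum(d ** p) ** (1.0 / p)),  # p = 1 and p = 2 recover the two above
        "canberra":  float(np.sum(d / (np.abs(a) + np.abs(b)))),
    }
```

For the points (0, 0) and (3, 4), the Euclidean distance is 5, the city-block distance 7 and the Chebyshev distance 4.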



Similarity

Clusters:
Cluster methods search for the presence of groups (clusters) in the data,
based on distances → UNSUPERVISED


Similarity

Clusters

Do not confuse clustering and classification


Clustering methods
search for the presence of groups (clusters) in the data. They are unsupervised
and based on calculation of distances.

Classification methods
use the class information (supervised): they separate classes, and their goal is
to find models able to correctly assign each sample to its proper class.


Similarity

Ad-hoc classification of pattern recognition methods

Unsupervised pattern recognition → PCA, clustering

Classification (supervised):
- Pure classification — Linear: LDA, PLS-DA; Non-linear: QDA, K-NN
- Class-modelling — Linear: SIMCA, variations of PLS-DA; Non-linear: ANN, SVM


Clustering


Clustering

Clustering families

1. Agglomerative or hierarchical methods →
   - Each pixel is initially considered a class
   - The number of classes/clusters is progressively decreased
   - Many methods to solve the problem

2. Partitional methods →
   - The number of clusters is preselected, but this selection is not easy
   - Two main methods: hard clustering → K-means; fuzzy clustering → FCM


Clustering

K-means clustering

K-means assigns each pixel x_mn of the image to the k-th cluster, whose center is
nearest, by minimizing the sum of the squared distances of each pixel to its
corresponding center:

J = Σ_{k=1..K} Σ_{x_mn ∈ C_k} ‖x_mn − μ_k‖²

where x_mn is each pixel and μ_k is the centroid of cluster k.


Clustering

K-means clustering workflow

(1) Randomly select k pixels as the initial centroids

(2) Generate k clusters according to the distances to the centroids:
assign each pixel to the nearest cluster centroid

(3) Re-compute the centroids so as to minimize J

(4) Re-assign the pixels to the clusters and re-calculate J

(5) Repeat steps 2 to 4 until convergence
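The five steps above can be sketched as a minimal NumPy implementation (an illustrative sketch, not the course's MATLAB/hypertools code; `kmeans` and its arguments are my own naming):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-means: X is (n_pixels, n_channels); returns labels, centroids, J."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # step 1
    for _ in range(n_iter):
        # step 2: assign each pixel to the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 3: recompute each centroid as its cluster mean (minimizes J)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):                        # step 5: convergence
            break
        centroids = new
    J = float(((X - centroids[labels]) ** 2).sum())            # step 4: objective
    return labels, centroids, J
```

For two well-separated groups of pixels, the two recovered clusters coincide with the groups.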


Clustering

K-means. Calculation of number of clusters

The number of clusters could be taken as the rank estimated by PCA.

But the discrimination capacity of K-means is higher than that of PCA.

There are many methods to calculate the number of clusters:

PCA

Silhouette

Distances between - within


Clustering

K-means. Calculation of number of clusters

Silhouette

Calculated for each pixel x_mn; it offers a measure of the similarity between
points in the same cluster compared to points in other clusters:

(S_mn)_k = (b_mn − a_mn) / max(a_mn, b_mn)

a_mn → average distance between the mn-th pixel and all the pixels included in
the same cluster

b_mn → minimum average distance between the mn-th pixel and the pixels
included in the other clusters

(S_mn)_k close to 1 → correct classification

(S_mn)_k with a negative value → misclassification

EXTREMELY SLOW!!!
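A direct NumPy sketch of the per-pixel silhouette, using the a/b definitions above (names are illustrative). The full pairwise distance matrix is what makes this, as the slide warns, extremely slow on large images:

```python
import numpy as np

def silhouette(X, labels):
    """Per-sample silhouette s = (b - a) / max(a, b); O(n^2) in the number of pixels."""
    X = np.asarray(X, float)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # all pairwise distances
    s = np.zeros(len(X))
    for i, k in enumerate(labels):
        same = (labels == k)
        a = D[i, same & (np.arange(len(X)) != i)].mean()       # mean intra-cluster distance
        b = min(D[i, labels == other].mean()
                for other in set(labels) if other != k)        # nearest other cluster
        s[i] = (b - a) / max(a, b)
    return s
```

For two tight, well-separated clusters all silhouette values are close to 1.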


Clustering

K-means. Calculation of number of clusters

Silhouette

[Figure: silhouette plots for 2 and 3 clusters; silhouette values per cluster, ranging from 0 to 1]


Clustering

K-means. Calculation of number of clusters

Between – within distances

For each cluster, this measures the influence of the rest of the clusters.

Comparison between:
- The mean distance within one cluster and its centroid
- The distance between the rest of the centroids and the cluster

The influence of cluster A on B is not the same as the influence of cluster B on A.

[Figure: two clusters, A and B]


Clustering

K-means. Benefits and drawbacks

- Extremely simple to program
- Resolving mixtures
- Quantitation with multi-set images
- Extremely sensitive to noise and outliers; tends to converge to local minima
- Each pixel belongs to exactly one cluster
- Only valid for entities (objects), not for mixtures
- Difficult to assess the number of clusters


Clustering

Fuzzy clustering

Each pixel is assigned a fractional degree of membership to all the clusters
simultaneously, rather than belonging completely to one cluster (as in K-means
clustering).

FCM allows pixels at the edge of a cluster to belong to it to a lesser degree
than pixels in the middle of the cluster.


Clustering

Fuzzy clustering
A membership coefficient u_mnk is calculated for each pixel x_mn in such a
way that each coefficient lies between 0 and 1 and the coefficients of each
pixel sum to 1:

u_mnk = 1 / Σ_{j=1..K} ( ‖x_mn − m_k‖ / ‖x_mn − m_j‖ )^{2/(g−1)}

g is the so-called "fuzzifier" constant, which determines the fuzziness of the
clustering result.

A good value of g is 2, for which the coefficients are linearly normalized to
make their sum 1.

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Clustering

Fuzzy clustering

Now, the objective function J is calculated as

J = Σ_{k=1..K} Σ_{mn} u_mnk^g ‖x_mn − m_k‖²

and the cluster center m_k is calculated as the weighted mean:

m_k = Σ_{mn} u_mnk^g x_mn / Σ_{mn} u_mnk^g
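One FCM iteration (membership update followed by the weighted-mean centroid update described above) can be sketched in NumPy; this is an illustrative sketch with my own naming, not the course's implementation:

```python
import numpy as np

def fcm_step(X, M, g=2.0):
    """One fuzzy c-means iteration.

    X: (n, p) pixels; M: (k, p) current cluster centers; g: fuzzifier constant."""
    d = np.linalg.norm(X[:, None, :] - M[None, :, :], axis=2) + 1e-12  # avoid /0
    # membership: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(g-1)); each row sums to 1
    U = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (g - 1.0)), axis=2)
    Ug = U ** g
    M_new = (Ug.T @ X) / Ug.sum(axis=0)[:, None]   # weighted-mean centroids
    J = float(np.sum(Ug * d ** 2))                 # fuzzy objective function
    return U, M_new, J
```

Pixels close to one center get a membership near 1 for that cluster and near 0 for the others.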


Clustering

Fuzzy clustering. Number of clusters

In this case, it is more difficult to calculate the number of clusters

Partition entropy

PE = −(1/MN) Σ_{mn} Σ_{k} u_mnk log(u_mnk)

where u_mnk are the membership coefficients and MN denotes the total number of
pixels of the image.

The PE value ranges from 0 to log(K).

Values close to 0 indicate a good estimation of the number of clusters, whereas
PE values close to log(K) indicate that the number of clusters does not reflect
the real structure of the image.
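The partition entropy is simple to compute from the membership matrix; a hedged NumPy sketch (my own naming):

```python
import numpy as np

def partition_entropy(U):
    """PE = -(1/MN) * sum over pixels and clusters of u*log(u); ranges in [0, log K]."""
    U = np.asarray(U, float)
    MN = U.shape[0]                      # total number of pixels
    return float(-np.sum(U * np.log(U + 1e-12)) / MN)
```

A crisp partition (all memberships 0 or 1) gives PE ≈ 0; a maximally fuzzy one (all memberships 1/K) gives PE = log(K).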


Clustering

Fuzzy clustering. Benefits and drawbacks

- Each pixel is assigned a degree of belonging to all the clusters
- Resolving mixtures
- Extremely sensitive to noise and outliers; tends to converge to local minima
- Difficult to assess the number of clusters


Clustering

Clustering. Example with an ibuprofen powder mixture

Data in sample_demo.mat

The sample is a hyperspectral image composed of two compounds, ibuprofen and
starch. They are powders that have been mechanically mixed.

[Figure: mixture image and the pure spectra of starch and ibuprofen, 1200–2000 nm]


Clustering

Clustering. Example with an ibuprofen powder mixture

K-means with 2 clusters


Clustering

Clustering. Example with an ibuprofen powder mixture

K-means with 3 clusters


Clustering

Clustering. Example with an ibuprofen powder mixture

K-means with 4 clusters


Clustering

Clustering. Example with an ibuprofen powder mixture

K-means with 9 clusters


Clustering

Clustering. Example with an ibuprofen powder mixture

FCM with 2 clusters


Clustering

Clustering. Example with the plastics

K-means with 4 clusters


Clustering

Clustering. Example with the plastics

K-means with 5 clusters


Clustering

Clustering. Example with the plastics

FCM with 4 clusters


Classification
Overview


Classification

Classification concept

Classification aims at finding a criterion to assign an object (sample) to one
category (class) based on a set of measurements performed on the object itself.

A category or class is an (ideal) group of objects sharing similar characteristics.

In classification, the categories are defined a priori.

Classification establishes boundaries depending on the criterion selected.

The classification strongly depends on how representative the measured
variables are of the samples.


Classification

Classification concept (inspired by Federico Marini)

Criterion:
Nordic - Italian
Beer - Wine


Classification

Classification methods
Chemometric techniques aimed at finding mathematical models able to
recognize the membership of each sample to its proper class on the basis of a
set of measurements (X).


Classification

Classification methods
Distinctions can be made among classification techniques on the basis of the
mathematical form of the decision boundary, i.e. on their ability to detect
linear or non-linear boundaries.


Classification

Classification methods
Another important distinction can be made between pure classification and
class-modeling methods.


Classification

Classification methods

Classes can be defined in different ways:

- By theoretical knowledge or experimental evidence

- By discretizing a quantitative response, e.g.:
  < 5 → Class A      5.1 – 7 → Class B      ≥ 7.1 → Class C


Classification

Classification methods

Once the classes have been defined, we construct a model:

[Figure: the samples (rows of the data matrix X, I × J, labelled A, B, C) and
their measured variables (columns) are used to build the model Class = f(X)]


Classification

Classification methods

And then we can predict the category of unknown samples:

[Figure: an unknown sample (1 × J) is passed through the model Class = f(X) to
obtain its class, e.g. A]


Classification

Classification methods

Classification uses the class information to find models that associate each
sample to the assigned class → SUPERVISED

Linear Discriminant Analysis

SIMCA

PLS-DA


Classification
K-NN


Classification – K-NN

K-NN
It is the benchmark method for supervised classification based on measuring
distances (analogy – similarity).

Each sample is classified according to the most represented class among its
k nearest samples.

It is a non-linear method that needs class information.
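The rule above (majority vote among the k nearest training samples) can be sketched in a few lines of NumPy; names are illustrative, not the course's code:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training samples."""
    d = np.linalg.norm(X_train - x, axis=1)          # distances to all training samples
    nearest = y_train[np.argsort(d)[:k]]             # labels of the k nearest
    values, counts = np.unique(nearest, return_counts=True)
    return values[counts.argmax()]                   # most represented class
```

A query point near one group of training samples is assigned that group's class.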


Classification
Linear Discriminant Analysis


Classification - LDA

Discriminant Analysis
Separates samples into classes by finding directions which:
 maximize the variance between classes
 minimize the variance within classes

PC1  GOOD
LD1  GOOD


Classification - LDA

Discriminant Analysis
Separates samples into classes by finding directions which:
 maximize the variance between classes
 minimize the variance within classes

PC1  BAD
LD1  GOOD


Classification - LDA

Discriminant Analysis
Separates samples into classes by finding directions which:
 maximize the variance between classes
 minimize the variance within classes


Classification - LDA

Linear discriminant Analysis (LDA)

LDA is a method that separates samples into classes by finding directions which
maximize the variance between classes and minimize the variance within classes.

Two main assumptions about our data:

Each k-class density is a multivariate Gaussian


Classification - LDA

Linear discriminant Analysis (LDA)

LDA is a method that separates samples into classes by finding directions which
maximize the variance between classes and minimize the variance within classes.

Two main assumptions about our data:

All the class covariance matrices are presumed to be identical:

Σ_g = Σ

Classification - LDA

LDA is based on probabilities

It calculates the probability of belonging to each class by applying Bayes'
theorem:

P(g | x) = P(x | g) P(g) / Σ_h P(x | h) P(h)

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

26
19/11/2017

Classification - LDA

LDA is based on probabilities

Once the probability has been calculated, LDA assigns each sample to the
specific class with the minimum discriminant score d_g.

Classification - LDA

LDA benefits and drawbacks

- Easy to use and reliable

- The number of samples must be higher than the number of variables, but this
is not a problem with images if the calibration is made from standard images

- LDA assumes that the data follow a Gaussian distribution

Classification
SIMCA


Classification - SIMCA

SIMCA. Definition

- Originally proposed by Wold in 1976

- SOFT: no assumption about the distribution of the variables is made
(bilinear modeling)

- INDEPENDENT: each category is modeled independently

- MODELING of CLASS ANALOGIES: attention is focused on the similarity between
objects from the same class rather than on differentiating among classes


Classification - SIMCA

SIMCA. Definition

- It is a class-modelling method; thus, it is a supervised method:
  - A training set is needed to construct a model
  - Unknown samples are projected onto the model

- SIMCA is based on building independent PCA models for each class in the
training set, where each class may use a different number of PCs.

- After the independent PCA models are constructed, the unknown samples are
projected onto them.

Classification - SIMCA

SIMCA. Graphical interpretation

1) Individual PCA model for each class. Each class pre-processed independently

[Figure: individual PCA models (PC1, PC2) for each class]

Classification - SIMCA

SIMCA. Graphical interpretation

2) Projection of other classes in the PCA class space of one class

[Figure: samples of other classes projected onto the PCA space (PC1, PC2) of one class]

Classification - SIMCA

SIMCA. Class assignation

There are many implementations to decide whether an unknown sample belongs to
a class or not.

Here we will talk about three variations of the same concept:

Hotelling's T² and the residuals (Q)!

[Figure: PCA class space (PC1, PC2)]


Classification - SIMCA

SIMCA. Class assignation


Strategy 1: Sample competition. One sample, one class

- SIMCA assigns samples to the nearest class.
- Samples are, therefore, always assigned.
- The distance of each sample i from each class g (d_ig) is calculated as

  d_ig = sqrt( (Q_ig / Q_0.95,g)² + (T²_ig / T²_0.95,g)² )

  where:
  - Q_ig and T²_ig are the Q and Hotelling's T² of sample i in the PCA model of class g
  - Q_0.95,g and T²_0.95,g are the 95% confidence limits of class g
FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Classification - SIMCA

SIMCA. Class assignation


Strategy 1: Sample competition. One sample, one class

[Figure: T² vs Q plots for class A and class B, with the 95% limits Q_0.95 and T²_0.95 marked]

Classification - SIMCA

SIMCA. Class assignation


Strategy 2: Conditioning

- SIMCA assigns a sample to the g class if it falls below both class limits,
i.e. Q_ig ≤ Q_0.95,g and T²_ig ≤ T²_0.95,g, which is equivalent to requiring
the sample to lie inside the 95% class space of g.

- Samples can be:
  - unassigned (i.e. outside the class spaces of all classes)
  - classified in more than one class (confused)

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

34
19/11/2017

Classification - SIMCA

SIMCA. Class assignation


Strategy 2: Conditioning

[Figure: T² vs Q plots for class A and class B, with the 95% limits marked]


Classification - SIMCA

SIMCA. Class assignation


Strategy 3: More restricted conditioning

- SIMCA assigns a sample to the g class if the combined distance satisfies d_ig ≤ 1

- Similar approach to the second strategy, but with a stricter acceptance region

- Samples can be:
  - unassigned (i.e. outside the class spaces of all classes)
  - classified in more than one class (confused)


Classification - SIMCA

SIMCA. Class assignation


Strategy 3: More restricted conditioning

[Figure: T² vs Q plots for class A and class B, with the 95% limits marked]


Classification - SIMCA

SIMCA. Class assignation


Example with three samples

[Figure: a sample plotted in the T² vs Q spaces of class A and class B]

Assignment by strategy:  1 → A    2 → A    3 → A


Classification - SIMCA

SIMCA. Class assignation


Example with three samples

[Figure: a sample plotted in the T² vs Q spaces of class A and class B]

Assignment by strategy:  1 → B    2 → A and B    3 → B


Classification - SIMCA

SIMCA. Class assignation


Example with three samples

[Figure: a sample plotted in the T² vs Q spaces of class A and class B]

Assignment by strategy:  1 → A    2 → none    3 → none


Classification - SIMCA

SIMCA. Example
Development of a SIMCA model with the 4 plastics

Calibration set: 655 spectra (between 150 and 200 spectra per class)


Classification - SIMCA

SIMCA. Example
Development of SIMCA model with the 4 plastics

Calibration development:


Classification - SIMCA

SIMCA. Example
Development of SIMCA model with the 4 plastics

Calibration development:


Classification - SIMCA

SIMCA. Example
Development of SIMCA model with the 4 plastics

Prediction:


Classification - SIMCA

SIMCA. Benefits and drawbacks

- Based on PCA
- Single-class modelling
- Each class needs to be perfectly defined in the PCA space
- The number of unassigned samples can be high, depending on the noise
- One sample can belong to more than one class


Classification
PLS-DA


Classification – PLS-DA

PLS-DA. LDA with the covariance of PLS

PLS-DA is based on the same principle as PLS, the covariance between X and Y,
but with discriminant ability.

[Figure: the image cube is unfolded into X, and a PLS-2 model is built against
a dummy matrix D of 0s and 1s]

Classification – PLS-DA

PLS-DA. PLS-2 model

PLS-DA is based on the same principle as PLS, the covariance between X and Y,
but with discriminant ability.

The main difference is that Y is a dummy matrix D containing 0s and 1s.
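Building the dummy matrix D is straightforward; a minimal NumPy sketch (illustrative naming, one 0/1 column per class):

```python
import numpy as np

def dummy_matrix(y):
    """Encode class labels as a 0/1 dummy matrix D for a PLS-2 model."""
    classes = sorted(set(y))
    D = np.zeros((len(y), len(classes)))
    for i, label in enumerate(y):
        D[i, classes.index(label)] = 1.0   # 1 in the column of the sample's class
    return D, classes
```

Each row of D has a single 1, in the column corresponding to that sample's class.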


Classification – PLS-DA

PLS-DA. Responses

The response of PLS-DA is still a number. Therefore, we need to find


rules to convert these numbers into classes


Classification – PLS-DA

PLS-DA. Responses

The response of PLS-DA is still a number. Therefore, we need to find rules to
convert these numbers into classes.

Bayes' theorem (as in LDA):

1) It assumes that the predicted values follow a normal distribution.

2) The threshold is selected where the number of false positives and false
negatives is minimized.
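The threshold-selection idea in point 2 can be sketched by a simple scan over candidate thresholds, keeping the one with the fewest false positives plus false negatives (an illustrative sketch, not the probabilistic Bayes implementation the slides refer to):

```python
import numpy as np

def best_threshold(scores, is_class):
    """Scan candidate thresholds; keep the one minimizing FP + FN."""
    scores, is_class = np.asarray(scores, float), np.asarray(is_class, bool)
    best_t, best_err = None, np.inf
    for t in np.unique(scores):
        pred = scores >= t                                        # predicted "in class"
        err = np.sum(pred & ~is_class) + np.sum(~pred & is_class)  # FP + FN
        if err < best_err:
            best_t, best_err = float(t), int(err)
    return best_t, best_err
```

For well-separated predicted values the scan finds a threshold with zero misclassifications.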


Classification – PLS-DA

PLS-DA. Responses

The response of PLS-DA is still a number. Therefore, we need to find rules to
convert these numbers into classes.

Bayes' theorem (as in LDA).

The rest works as in PLS (and SIMCA, LDA, K-NN, … any training-based method):

- Cross-validation
- Number of LVs
- Etc.


Classification
Assessment of classification
models


Classification – Assessment

Confusion matrix

FOOD, U. Copenhagen, Denmark. www.models.life.ku.dk www.hypertools.org jmar@life.ku.dk

Classification – Assessment

Classifiers

TP  True positive
FP  False positive
FN  False negative
TN  True negative


Classification – Assessment

Classifiers

From these counts, figures of merit such as sensitivity, Sn = TP / (TP + FN),
and specificity, Sp = TN / (TN + FP), are derived.


Classification – Assessment

Classifiers
Receiver Operating Characteristic (ROC) curves

A ROC curve is a graphical plot of Sp and Sn on the X and Y axes, respectively,
for a binary classification system as its discrimination threshold is changed.
ROC curves are used to estimate the best classification score.
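Sweeping the threshold and collecting the operating points can be sketched as follows (a NumPy sketch with my own naming; here I use the common 1 − Sp vs Sn convention for the axes, whereas the slide plots Sp directly):

```python
import numpy as np

def roc_points(scores, is_class):
    """Sweep the decision threshold and collect (1 - Sp, Sn) operating points."""
    scores, is_class = np.asarray(scores, float), np.asarray(is_class, bool)
    pts = []
    for t in np.unique(scores):
        pred = scores >= t
        sn = np.sum(pred & is_class) / np.sum(is_class)        # sensitivity
        sp = np.sum(~pred & ~is_class) / np.sum(~is_class)     # specificity
        pts.append((1 - sp, sn))
    return pts
```

Perfectly separable scores yield an operating point at (0, 1), the ideal corner of the ROC plot.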


Classification – PLS-DA

PLS-DA. Example
Development of PLS-DA model with the 4 plastics

Calibration set: 655 spectra (between 150 and 200 spectra per class)


Classification – PLS-DA

PLS-DA. Example
Development of a PLS-DA model with the 4 plastics

Calibration development:


Classification – PLS-DA

PLS-DA. Example
Development of a PLS-DA model with the 4 plastics

Calibration development:


Classification – PLS-DA

PLS-DA. Example
Development of a PLS-DA model with the 4 plastics

Visualization of the prediction:


Classification – PLS-DA

Comparison SIMCA – PLS-DA

[Figure: prediction maps for SIMCA and PLS-DA side by side]


Classification – PLS-DA

PLS-DA. Benefits and drawbacks

- Based on the covariance between X and Y

- More reliable than SIMCA

- Does not allow single-class modelling

- One sample can belong to more than one class
