Original Title: Block 07 Segmentation Classification

Uploaded by Paolo Gianfranco Luna Victoria Gutierrez


Outline

1. Unsupervised Segmentation

1.1. K-means clustering

1.2. Fuzzy clustering

2. Linear Classification Methods

2.1. LDA

2.2. SIMCA

2.3. PLS-DA

Introduction

Classification concept:

To assign an object to one category (class) based on a set of measurements performed on the object itself.

characteristics

selected

Criterion:

- Nordic - Italian
- Beer - Wine

Classification concept:

the representativeness of the measured

[Figure: data matrix X (I x J), with the I samples as rows and the J variables as columns.]

Classification methods build mathematical models able to recognize the membership of each sample to its proper class on the basis of a set of measurements (X).

Classification techniques can be distinguished on the basis of the mathematical form of the decision boundary, i.e. on the basis of their ability to detect linear or non-linear boundaries.

A further distinction is made between pure classification and class-modeling methods.

[Figure: a model, Class = f(X), is built from a training matrix X (I x J) whose samples belong to the known classes A, B and C.]

[Figure: the model, Class = f(X), is applied to an unknown sample (1 x J) to predict its class (A in the example).]

Similarity:

Similarity is the mathematical transposition of the concept of analogy. Analogy is used at every moment of our lives for pattern recognition, i.e. to recognize, to distinguish, to classify.

Distances:

Distances are the starting point for evaluating similarity: close samples are considered similar, far samples are considered dissimilar.
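As a minimal sketch of this idea, the snippet below computes the Euclidean distance between samples. The choice of the Euclidean metric and the three sample vectors are assumptions made for illustration only; other distance measures could be used in the same way.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two samples with the same number of variables."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical samples described by J = 3 variables (illustrative values only).
sample_1 = [1.0, 2.0, 3.0]
sample_2 = [1.0, 2.0, 4.0]  # close to sample_1, hence considered similar
sample_3 = [4.0, 6.0, 3.0]  # far from sample_1, hence considered dissimilar

d12 = euclidean_distance(sample_1, sample_2)  # 1.0
d13 = euclidean_distance(sample_1, sample_3)  # 5.0
```

With these values, sample_2 would be judged more similar to sample_1 than sample_3 is.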

Clustering methods search for the presence of groups (clusters) in the data. They are unsupervised and based on the calculation of distances.

Classification methods use the class information (supervised): they separate classes, and their goal is to find models able to correctly assign each sample to its proper class.

measures

- Unsupervised: PCA, clustering
- Classification
  - Pure classification
    - Linear: LDA
    - Non-linear: QDA, K-NN
  - Class-modelling
    - Linear: SIMCA, PLSDA
    - Non-linear: variations of PLSDA, ANN, SVM

[Figure: methods covered in this block: PCA and clustering for segmentation; LDA, SIMCA and PLS-DA for classification and class-modelling.]

Distances, similarity and clustering

Distances are the starting point for evaluating similarity:

- Close samples: similar samples
- Far samples: dissimilar samples

[Figure: the pixels of an image and a cluster centroid represented as points in a d-dimensional space.]

Clustering methods search for the presence of groups (clusters) in the data, based on distances. They are UNSUPERVISED.

1. Agglomerative or hierarchical methods: they progressively decrease the number of classes (clusters).

2. Partitional methods. Two main methods:

- Hard clustering: K-means (KM)
- Fuzzy clustering: FCM

K-NN

on measuring distances (analogy, similarity).

the k nearest samples.

K-means (KM)

The KM algorithm assigns each pixel x_mn of the image to the kth cluster whose center is nearest, by minimizing the sum of the squared distances of each pixel to its corresponding center.

(1) Choose the number of clusters k
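The remaining steps are not listed here, so the sketch below assumes the usual K-means iteration: assign every pixel to its nearest center, recompute each center as the mean of its pixels, and repeat until the assignments stop changing. The function names, the toy pixels and the initial centers are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between a pixel and a center."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(pixels, initial_centers, max_iter=100):
    """Plain K-means; the number of clusters k is len(initial_centers)."""
    centers = [list(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each pixel goes to the cluster whose center is nearest.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in pixels
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: each center becomes the mean of the pixels assigned to it.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical pixels with two variables each (illustrative values only).
pixels = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centers = k_means(pixels, initial_centers=[pixels[0], pixels[3]])
```

Because the result depends on the initial centers, a different starting point may lead to a different final partition.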

Advantages:

- Simplicity

Drawbacks:

- Risk of converging to a local minimum in the iterations

Silhouette index

It is calculated for each pixel x_mn and measures the similarity between points in the same cluster compared to points in other clusters. It compares the average distance between the mn-th pixel and all the pixels included in the same cluster with the average distance between the mn-th pixel and the pixels included in other clusters.

A negative (S_mn)_k value indicates misclassification.

Computing it is EXTREMELY SLOW!
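A minimal sketch of the silhouette calculation follows, assuming the standard definition S = (b - a) / max(a, b), where a is the average distance to the pixels of the same cluster and b is the smallest average distance to the pixels of any other cluster. The symbols a and b, the function names and the toy data are assumptions, and at least two clusters are required.

```python
import math

def distance(a, b):
    """Euclidean distance between two pixels."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_values(pixels, labels):
    """Silhouette value of every pixel (needs at least two clusters)."""
    clusters = sorted(set(labels))
    values = []
    for i, p in enumerate(pixels):
        # Average distance from pixel i to the other pixels of each cluster.
        mean_dist = {}
        for k in clusters:
            others = [q for j, q in enumerate(pixels) if labels[j] == k and j != i]
            if others:
                mean_dist[k] = sum(distance(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist:
            values.append(0.0)  # pixel alone in its cluster: value set to 0
            continue
        a = mean_dist[own]
        b = min(d for k, d in mean_dist.items() if k != own)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

# Hypothetical one-variable pixels forming two obvious groups.
pixels = [[0.0], [1.0], [10.0], [11.0]]
good = silhouette_values(pixels, [0, 0, 1, 1])  # sensible partition
bad = silhouette_values(pixels, [0, 0, 1, 0])   # last pixel put in the wrong cluster
```

Every pixel is compared with every other pixel, so the cost grows with the square of the number of pixels, which is why the index is slow on full images.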

- Dataset sample_demo.mat

[Figure: silhouette plots (silhouette value from 0 to 1, per cluster) for three K-means models with an increasing number of clusters.]

[Figure: cluster maps with the corresponding spectra (intensity versus wavelength, 1200 to 2000). Panels: No of clusters = 5, No of clusters = 6, No of clusters = 7.]

[Figure: centroids (intensity versus wavelength, 1200 to 2000). Panels: No of clusters = 2 to No of clusters = 7.]

[Figure: pure spectra (1200 to 2000).]

Fuzzy Clustering

In fuzzy clustering, each pixel can belong to several clusters simultaneously, rather than belonging completely to one cluster (as in KM clustering).

Pixels on the edge of a cluster belong to it to a lesser degree than pixels that are in the middle of the cluster.

For each pixel, the membership coefficients are defined in such a way that each coefficient lies between 0 and 1, and the sum of all the coefficients is defined to be 1.

of the clustering result

linearly normalized to make this sum 1
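As an illustration of membership coefficients that lie between 0 and 1 and sum to 1, the sketch below assumes the usual fuzzy c-means (FCM) membership rule, in which the weight of a cluster decreases with the distance to its center and the weights are normalized to sum to 1. The fuzzifier m, the function names and the toy centers are assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between a pixel and a cluster center."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fcm_memberships(pixel, centers, m=2.0):
    """Degree of belonging of one pixel to every cluster (values sum to 1)."""
    dists = [distance(pixel, c) for c in centers]
    # A pixel lying exactly on a center belongs completely to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    weights = [d ** -exponent for d in dists]  # closer centers weigh more
    total = sum(weights)
    return [w / total for w in weights]        # normalized so the sum is 1

# Hypothetical one-variable cluster centers.
centers = [[0.0], [10.0]]
u_near = fcm_memberships([2.0], centers)  # pixel close to the first center
u_mid = fcm_memberships([5.0], centers)   # pixel halfway between both centers
```

The pixel near the first center gets a membership of 16/17 for that cluster, whereas the pixel halfway between the centers is shared equally.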

Advantages:

- Each pixel is assigned a degree of belonging

Drawbacks:

- Risk of converging to a local minimum in the iterations (as in KM)

Partition Entropy

PE values close to 0 indicate well-defined clusters, whereas PE values close to log K indicate that the number of clusters does not reflect the real structure of the image.
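The behaviour of PE at both extremes can be checked with a short sketch. It assumes the usual definition of partition entropy, PE = -(1/N) * sum over pixels and clusters of u * log(u), where u is the membership coefficient. The function name and the two membership matrices are hypothetical.

```python
import math

def partition_entropy(memberships):
    """Partition entropy of a membership matrix (one row per pixel)."""
    n = len(memberships)
    total = 0.0
    for row in memberships:
        for u in row:
            if u > 0.0:  # the term 0 * log(0) is taken as 0
                total += u * math.log(u)
    return -total / n

# Hypothetical membership matrices for K = 2 clusters.
crisp = [[1.0, 0.0], [0.0, 1.0]]   # every pixel belongs to a single cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # every pixel is shared equally

pe_crisp = partition_entropy(crisp)  # lowest possible value, 0
pe_fuzzy = partition_entropy(fuzzy)  # highest possible value, log K
```

Comparing PE across models with different K is one way to choose the number of clusters, keeping in mind that the upper bound log K itself grows with K.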

[Figure: comparison of the K-means and FCM clustering results.]

- Dataset Sample_demo.mat
- Dataset brunel.mat

[Figure: fuzzy clustering of the plastics image.]

Classification methods find mathematical models that associate each sample to the assigned class. They are SUPERVISED.

- LDA
- SIMCA
- PLS-DA

Linear Discriminant Analysis (LDA)

Discriminant Analysis:

Separates samples into classes by finding directions which:

- maximize the variance between classes
- minimize the variance within classes

[Figure: two examples comparing the first principal component (PC1) with the first linear discriminant (LD1). In the first example both separate the classes (PC1 GOOD, LD1 GOOD); in the second only LD1 does (PC1 BAD, LD1 GOOD).]

LDA finds the directions which maximize the variance between classes and minimize the variance within classes.

LDA assumes that all the classes share the same covariance matrix, i.e. S_g = S for every class g.

The classification rule is obtained by applying Bayes' theorem:

Each sample is assigned to the specific class with the minimum discriminant score d_g.
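The sketch below illustrates assignment by minimum discriminant score. It assumes one common form of the LDA score, d_g(x) = (x - mean_g)^T S^-1 (x - mean_g) - 2 ln(prior_g), with S the pooled within-class covariance and the priors estimated from the class sizes. This form, the restriction to two variables, the function names and the toy data are all assumptions.

```python
import math

def mean_vector(samples):
    """Mean of each variable over the samples of one class."""
    return [sum(col) / len(samples) for col in zip(*samples)]

def pooled_covariance_2d(classes):
    """Pooled within-class covariance for samples with two variables."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    n_total = 0
    for samples in classes.values():
        m = mean_vector(samples)
        for x in samples:
            d = [x[0] - m[0], x[1] - m[1]]
            for r in range(2):
                for c in range(2):
                    s[r][c] += d[r] * d[c]
        n_total += len(samples)
    dof = n_total - len(classes)  # needs more samples than classes
    return [[v / dof for v in row] for row in s]

def invert_2x2(a):
    """Inverse of a 2 x 2 matrix (fails if the matrix is singular)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def discriminant_scores(x, classes):
    """Score d_g of sample x for every class g (smaller means closer)."""
    s_inv = invert_2x2(pooled_covariance_2d(classes))
    n_total = sum(len(samples) for samples in classes.values())
    scores = {}
    for g, samples in classes.items():
        m = mean_vector(samples)
        d = [x[0] - m[0], x[1] - m[1]]
        mahalanobis = sum(d[r] * s_inv[r][c] * d[c]
                          for r in range(2) for c in range(2))
        prior = len(samples) / n_total
        scores[g] = mahalanobis - 2.0 * math.log(prior)
    return scores

def classify(x, classes):
    """Assign x to the class with the minimum discriminant score."""
    scores = discriminant_scores(x, classes)
    return min(scores, key=scores.get)

# Hypothetical training samples for two classes, two variables each.
classes = {
    "A": [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]],
    "B": [[6.0, 6.0], [7.0, 6.0], [6.0, 7.0], [7.0, 7.0]],
}
label_1 = classify([2.0, 3.0], classes)
label_2 = classify([6.0, 5.0], classes)
```

The matrix inversion is the step that fails when there are fewer samples than variables, since the pooled covariance then becomes singular.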

Drawbacks:

1) The number of samples must be higher than the number of variables. This is not a real problem with images, where every pixel is a sample.

SIMCA

Soft Independent Method for Class Analogy

Standard Isolinear Method of Class Assignment

SIMCA is based on PCA (bilinear modeling). It focuses on the similarity between objects from the same class rather than on differentiating among classes.

- A training set is needed to construct a model
- Projection of unknown samples to the model

A PCA model is calculated for each class in the training set, and each class may contain a different number of PCs.

Once the class models are built, new samples are projected onto them.

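Each class is modeled independently.

[Figure: PCA models (PC1, PC2) fitted independently to each class.]

The class models are used to decide whether a new sample belongs to a class or not.

[Figure: an unknown sample (?) is evaluated against a class model through its residuals and its Hotelling T2.]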

Samples can be assigned to the class at the smallest distance. Samples are, therefore, always assigned.

The distance of each sample i from each class g (d_ig) is calculated from the following quantities:

- Q_ig and T2_ig are the Q residuals and the Hotelling's T2 calculated in the PCA model of class g.
- Q_0.95,g and T2_0.95,g are the 95% confidence limits of class g.
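As a hedged illustration, the sketch below assumes a commonly used combined distance built from these four quantities: d_ig = sqrt((Q_ig / Q_0.95,g)^2 + (T2_ig / T2_0.95,g)^2). The exact expression intended here may differ, and all the numerical values are hypothetical.

```python
import math

def simca_distance(q, t2, q_limit, t2_limit):
    """Combined distance of one sample from one class model.

    Assumed form: Q and T2 are each divided by their 95% limit,
    then combined as a Euclidean norm.
    """
    return math.sqrt((q / q_limit) ** 2 + (t2 / t2_limit) ** 2)

# Hypothetical (Q, T2) of one sample i projected onto the models of classes A and B.
sample_stats = {"A": (0.5, 2.0), "B": (4.0, 12.0)}
# Hypothetical 95% limits (Q_0.95, T2_0.95) of each class model.
class_limits = {"A": (1.0, 4.0), "B": (2.0, 6.0)}

distances = {
    g: simca_distance(q, t2, *class_limits[g])
    for g, (q, t2) in sample_stats.items()
}
# Assigning to the nearest class means the sample is always assigned.
assigned_class = min(distances, key=distances.get)
```

With this form, a sample sitting exactly on both limits has a distance of sqrt(2), so smaller values indicate that the sample lies inside the class limits.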

[Figure: Hotelling T2 versus Q residuals for the model of class g, with the limits T2_0.95,g and Q_0.95,g.]

Samples can be unclassified:

- unassigned (i.e. outside the class spaces of all classes)
- classified in more than one class (confused)

Drawbacks:

PLS-DA

PLS-DA is based on the same principle as PLS: maximizing the covariance between X and Y.

The main difference is that Y is a dummy matrix with 0 and 1.

[Figure: after unfolding the image into X, a PLS-2 model relates X to the dummy matrix D of 0 and 1.]
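A minimal sketch of how such a dummy matrix can be built from class labels is given below: one column per class, with 1 where the sample belongs to that class and 0 elsewhere. The function name and the labels are hypothetical.

```python
def dummy_matrix(class_labels):
    """Build the 0/1 dummy matrix: one row per sample, one column per class."""
    classes = sorted(set(class_labels))
    rows = [[1 if label == c else 0 for c in classes] for label in class_labels]
    return classes, rows

# Hypothetical class labels of four training samples.
labels = ["A", "B", "A", "C"]
classes, D = dummy_matrix(labels)
# classes is ["A", "B", "C"]
# D is [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

Each row contains a single 1, so the PLS-2 model predicts one response per class for every sample.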

The response of PLS-DA when it classifies is still a number. Therefore, we need rules to convert these numbers into classes:

1) Bayes' theorem (like in LDA)

2) The threshold is selected where the number of false positives and false negatives is minimized
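The rule of selecting a threshold that minimizes false positives and false negatives can be sketched as follows. The sketch assumes that the quantity to minimize is the sum of both error counts and that the candidate thresholds are the predicted responses themselves. The function names and the data are hypothetical.

```python
def count_errors(responses, is_member, threshold):
    """False positives and false negatives when 'response >= threshold' means 'in class'."""
    fp = sum(1 for y, m in zip(responses, is_member) if y >= threshold and not m)
    fn = sum(1 for y, m in zip(responses, is_member) if y < threshold and m)
    return fp, fn

def best_threshold(responses, is_member):
    """Candidate threshold with the fewest false positives plus false negatives."""
    candidates = sorted(set(responses))
    return min(candidates, key=lambda t: sum(count_errors(responses, is_member, t)))

# Hypothetical predicted responses for one class column of the dummy matrix,
# with the true membership of each sample.
responses = [-0.1, 0.2, 0.9, 0.5, 0.6, 0.8, 1.1]
is_member = [False, False, False, True, True, True, True]

threshold = best_threshold(responses, is_member)
errors = count_errors(responses, is_member, threshold)  # (false positives, false negatives)
```

Here the selected threshold is 0.5, which leaves one false positive (the non-member with response 0.9) and no false negatives.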

For the rest, it works like PLS:

- Cross validation
- Number of LVs
- etc.

Assessing the models

Confusion matrix

- TP: True positive
- FP: False positive
- FN: False negative
- TN: True negative

parameters of the following confusion matrix
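As an illustration, the sketch below counts TP, FP, FN and TN from actual and predicted class memberships and derives two parameters from them, assuming the usual definitions of sensitivity, Sn = TP / (TP + FN), and specificity, Sp = TN / (TN + FP). The data are hypothetical and are not the confusion matrix referred to above.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FP, FN, TN) for a binary classification."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    """Fraction of the true class members that are recognized as members."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of the non-members that are correctly rejected."""
    return tn / (tn + fp)

# Hypothetical results for ten samples (True means "belongs to the class").
actual    = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

tp, fp, fn, tn = confusion_counts(actual, predicted)  # (3, 1, 1, 5)
sn = sensitivity(tp, fn)  # 0.75
sp = specificity(tn, fp)  # 5 / 6
```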

A ROC curve is a graphical plot of specificity (Sp) and sensitivity (Sn), as X and Y axes respectively, for a binary classification system as its discrimination threshold is changed. ROC curves are used to estimate the best classification score.
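The points of such a curve can be obtained by sweeping the threshold over the predicted responses and computing Sp and Sn at each step, as in the sketch below. It assumes that a response at or above the threshold means "in class". The function name and the data are hypothetical.

```python
def roc_points(responses, is_member):
    """List of (threshold, Sp, Sn) as the discrimination threshold is changed."""
    points = []
    for t in sorted(set(responses)):
        tp = sum(1 for y, m in zip(responses, is_member) if y >= t and m)
        fn = sum(1 for y, m in zip(responses, is_member) if y < t and m)
        fp = sum(1 for y, m in zip(responses, is_member) if y >= t and not m)
        tn = sum(1 for y, m in zip(responses, is_member) if y < t and not m)
        sn = tp / (tp + fn)  # sensitivity
        sp = tn / (tn + fp)  # specificity
        points.append((t, sp, sn))
    return points

# Hypothetical predicted responses and true memberships.
responses = [-0.1, 0.2, 0.9, 0.5, 0.6, 0.8, 1.1]
is_member = [False, False, False, True, True, True, True]

points = roc_points(responses, is_member)
first = points[0]   # lowest threshold: everything is accepted
last = points[-1]   # highest threshold: almost everything is rejected
```

At the lowest threshold Sn is 1 and Sp is 0, and raising the threshold trades sensitivity for specificity. Plotting Sn against Sp over all thresholds gives the curve.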

Assessing the models

ALMONDS

PLASTICS
