You are on page 1of 11

Multiclass Classification

The City College of New York


CSc 59929 – Introduction to Machine Learning
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Definitions

• Binary Classification is the problem of classifying


instances into one of two classes.

• Multiclass or Multinomial Classification is the problem of


classifying instances into one of more than two classes.

Multiclass classification should not be confused with multi-


label classification, where multiple labels are to be predicted
for each instance.
Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 2
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Binary Classifiers

While some classification algorithms naturally permit the use


of more than two classes, others are by nature binary
algorithms; these can, however, be turned into multinomial
classifiers by a variety of strategies.

Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 3
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Using binary classifiers
for multiclass classification

• One-vs.-Rest (OvR)
• One-vs.-All (OvA)
• One-against-All (OaA)

• One-vs.-One (OvO)

Ref.: Wikipedia

The City College of New York


CSc 59929 – Introduction to Machine Learning 4
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Comparison of OvR and OvO

The City College of New York


CSc 59929 – Introduction to Machine Learning 5
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Training OvR
fA fB fC fD
A A Not B Not C Not D

B Not A B Not C Not D

C Not A Not B C Not D

D Not A Not B Not C D

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 6
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Predicting with OvR

A
fA ( x)
B
fB ( x) Choose the prediction that
has the highest cofidence.
C
fC ( x )
D fD ( x)

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 7
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Training OvO f AD
f AB f AC
A A A A

B B C D

C f BC f BD fCD
B B C
D

C D D

The City College of New York


CSc 59929 – Introduction to Machine Learning 8
Spring 2020 – Erik K. Grimmelmann, Ph.D.
Predicting with OvO

A
f AB ( x )
f AC ( x )
B f AD ( x ) Choose the result that is
predicted most often.
f BC ( x )
C
f BD ( x )
D fCD ( x )

Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 9
Spring 2020 – Erik K. Grimmelmann, Ph.D.
One-vs.-Rest (OvR)
Inputs:
 L, a learner (a training algorithm for binary classifiers)
 X , a sample
 y, labels where yi ∈ {1,..., K }
Output:
 { f k } , a set of classifiers, for k ∈{1,..., K }
Procedure:
 For each k in {1,..., K }, construct a new label vector, z ,
where
= zi 1= if yi k and = zi 0 otherwise.
 Apply L to ( X , z ) to obtain f k
Making a decision consists of applying all classifiers to a sample and
predicting the label i for which the corresponding classifier reports
the highest confidence score, yˆ = argmax f i ( x) Ref.: Wikipedia
i∈{1…K }
The City College of New York
CSc 59929 – Introduction to Machine Learning 10
Spring 2020 – Erik K. Grimmelmann, Ph.D.
One-vs.-One (OvO)
Inputs:
 L, a learner (a training algorithm for binary classifiers)
 X , a sample
 y, labels where yi ∈ {1,..., K }
Output:
 { f } , a set of classifiers, for j,k ∈{1,..., K } and j < k.
jk

Procedure:
 For each j ,k in {1,..., K } with j < k , form subsets X jk of X
such that xi ∈ X jk if yi =j or yi =k .
 Apply L to each ( X jk ) to obtain f jk
Making a decision consists of applying all classifiers to a sample and
predicting the label i for which the most classifiers report a
result of i. Ref.: Wikipedia
The City College of New York
CSc 59929 – Introduction to Machine Learning 11
Spring 2020 – Erik K. Grimmelmann, Ph.D.

You might also like