Chapter 5 - Classification

From Regression to Classification


q Regression: Predict a scalar output y ∊ ℝ given input x.
q Classification: Predict a categorical class label y given input x.
q Examples:
§ Disease diagnosis: Classifying whether a patient is healthy or not.
§ Text classification: Classifying documents according to topic.
§ Fault diagnosis: Classifying whether a photovoltaic system is operating as expected or not.

Dr Wided Lejouad Chaari




Target Output
q Data: In each training pair (x(n), y(n)), the label y(n) tells us which class x(n) belongs to.
q There are a number of ways to encode y numerically.
q Binary classification: y ∊ {0, 1} or y ∊ {-1, 1}.
Spam detection: Spam (1) or Not Spam (0); alternatively, 1 for True and -1 for False.
q Multiclass classification: Classification among K classes, y ∊ {1, 2, …, K}.
Classifying news documents: class 1 could be Sport, class 2 could be Political news, class 3 could be Weather, …
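
The encodings above can be sketched in a few lines of Python. This is only an illustration: the helper names `encode_binary` and `encode_multiclass`, the example flags and the topic list are assumptions made for the sketch, not part of the course material.

```python
def encode_binary(is_positive, negative_value=0):
    """Encode a boolean label as 1 (positive) or negative_value (0 or -1)."""
    if negative_value not in (0, -1):
        raise ValueError("negative_value must be 0 or -1")
    return 1 if is_positive else negative_value


def encode_multiclass(labels, class_names):
    """Map each class name to an integer in {1, ..., K}, in class_names order."""
    index = {name: k for k, name in enumerate(class_names, start=1)}
    return [index[label] for label in labels]


# Hypothetical spam flags: True means "spam".
spam_flags = [True, False, True]
y01 = [encode_binary(flag) for flag in spam_flags]        # labels in {0, 1}
ypm = [encode_binary(flag, -1) for flag in spam_flags]    # labels in {-1, 1}

# Hypothetical news topics, numbered 1..K in the order given.
topics = ["Sport", "Political news", "Weather"]
documents = ["Weather", "Sport", "Sport", "Political news"]
y_multi = encode_multiclass(documents, topics)            # labels in {1, ..., K}
```

Which encoding is preferable usually depends on the loss function of the model being trained, so the choice is left as a parameter here.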


Iris flower dataset


https://en.wikipedia.org/wiki/Iris_flower_data_set



q This dataset records different types of Iris and gives measurements of the widths and lengths of their flowers.

q The width and the length of the different leaves were recorded, together with the corresponding type of Iris.

Learn more about Iris …

[Figure: Iris flower with labels Petal Leaf and Sepal Leaf]


q For each Iris the biologists picked, they noted the petal
length and width, and the sepal length and width.

q They also noted the type of Iris:


§ Setosa Iris
§ Versicolor Iris
§ Virginica Iris
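
One possible way to hold such records in code is sketched below. The `IrisSample` type, the three rows and the class numbering are illustrative assumptions; check the values against the dataset itself before relying on them.

```python
from collections import namedtuple

# One record per flower: four measurements (in cm) plus the type of Iris.
IrisSample = namedtuple(
    "IrisSample",
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
)

# Three illustrative rows, one per type of Iris.
samples = [
    IrisSample(5.1, 3.5, 1.4, 0.2, "setosa"),
    IrisSample(7.0, 3.2, 4.7, 1.4, "versicolor"),
    IrisSample(6.3, 3.3, 6.0, 2.5, "virginica"),
]

# Assumed numbering of the K = 3 classes.
species_to_class = {"setosa": 1, "versicolor": 2, "virginica": 3}

# Feature vectors x(n) and labels y(n).
X = [[s.sepal_length, s.sepal_width, s.petal_length, s.petal_width] for s in samples]
y = [species_to_class[s.species] for s in samples]

# Keeping only two features (sepal length and width) makes the data easy to plot.
X_two_features = [[s.sepal_length, s.sepal_width] for s in samples]
```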


Iris dataset
To simplify, we can plot just two of the features.

q When I pick a flower with a sepal length of 6 and a sepal width of 4, I have to decide which class it belongs to.
q In the real world:
- more dimensions,
- noisier data,
- more overlap between the classes


Generative vs. Discriminative
q Most learning algorithms fall into two categories:

§ Discriminative learning algorithms
§ Generative learning algorithms



q In a generative probabilistic model:
§ p(x|y): the probability of x given y
§ The density of the feature vector within class y

q In a discriminative probabilistic model:
§ p(y|x): the probability of y given x
§ For example, when classifying among C1, …, C5: the probability of being in class y given x

q Generative learning algorithms may work better when we have very few training examples.
q They are often simple and quick to implement, and also quick to run.
q Because they are so efficient, they often scale easily, even to massive datasets.
q Sometimes the best thing to do is not to overthink or overdesign the algorithm, but to implement something quick and then iterate to improve it.
q We take as an example the generative learning algorithm Naive Bayes, which is often a good candidate for a quick implementation.

Given a training set ...


Discriminative Model
q For example:
§ Logistic Regression
§ (which we may fit with Gradient Descent)

q With Logistic Regression, a discriminative learning algorithm searches for a straight line that separates the two classes.

q We initialize the parameters of logistic regression randomly (see next figure) and then start to apply gradient descent.

q After some iterations, one iteration after another, we get a decision boundary.
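
The procedure described above (random initialization, then repeated gradient descent steps until a decision boundary emerges) can be sketched as follows. This is a minimal sketch under stated assumptions, not the course's implementation: the function names, the learning rate, the number of iterations and the toy data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(X, y, learning_rate=0.1, n_iterations=2000, seed=0):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    w = [rng.uniform(-0.01, 0.01) for _ in range(n_features)]  # random initialization
    b = 0.0
    n = len(X)
    for _ in range(n_iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x_i, y_i in zip(X, y):
            p = sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
            error = p - y_i  # derivative of the log-loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += error * x_i[j]
            grad_b += error
        w = [w_j - learning_rate * g / n for w_j, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict(w, b, x):
    """Return class 1 when the estimated p(y=1|x) is at least 0.5."""
    score = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical, linearly separable toy data (two features, labels in {0, 1}).
X_train = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [4.0, 4.5], [5.0, 4.0], [4.5, 5.0]]
y_train = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic_regression(X_train, y_train)
predictions = [predict(w, b, x) for x in X_train]
```

The decision boundary is the set of points where `w[0] * x1 + w[1] * x2 + b` equals zero, which is a straight line in the two-feature plane.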

[Figure: Discriminative model]


Discriminative model
We are basically looking at all of our data and trying to find a straight line that separates the malignant tumors from the benign tumors.

[Figure labels: Malignant Tumors, Benign Tumors]


Generative Model
q Let’s focus on the Benign Tumors to start (look at blue
circles).

q We build a model of what Benign Tumors look like.

q You see that most Benign Tumors tend to lie in the blue region of space.

[Figure: Benign]

q We then turn our attention to the Malignant Tumors.

q We try to build a model of what Malignant Tumors look like in that region of space.

q You see that most Malignant Tumors tend to lie in the red region of space.

[Figure: Malignant]

If a new patient comes in with features x1 and x2 …

q A generative algorithm builds a model of each of the two classes and then makes classification predictions by looking at the new example and comparing it with the two models, to see whether it looks more like a Benign or a Malignant Tumor.
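
The idea of modelling each class separately and then comparing a new example with both models can be sketched as below. This sketch assumes one particular class model, an independent Gaussian per feature (the assumption behind Gaussian Naive Bayes); the slides do not say which density is used. The function names and the tumor measurements are hypothetical.

```python
import math


def fit_gaussian(points):
    """Per-feature mean and variance of one class (features treated as independent)."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    variances = [
        max(sum((p[d] - means[d]) ** 2 for p in points) / n, 1e-9)  # avoid zero variance
        for d in range(dims)
    ]
    return means, variances


def log_density(x, model):
    """log p(x | y) under the independent-Gaussian model of one class."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x_d - m) ** 2 / (2 * v)
        for x_d, m, v in zip(x, means, variances)
    )


def classify(x, models, priors):
    """Pick the class whose model explains x best: argmax of log p(x|y) + log p(y)."""
    scores = {
        label: log_density(x, models[label]) + math.log(priors[label])
        for label in models
    }
    return max(scores, key=scores.get)


# Hypothetical tumor measurements (two features per patient).
benign = [[1.0, 1.2], [1.5, 0.8], [2.0, 1.5], [1.2, 1.9]]
malignant = [[4.0, 4.2], [4.5, 3.8], [5.0, 4.5], [4.2, 5.1]]

# One model per class, plus class priors estimated from the class frequencies.
models = {"benign": fit_gaussian(benign), "malignant": fit_gaussian(malignant)}
total = len(benign) + len(malignant)
priors = {"benign": len(benign) / total, "malignant": len(malignant) / total}

new_patient = [4.4, 4.0]
prediction = classify(new_patient, models, priors)
```

Working with log densities rather than densities avoids numerical underflow when an example lies far from one of the class models.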


Let’s formalize ...


q Discriminative Learning Algorithm:
Learns p(y|x), the probability of y given x.
Logistic Regression uses the sigmoid function to estimate it directly.
q Generative Learning Algorithm:
Learns p(x|y) and the class prior p(y), that is, p(y=0) or p(y=1).
y: the class label, which indicates whether a tumor is malignant or benign
x: the features
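
In symbols, the two approaches might be summarised as follows. The weight vector w and the bias b are notation introduced here for illustration; the slides do not name the logistic regression parameters.

```latex
% Discriminative: logistic regression models the posterior directly.
p(y = 1 \mid x) = \sigma(w^\top x + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Generative: model each class and its prior, then pick the more probable class.
\hat{y} = \arg\max_{y \in \{0, 1\}} \; p(x \mid y)\, p(y)
```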
q Suppose we know p(x|y) and p(y).

Given a new example x:

$$p(y=1 \mid x) = \frac{p(x \mid y=1)\, p(y=1)}{p(x)} \quad \text{(by Bayes' rule)}$$

where

$$p(x) = \sum_{y} p(x, y) = p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0)$$
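
As a minimal sketch, Bayes' rule for the two-class case can be written as a small function. The name `posterior_positive`, its parameter names and the input numbers are assumptions made for illustration only.

```python
def posterior_positive(lik_pos, lik_neg, prior_pos):
    """Return p(y=1|x) from p(x|y=1), p(x|y=0) and the prior p(y=1)."""
    if not 0.0 <= prior_pos <= 1.0:
        raise ValueError("prior_pos must be a probability")
    joint_pos = lik_pos * prior_pos            # p(x|y=1) p(y=1)
    joint_neg = lik_neg * (1.0 - prior_pos)    # p(x|y=0) p(y=0)
    evidence = joint_pos + joint_neg           # p(x), summing over both classes
    if evidence == 0.0:
        raise ValueError("p(x) is zero: x is impossible under both class models")
    return joint_pos / evidence


# Hypothetical numbers, chosen only to exercise the function.
p = posterior_positive(lik_pos=0.5, lik_neg=0.25, prior_pos=0.5)
```

When both classes explain x equally well, the two likelihoods cancel and the posterior falls back to the prior p(y=1).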


Example


Let’s compute

$$p(y=1 \mid x) = \frac{p(x \mid y=1)\, p(y=1)}{p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0)} = \frac{0.03}{0.03 + 0.01} = 0.75$$
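
The arithmetic can be checked with exact fractions. This sketch reads the two joint terms as 0.03 and 0.01, the values consistent with the stated result of 0.75. The variable names are illustrative.

```python
from fractions import Fraction

# Joint terms of the computation: p(x|y=1) p(y=1) and p(x|y=0) p(y=0).
joint_pos = Fraction(3, 100)   # 0.03
joint_neg = Fraction(1, 100)   # 0.01

evidence = joint_pos + joint_neg   # p(x) = 0.04
posterior = joint_pos / evidence   # p(y=1|x)
odds = joint_pos / joint_neg       # posterior odds of y=1 against y=0

print(posterior, float(posterior))  # 3/4 0.75
print(odds)                         # 3
```

Only the ratio of the two joint terms matters: odds of 3 to 1 in favour of y=1 correspond to a posterior of 3/4.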
