Chapter 5 - Classification

From Regression to Classification

q Regression: Predict a scalar output y ∈ ℝ given an input x.
q Classification: Predict a categorical class label y given an input x.
q Examples:
§ Disease diagnosis: Classifying whether a patient is healthy or not.
§ Text classification: Classifying documents according to topic.
§ Fault diagnosis: Deciding whether a photovoltaic system is operating as expected or not.
Dr Wided Lejouad Chaari

Target Output
q Data: In the training set, the label y(n) tells us which class the input x(n) belongs to.
q There are a number of ways to encode y numerically.
q Binary classification: y ∈ {0, 1} or y ∈ {-1, 1}.
§ Spam detection: Spam (1) or Not Spam (0); alternatively, (1) for True and (-1) for False.
q Multiclass classification: Classification among K classes, y ∈ {1, 2, …, K}.
§ Classifying text documents from the news: class 1 could be Sport, class 2 could be Political news, class 3 could be Weather, …
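The encodings above can be sketched in a few lines of Python. This is only an illustration: the function names `encode_binary` and `encode_multiclass` and the `scheme` argument are hypothetical helpers introduced here, not part of the course material.

```python
def encode_binary(labels, positive, scheme="01"):
    """Map each label to 1 if it equals `positive`, otherwise to 0 or -1."""
    if scheme not in ("01", "pm1"):
        raise ValueError("scheme must be '01' or 'pm1'")
    negative_code = 0 if scheme == "01" else -1
    return [1 if label == positive else negative_code for label in labels]


def encode_multiclass(labels, classes):
    """Map each label to its 1-based position in `classes`, i.e. y in {1, ..., K}."""
    code = {name: k for k, name in enumerate(classes, start=1)}
    return [code[label] for label in labels]


emails = ["Spam", "Not Spam", "Spam"]
print(encode_binary(emails, positive="Spam"))                # [1, 0, 1]
print(encode_binary(emails, positive="Spam", scheme="pm1"))  # [1, -1, 1]

topics = ["Sport", "Political news", "Weather"]  # classes 1, 2, 3
print(encode_multiclass(["Weather", "Sport"], topics))       # [3, 1]
```

The choice between {0, 1} and {-1, 1} does not change the problem; it only changes how the loss and the decision rule are written down later.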

Iris flower dataset

https://en.wikipedia.org/wiki/Iris_flower_data_set

Iris flower dataset

q This dataset records different types of Irises and gives measurements of the widths and lengths of parts of their flowers.
q The biologists recorded the width and the length of the different leaves (petals and sepals) and then wrote down the corresponding type of Iris.

Learn more about Iris …
[Figure: Iris flower with the Petal Leaf and the Sepal Leaf labelled]

Learn more about Iris …

q For each Iris the biologists picked, they noted the petal
length and width, and the sepal length and width.
q They also noted the type of Iris:

§ Setosa Iris
§ Versicolor Iris
§ Virginica Iris
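One way to picture a single record is as a feature vector x with four measurements plus a class label y. The sketch below is a hypothetical representation (the `IrisSample` class and the `SPECIES` list are names chosen here for illustration), and the three rows are illustrative values in centimetres that should be checked against the dataset itself before being relied on.

```python
from dataclasses import dataclass

SPECIES = ["Setosa", "Versicolor", "Virginica"]  # encoded as classes 1, 2, 3


@dataclass
class IrisSample:
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str

    def features(self):
        """Return the input x as a list of the four measurements."""
        return [self.sepal_length, self.sepal_width,
                self.petal_length, self.petal_width]

    def label(self):
        """Return the class label y in {1, 2, 3}."""
        return SPECIES.index(self.species) + 1


# Illustrative rows (assumed values, one per type of Iris).
samples = [
    IrisSample(5.1, 3.5, 1.4, 0.2, "Setosa"),
    IrisSample(7.0, 3.2, 4.7, 1.4, "Versicolor"),
    IrisSample(6.3, 3.3, 6.0, 2.5, "Virginica"),
]

X = [s.features() for s in samples]  # inputs x(n)
y = [s.label() for s in samples]     # labels y(n)
print(X[0], y[0])
```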

Iris dataset
To simplify, we can plot just two of the features.

Iris dataset
q When I pick a flower with a sepal length of 6 and a sepal width of 4 (the point marked x on the plot), the task is to decide which type of Iris it is.
q In the real world:
§ more dimensions,
§ noisier data,
§ more overlap between the classes.

Generative vs. Discriminative
q Most learning algorithms fall into two categories:
§ Discriminative learning algorithms

§ Generative learning algorithms

Generative vs. Discriminative

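q In a Generative Probabilistic Model:
§ We model p(x|y), the probability of x given y.
§ This is the density of the feature vector x within class y.
q In a Discriminative Probabilistic Model:
§ We model p(y|x), the probability of y given x.
§ For example, when classifying among C1, …, C5, this is the probability of being in class y given the features x.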
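The difference between the two quantities can be seen on a small table of counts. The data below is a made-up toy example (a single categorical feature with values "long" and "short", and two classes C1 and C2), and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: (feature value x, class label y).
data = (
    [("long", "C1")] * 3 + [("short", "C1")] * 1 +
    [("long", "C2")] * 3 + [("short", "C2")] * 3
)

joint = Counter(data)                    # counts of (x, y) pairs
count_y = Counter(y for _, y in data)    # counts per class
count_x = Counter(x for x, _ in data)    # counts per feature value


def p_x_given_y(x, y):
    """Generative view: how likely is feature value x inside class y?"""
    return joint[(x, y)] / count_y[y]


def p_y_given_x(y, x):
    """Discriminative view: how likely is class y once we have seen x?"""
    return joint[(x, y)] / count_x[x]


print(p_x_given_y("long", "C1"))  # 0.75 (3 of the 4 C1 examples are long)
print(p_y_given_x("C1", "long"))  # 0.5  (3 of the 6 long examples are C1)
```

The two numbers differ because they condition on different things: the first divides by the size of the class, the second by the number of examples sharing the feature value.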

Generative vs. Discriminative
q Generative learning algorithms may work better if we have very
few training examples.
q They are very simple and very quick to implement and also very
quick to run.
q Because they are so efficient, they often scale very easily even
to massive datasets.
q The best thing to do sometimes isn’t to overthink or to
overdesign the algorithm but rather to implement something
quick and then to iterate to improve it.
q We take as our example the generative learning algorithm Naive Bayes, which is often a good candidate for a quick implementation.

Generative vs. Discriminative
Given a training set ...

Discriminative Model
q For example:
§ Logistic Regression
§ (which we may fit with Gradient Descent)
q What a discriminative learning algorithm such as Logistic Regression does is search for a straight line that separates the two classes.
q If we initialize the logistic regression parameters randomly (see next figure), we can then start applying gradient descent.
q After some iterations, one update after another, we obtain a decision boundary.
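The procedure described above (random initialization, then repeated gradient descent updates until a separating line emerges) can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the learning rate, the number of iterations, and the six 2-D training points are all made up, and plain batch gradient descent on the logistic loss is assumed.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_regression(X, y, lr=0.1, n_iters=2000, seed=0):
    """Batch gradient descent on the average logistic loss."""
    rng = random.Random(seed)
    d, n = len(X[0]), len(X)
    w = [rng.uniform(-0.01, 0.01) for _ in range(d)]  # small random start
    b = 0.0
    for _ in range(n_iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the loss w.r.t. the score
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predict class 1 when the estimated p(y=1|x) is at least 0.5."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


# Made-up, linearly separable toy points (two features each).
X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [4.0, 4.5], [5.0, 4.0], [4.5, 5.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic_regression(X, y)
print([predict(w, b, x) for x in X])
```

The decision boundary is the set of points where w·x + b = 0, which in two dimensions is the straight line the slides refer to.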

Discriminative Model
We are basically looking at all of our data and trying to find a straight line that separates the Malignant Tumors from the Benign Tumors.
[Figure: Malignant Tumors and Benign Tumors]

Generative Model
q Let’s focus on the Benign Tumors to start (look at blue
circles).
q We build a model of what Benign Tumors look like.
q You see that most Benign Tumors tend to lie in the blue region of space.

[Figure: Benign Tumors]

Generative Model
q We then turn our attention to the Malignant Tumors.
q We try to build a model of what Malignant Tumors look like in that region of space.
q You see that most Malignant Tumors tend to lie in the red region of space.

[Figure: Malignant Tumors]


Generative Model
If a new patient comes in with features x1 and x2 …

Generative Model
q What a generative algorithm does is build a model of each of the two classes and then make classification predictions by:
« looking at the new example and comparing it to the two models to see whether it looks more like a Benign or a Malignant Tumor »
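A minimal sketch of this idea follows. The slides do not say which model is used for each class, so this example assumes one simple choice: an independent Gaussian per feature within each class. The training points are made up, and all function names are hypothetical.

```python
import math


def fit_class_model(points):
    """Estimate a per-feature mean and variance for one class (assumed Gaussian)."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    variances = [
        sum((p[j] - means[j]) ** 2 for p in points) / n + 1e-9  # avoid zero variance
        for j in range(d)
    ]
    return means, variances


def log_density(x, model):
    """log p(x|y) under independent Gaussians, one per feature."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        for xj, m, v in zip(x, means, variances)
    )


def classify(x, models, priors):
    """Pick the class whose model explains x best: argmax of log p(x|y) + log p(y)."""
    scores = {
        label: log_density(x, models[label]) + math.log(priors[label])
        for label in models
    }
    return max(scores, key=scores.get)


# Made-up training points with features (x1, x2).
benign = [[1.0, 1.2], [1.5, 0.8], [2.0, 1.5], [1.2, 2.0]]
malignant = [[4.0, 4.2], [4.5, 3.8], [5.0, 4.5], [4.2, 5.0]]

models = {"benign": fit_class_model(benign), "malignant": fit_class_model(malignant)}
n_total = len(benign) + len(malignant)
priors = {"benign": len(benign) / n_total, "malignant": len(malignant) / n_total}

print(classify([1.3, 1.1], models, priors))
print(classify([4.6, 4.4], models, priors))
```

Each class is modelled separately, without looking at the other class; the comparison only happens at prediction time.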

Let’s formalize ...

q Discriminative Learning Algorithm:
§ Learns p(y|x), the « probability of y given x ».
§ Logistic Regression uses the sigmoid function to estimate it directly.
q Generative Learning Algorithm:
§ Learns p(x|y) and p(y), the « class prior », i.e. p(y=0) or p(y=1).
§ y: the class label, which indicates whether a tumor is malignant or benign.
§ x: the features.
q Suppose we are given p(x|y) and p(y).
q Given a new example x, by Bayes' rule:
$$p(y=1 \mid x) = \frac{p(x \mid y=1)\, p(y=1)}{p(x)}$$
where
$$p(x) = \sum_{y} p(x, y) = p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0)$$

Example

Let’s compute
$$p(y=1 \mid x) = \frac{p(x \mid y=1)\, p(y=1)}{p(x \mid y=1)\, p(y=1) + p(x \mid y=0)\, p(y=0)} = \frac{0.03}{0.03 + 0.01} = 0.75$$
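The same computation can be checked numerically. The sketch below assumes the two products read off this worked example, p(x|y=1) p(y=1) = 0.03 and p(x|y=0) p(y=0) = 0.01; the individual likelihoods and priors are not given here, so the hypothetical helper `posterior` takes the products directly.

```python
def posterior(joint):
    """Normalize p(x|y) * p(y) over the classes to get p(y|x) (Bayes' rule).

    `joint` maps each class y to p(x|y) * p(y) for one fixed example x.
    """
    evidence = sum(joint.values())  # p(x) = sum over y of p(x, y)
    return {y: value / evidence for y, value in joint.items()}


# Products assumed from the worked example: class 1 -> 0.03, class 0 -> 0.01.
joint = {1: 0.03, 0: 0.01}
post = posterior(joint)
print(round(post[1], 2))  # 0.75
```

Because the denominator is just the sum of the numerators over all classes, the same function works unchanged for more than two classes.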
