Dr. P.

PPD
Motivation

• Classification models aim to predict the labels of a dataset based on the features.
• Regression models aim to predict a number, whereas classification models aim to
predict a state or a category.
• Classification models are often called classifiers. Many classifiers predict one of
two possible states (often yes/no), although it is possible to build classifiers that
predict among a larger number of possible states.
• The following are popular examples of classifiers:
• A recommendation model that predicts whether a user will watch a certain
movie
• An email model that predicts whether an email is spam or ham

Perceptron model or Perceptron classifier or
Perceptron
• A medical model that predicts whether a patient is sick or healthy
• An image-recognition model that predicts whether an image contains an
automobile, a bird, a cat, or a dog
• A voice recognition model that predicts whether the user said a particular
command
• A perceptron is the building block of neural networks.
• We will develop the perceptron algorithm in two ways:
• Using a trick that we can iterate many times.
• Defining an error function that we can minimize using gradient descent.

Perceptron

• One example of a classification model is sentiment analysis.


• In sentiment analysis, the goal of the model is to predict the sentiment of
a sentence.
• In other words, the model predicts whether the sentence is happy or sad.
• For example, a good sentiment analysis model can predict that the
sentence “I feel wonderful!” is a happy sentence, and that the sentence
“What an awful day!” is a sad sentence.
• How could we build a machine learning model that takes a sentence as
input and, as output, tells us whether the sentence is happy or sad?

Machine Learning model that Classifies
Sentences
• Happy sentences tend to contain happy words, such as wonderful, happy, or joy,
whereas sad sentences tend to contain sad words, such as awful, sad, or
despair.
• A classifier can consist of a score for every single word in the dictionary. Happy
words can be given positive scores, and sad words can be given negative scores.
Neutral words such as “the” can be given a score of zero.
• When a sentence is fed into our classifier, the classifier simply adds the scores
of all the words in the sentence. If the result is positive, then the classifier
concludes that the sentence is happy. If the result is negative, then the classifier
concludes that the sentence is sad.
• The goal now is to find scores for all the words in the dictionary. For this, we use
machine learning.
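
The scoring rule described above can be sketched in a few lines of Python. The word scores below are hypothetical values picked only for illustration (the slides give no numbers), unknown words are assumed to score zero, and a total of exactly zero is assumed to count as happy.

```python
# Hypothetical word scores, chosen only for illustration.
WORD_SCORES = {
    "wonderful": 2.0, "happy": 1.5, "joy": 1.0,    # happy words: positive
    "awful": -2.0, "sad": -1.5, "despair": -1.0,   # sad words: negative
    "the": 0.0,                                    # neutral word: zero
}

def tokenize(sentence):
    """Lowercase each word and strip basic punctuation."""
    cleaned = (w.strip(".,!?\"'").lower() for w in sentence.split())
    return [w for w in cleaned if w]

def sentence_score(sentence, scores=WORD_SCORES):
    """Add the scores of all words; unknown words count as 0 (assumption)."""
    return sum(scores.get(word, 0.0) for word in tokenize(sentence))

def predict(sentence, scores=WORD_SCORES):
    """Non-negative total -> happy, negative total -> sad (tie rule assumed)."""
    return "happy" if sentence_score(sentence, scores) >= 0 else "sad"

print(predict("I feel wonderful!"))
print(predict("What an awful day!"))
```

With these illustrative scores, the two example sentences from the earlier slide come out as happy and sad respectively.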
Perceptron

• The type of model we just built is called a perceptron.


• The process of training a perceptron is called the perceptron
algorithm.
• Idea of the perceptron algorithm:
• To train the model, we first need a dataset containing many sentences
together with their labels (happy/sad).
• We start building our classifier by assigning random scores to all the words.
• Then we go over all the sentences in our dataset several times.
• For every sentence, we slightly tweak the scores so that the classifier
improves the prediction for that sentence.
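
The training loop outlined above might look roughly like the sketch below. The slides do not state the exact tweak at this point, so the update used here (nudge every word of a misclassified sentence up or down by a small step) is an assumption, as are the toy dataset, the step size, the number of passes, and the rule that a score of zero counts as happy.

```python
import random

def tokenize(sentence):
    """Lowercase each word and strip basic punctuation."""
    cleaned = (w.strip(".,!?").lower() for w in sentence.split())
    return [w for w in cleaned if w]

def score(sentence, word_scores):
    return sum(word_scores.get(w, 0.0) for w in tokenize(sentence))

def predict(sentence, word_scores):
    return "happy" if score(sentence, word_scores) >= 0 else "sad"

def train(dataset, epochs=200, step=0.1, seed=0):
    """Return word scores learned from (sentence, label) pairs."""
    rng = random.Random(seed)
    vocabulary = sorted({w for s, _ in dataset for w in tokenize(s)})
    # Start with random scores for all the words.
    word_scores = {w: rng.uniform(-1.0, 1.0) for w in vocabulary}
    for _ in range(epochs):            # go over the dataset several times
        mistakes = 0
        for sentence, label in dataset:
            if predict(sentence, word_scores) == label:
                continue               # already correct: leave scores alone
            mistakes += 1
            # Assumed tweak: push the words toward the correct label.
            direction = 1.0 if label == "happy" else -1.0
            for w in tokenize(sentence):
                word_scores[w] += direction * step
        if mistakes == 0:              # every sentence classified correctly
            break
    return word_scores

# Hypothetical toy dataset, for illustration only.
TOY_DATA = [
    ("wonderful joy", "happy"),
    ("awful despair", "sad"),
    ("wonderful day", "happy"),
    ("awful day", "sad"),
]

learned = train(TOY_DATA)
print({s: predict(s, learned) for s, _ in TOY_DATA})
```

Only misclassified sentences trigger a change, which keeps each tweak small and local to the words that caused the error.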

The problem: We are on an alien planet, and
we don’t know their language!
• We are astronauts and have just landed on a distant planet where a
race of unknown aliens lives. We would love to be able to
communicate with the aliens, but they speak a strange language that
we don’t understand. We notice that the aliens have two moods,
happy and sad. Our first step in communicating with them is to figure
out if they are happy or sad based on what they say. In other words,
we want to build a sentiment analysis classifier. Their language seems
to have only two words: aack and beep. We form the following
dataset with the sentences they say and their moods:

Dataset

• Alien 1: Mood: Happy Sentence: “Aack, aack, aack!”


• Alien 2: Mood: Sad Sentence: “Beep beep!”
• Alien 3: Mood: Happy Sentence: “Aack beep aack!”
• Alien 4: Mood: Sad Sentence: “Aack beep beep beep!”
• All of a sudden, a fifth alien comes in, and it says, “Aack beep aack
aack!”
• We predict that this alien is happy because, even though we don’t know the
language, the word aack seems to appear more often in happy sentences, whereas
the word beep seems to appear more often in sad sentences.
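
The observation about word frequencies can be checked by counting each word per mood in the four-sentence dataset. This is a minimal sketch; the helper names are hypothetical and the punctuation handling is an assumption.

```python
from collections import Counter

# The four labeled sentences from the dataset slide.
DATASET = [
    ("happy", "Aack, aack, aack!"),
    ("sad", "Beep beep!"),
    ("happy", "Aack beep aack!"),
    ("sad", "Aack beep beep beep!"),
]

def words(sentence):
    """Lowercase each word and strip basic punctuation."""
    return [w.strip(".,!?").lower() for w in sentence.split()]

def counts_by_mood(dataset):
    """Count how often each word appears in happy and in sad sentences."""
    counts = {"happy": Counter(), "sad": Counter()}
    for mood, sentence in dataset:
        counts[mood].update(words(sentence))
    return counts

counts = counts_by_mood(DATASET)
print(dict(counts["happy"]))  # aack appears 5 times, beep once
print(dict(counts["sad"]))    # beep appears 5 times, aack once
```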

The classifier - Perceptron

• This observation gives rise to our first sentiment analysis classifier.


• This classifier makes a prediction in the following way: it counts the
number of appearances of the words aack and beep. If the number of
appearances of aack is larger than that of beep, then the classifier
predicts that the sentence is happy.
• If it is smaller, then the classifier predicts that the sentence is sad.
• If the two words appear the same number of times, the classifier predicts
by default that the sentence is happy.

SENTIMENT ANALYSIS CLASSIFIER

• Given a sentence, assign the following scores to the words:


• Scores: Aack: 1 point
• Beep: –1 point
• Rule: Calculate the score of the sentence by adding the scores of all
the words in it, then predict as follows:
• If the score is positive or zero, predict that the sentence is happy.
• If the score is negative, predict that the sentence is sad.
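
As a quick check, this classifier can be applied to the four labeled sentences and to the fifth alien's sentence. The sketch below uses the scores and rule stated above; the function names and the punctuation handling are assumptions.

```python
SCORES = {"aack": 1, "beep": -1}

def words(sentence):
    """Lowercase each word and strip basic punctuation."""
    return [w.strip(".,!?").lower() for w in sentence.split()]

def sentence_score(sentence):
    """Add the scores of all the words in the sentence."""
    return sum(SCORES[w] for w in words(sentence))

def predict(sentence):
    """Positive or zero score -> happy, negative score -> sad."""
    return "happy" if sentence_score(sentence) >= 0 else "sad"

DATASET = [
    ("Aack, aack, aack!", "happy"),
    ("Beep beep!", "sad"),
    ("Aack beep aack!", "happy"),
    ("Aack beep beep beep!", "sad"),
]

for sentence, mood in DATASET:
    print(sentence, sentence_score(sentence), predict(sentence), mood)

# The fifth alien: three aacks and one beep give a score of 2.
print(predict("Aack beep aack aack!"))
```

The classifier agrees with all four labels and predicts that the fifth alien is happy.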

• In fact, the line formed by all the sentences in which aack and beep appear
the same number of times divides these two regions (happy and sad), as shown
in the figure.
• This line has the following equation: #aack = #beep
• Or equivalently, this equation: #aack – #beep = 0

• We use x with different subscripts to indicate the number of appearances of a
word in a sentence. In this case, x_aack is the number of times the word
aack appears, and x_beep is the number of times the word beep appears.
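
Writing the subscripts explicitly, the score, the boundary line, and the prediction rule from the earlier slides can be restated as follows. The symbol \hat{y} for the prediction is an assumption introduced here for illustration.

```latex
\begin{align*}
\text{score} &= 1 \cdot x_{\text{aack}} + (-1) \cdot x_{\text{beep}}
              = x_{\text{aack}} - x_{\text{beep}} \\
\text{boundary:}\quad & x_{\text{aack}} - x_{\text{beep}} = 0 \\
\hat{y} &=
\begin{cases}
\text{happy} & \text{if } x_{\text{aack}} - x_{\text{beep}} \ge 0 \\
\text{sad}   & \text{if } x_{\text{aack}} - x_{\text{beep}} < 0
\end{cases}
\end{align*}
```

For the fifth alien's sentence, x_aack = 3 and x_beep = 1, so the score is 3 – 1 = 2, which is not negative, and the prediction is happy.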

SENTIMENT ANALYSIS CLASSIFIER

• Given a sentence, assign the following scores to the words:
• Scores: Crack: 1 point
• Doink: 1 point
• Rule: Calculate the score of the sentence by adding the scores of all
the words in it.
• If the score is four or more, predict that the sentence is happy.
• If the score is three or less, predict that the sentence is sad.
• To make it simpler, let’s slightly change the rule by using a cutoff of 3.5.
Because every score is a whole number, the predictions stay the same: a score
above 3.5 is happy, and a score below 3.5 is sad.
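
A short sketch can confirm that the cutoff of 3.5 gives the same predictions as the "four or more / three or less" rule for whole-number scores. The example sentences are hypothetical, since the dataset for this classifier is not reproduced here, and the function names are assumptions.

```python
SCORES = {"crack": 1, "doink": 1}
CUTOFF = 3.5

def sentence_score(sentence):
    """Add one point for every crack or doink in the sentence."""
    cleaned = (w.strip(".,!?").lower() for w in sentence.split())
    return sum(SCORES.get(w, 0) for w in cleaned)

def predict_integer_rule(score):
    """Original rule: four or more is happy, three or less is sad."""
    return "happy" if score >= 4 else "sad"

def predict_cutoff_rule(score, cutoff=CUTOFF):
    """Simplified rule: above the cutoff is happy, below it is sad."""
    return "happy" if score > cutoff else "sad"

# Both rules agree for every whole-number score checked here.
print(all(predict_integer_rule(s) == predict_cutoff_rule(s) for s in range(11)))

# Hypothetical sentences, for illustration only.
print(predict_cutoff_rule(sentence_score("Crack doink crack doink!")))
print(predict_cutoff_rule(sentence_score("Crack doink!")))
```

Unlike the aack/beep classifier, both words here carry the same positive score, so the decision depends only on how many words the sentence contains relative to the cutoff.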
