Classification Annotated PDF
Analyzing Sentiment
Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
1 ©2015 Emily Fox & Carlos Guestrin Machine Learning Specialization
Predicting sentiment by topic: An intelligent restaurant review system
Sample review:
"Watching the chefs create incredible edible art made the experience very unique."
[Figure: novel intelligent restaurant review app. All reviews for the restaurant are summarized by topic: Experience ★★★★, Ramen ★★★, Sushi ★★★★★]
Sentence Sentiment Classifier → Sushi sentiment
Sentences from the reviews:
• All the sushi was delicious.
• The sushi was amazing, and the rice is just outstanding.
• The service is somewhat hectic.
• Easily best sushi in Seattle.
[Figure: the sentences that mention sushi are passed to the sentence sentiment classifier; "Easily best sushi in Seattle." is highlighted in the output.]
Classifier applications
Input: x (sentence from review) → Classifier MODEL → Output: y (predicted class)
Input: x    Webpage
Output: y   Education, Finance, Technology
Spam filtering
Input: x    Text of email, sender, IP,…
Output: y   Spam or Not spam
Image classification
Input: x    Image pixels
Output: y   Predicted object
Personalized medical diagnosis
Input: x → Disease Classifier MODEL → Output: y
Output: y   Healthy, Cold, Flu, Pneumonia, …
“Hammer”
“House”
Input: x (sentence from review) → Classifier MODEL → Output: y (predicted class)
Decision boundary example
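Word       Weight
awesome     1.0
awful      -1.5
Score(x) = 1.0 #awesome - 1.5 #awful
[Figure: plot with #awesome on the horizontal axis and #awful on the vertical axis. The decision boundary, where Score(x) = 0, separates the region with Score(x) > 0 from the region with Score(x) < 0.]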
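The scoring rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names `score` and `predict`, the simple regex tokenizer, the example sentence, and the choice to predict -1 when the score is exactly 0 are all assumptions made here. Only the two weights come from the slide.

```python
import re

# Word weights taken from the slide; every other word implicitly has weight 0.
WEIGHTS = {"awesome": 1.0, "awful": -1.5}

def score(sentence, weights=WEIGHTS):
    """Weighted word count: Score(x) = sum over words of weight * count."""
    # Assumed tokenizer: lowercase, keep runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(weights.get(word, 0.0) for word in words)

def predict(sentence, weights=WEIGHTS):
    """Return +1 if Score(x) > 0, otherwise -1 (ties go to -1 by assumption)."""
    return 1 if score(sentence, weights) > 0 else -1

# Hypothetical sentence: two "awesome" and one "awful".
example = "The sushi was awesome, the rice was awesome, but the service was awful."
print(score(example))    # 2 * 1.0 - 1 * 1.5 = 0.5
print(predict(example))  # 1
```

With two occurrences of "awesome" and one of "awful", the point (2, 1) lies on the Score(x) > 0 side of the boundary, so the sentence is predicted positive.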
Decision boundary separates
positive & negative predictions
• For linear classifiers:
- When 2 weights are non-zero → line
- When 3 weights are non-zero → plane
- When many weights are non-zero → hyperplane
• For more general classifiers → more complicated shapes
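For the two-weight example (awesome 1.0, awful -1.5), the line can be written out explicitly by setting the score to zero. The rearrangement below is a worked step added here for illustration, using the slide's own weights:

```latex
\[
\text{Score}(x) = 1.0\,\#\text{awesome} - 1.5\,\#\text{awful} = 0
\quad\Longleftrightarrow\quad
\#\text{awful} = \tfrac{2}{3}\,\#\text{awesome}
\]
```

Sentences with fewer than 2/3 as many "awful" as "awesome" fall on the Score(x) > 0 side of this line, and sentences with more fall on the Score(x) < 0 side.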
Data (x,y): (Sentence1, label), (Sentence2, label), …
• Training set → learn classifier
• Test set → evaluate
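One way to produce the two sets is a random split of the labeled sentences. The sketch below is an assumption-laden illustration: the helper name `train_test_split`, the 40% test fraction, the fixed seed, and the example labels (+1 positive, -1 negative) are all hypothetical, and the last sentence is invented for the example. The slides do not say how the split is made.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle (sentence, label) pairs and split them into training and test sets."""
    shuffled = list(examples)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled data: +1 = positive sentiment, -1 = negative sentiment.
data = [
    ("All the sushi was delicious.", 1),
    ("Easily best sushi in Seattle.", 1),
    ("The rice is just outstanding.", 1),
    ("The service is somewhat hectic.", -1),
    ("The ramen was awful.", -1),
]

train_set, test_set = train_test_split(data, test_fraction=0.4)
print(len(train_set), len(test_set))  # 3 2
```

The classifier is learned only from `train_set`; `test_set` is held back so the evaluation reflects sentences the classifier has not seen.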
Classification error
[Figure: each test example, e.g. (Sushi was great, label) or (Food was OK, label), is shown to the learned classifier with its label hidden. The prediction ŷ is then compared with the true label and counted as either correct or a mistake.]
Classification error & accuracy
• Error measures fraction of mistakes
error = (# of mistakes) / (total # of test examples)
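As a minimal sketch, the error can be computed by comparing predictions with true labels on the test set. Accuracy is then the complementary fraction of correct predictions, 1 - error. The function name `classification_error` and the four example labels are hypothetical.

```python
def classification_error(true_labels, predicted_labels):
    """Fraction of test examples where the prediction differs from the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    mistakes = sum(1 for y, y_hat in zip(true_labels, predicted_labels) if y != y_hat)
    return mistakes / len(true_labels)

# Hypothetical test set of four sentences: +1 = positive, -1 = negative.
y_true = [1, -1, 1, 1]
y_pred = [1, 1, 1, -1]

error = classification_error(y_true, y_pred)
accuracy = 1 - error
print(error, accuracy)  # 0.5 0.5
```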
                        Predicted label
True label    Positive               Negative
Positive      True Positive          False Negative (FN)
Negative      False Positive (FP)    True Negative
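The four cells of this table can be counted directly from true and predicted labels. The sketch below is illustrative only: the helper name `confusion_counts`, the +1/-1 label encoding, and the five example labels are assumptions.

```python
from collections import Counter

def confusion_counts(true_labels, predicted_labels, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for y, y_hat in zip(true_labels, predicted_labels):
        if y == positive:
            # True label is positive: predicted positive is TP, else FN.
            counts["TP" if y_hat == positive else "FN"] += 1
        else:
            # True label is negative: predicted positive is FP, else TN.
            counts["FP" if y_hat == positive else "TN"] += 1
    return counts

# Hypothetical labels: +1 = positive, -1 = negative.
y_true = [1, -1, 1, 1, -1]
y_pred = [1, 1, 1, -1, -1]

counts = confusion_counts(y_true, y_pred)
print(counts["TP"], counts["FN"], counts["FP"], counts["TN"])  # 2 1 1 1
```

The mistakes counted by the classification error are exactly the FP and FN cells, here 2 out of 5 examples.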
Cost of different types of mistakes can be different (& high) in some applications
                  Spam filtering    Medical diagnosis
False negative    Annoying          Disease not treated
False positive    Email lost        Wasteful treatment
[Figure: confusion matrix of true label against predicted label (True Positive, False Positive (FP), False Negative (FN), True Negative), followed by a version with the true labels Healthy, Cold, Flu.]
• In practice:
- More complex models require more data
- Empirical analysis can provide guidance
[Figure labels: bias of model; classifier based on single words]
P(y|x), where y is the output label and x is the input sentence
Extremely useful in practice
Summary of classification