A18 CU6051NA A2 CW Coursework 16034872 Anjil Shrestha

SENTIMENT ANALYSIS
SENTIMENT ANALYSIS USING NAÏVE BAYES CLASSIFIER

AI, ML, NLP & SENTIMENT ANALYSIS
• Artificial Intelligence is the ability of a machine or software to perceive its environment and
take actions, comparable to human behavior, that have a high chance of success.
• Machine learning is the subfield of Artificial Intelligence that allows software applications to
automatically learn and improve from experience without being explicitly programmed.
• Natural Language Processing (NLP) is a fundamental element of AI for communicating with an
intelligent system using natural language.
• Sentiment analysis is one of the applications of NLP: the process of determining whether a
piece of writing is positive, negative or neutral.
MACHINE LEARNING AND LEXICON BASED APPROACHES
• Rule-based systems
• Automatic systems
• Hybrid systems
RESEARCH
• Pang and Lee have described the existing techniques and approaches for opinion-oriented
information retrieval.
• In other research, the authors used web blogs to construct corpora for sentiment analysis and
used emoticons assigned to blog posts as indicators of users’ mood. SVM and CRF were used.
• Alec Go and team performed a sentiment search by using Twitter to collect training data.
Various classifiers were applied to a corpus constructed from positive and negative samples
identified by emoticons. Among the classifiers used, the Naïve Bayes classifier obtained the best
result, with accuracy up to 81% on their test set, but the method performed poorly when used
with three classes (“negative”, “positive” and “neutral”).
APPLICATIONS OF SENTIMENT ANALYSIS
• BRAND MONITORING
• CUSTOMER SUPPORT
• CUSTOMER FEEDBACK
• PRODUCT ANALYTICS
• MARKET RESEARCH AND ANALYSIS
• WORKFORCE ANALYTICS & VOICE OF THE EMPLOYEE
• SPAM FILTERING
WHY NAÏVE BAYES CLASSIFIER?
• HIGHLY PRACTICAL METHOD
• FREQUENTLY USED FOR WORKING WITH NATURAL LANGUAGE TEXT DOCUMENTS.
• NAIVE BECAUSE OF THE STRONG INDEPENDENCE ASSUMPTION IT MAKES
• PROBABILISTIC MODEL
• FAST, ACCURATE AND RELIABLE

SOLUTION
SENTIMENT ANALYSIS AND NAÏVE BAYES CLASSIFIER
• Naïve Bayes is a probabilistic algorithm that uses probability theory and Bayes’ theorem to
predict the sentiment of a text.
P(A|B) – posterior
P(A) – prior
P(B) – evidence
P(B|A) – likelihood
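The four terms above are related by Bayes’ theorem. The sketch below states it in standard notation, followed by the form usually applied to text under the naive independence assumption. The class symbol c and the word symbols w_1, ..., w_n are introduced here for illustration and do not appear on the slides.

```latex
% Bayes' theorem: posterior = likelihood * prior / evidence
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Naive assumption: words are conditionally independent given the class c
P(c \mid w_1, \dots, w_n) \propto P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

The evidence term is the same for every class, so it can be dropped when only the most probable class is needed.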
BAG OF WORDS
• Bag of Words (BOW) is a representation of text that describes the occurrence of words in
documents, and a way of extracting features from documents. It involves:
• A vocabulary of known words
• A measure of the presence of known words

Vocabulary (column order): helpful, course, and, material, boring, dont, waste, time, in, this,
useful, content, helped, lot, thanks, a

Review                          Vector                           Label
Helpful course and materials.   1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0  +
Boring.                         0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0  -
Don’t waste time in this.       0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0  -
Useful materials and content.   0 0 1 1 0 0 0 0 0 0 1 1 0 0 0 0  +
Helped a lot. Thanks            0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1  +
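As a minimal sketch of the idea, the following pure-Python snippet builds binary presence vectors for the five example reviews over the slide's 16-word vocabulary. The helper names (tokenize, normalize, bag_of_words) and the plural-stripping rule that maps "materials" onto "material" are assumptions made for illustration, not the pre-processing used in the developed system.

```python
import re

# Vocabulary in the column order used on the slide.
VOCABULARY = [
    "helpful", "course", "and", "material", "boring", "dont", "waste", "time",
    "in", "this", "useful", "content", "helped", "lot", "thanks", "a",
]


def tokenize(text):
    """Lowercase, drop apostrophes, and split on anything that is not a letter."""
    cleaned = text.lower().replace("'", "").replace("\u2019", "")
    return [tok for tok in re.split(r"[^a-z]+", cleaned) if tok]


def normalize(token, vocabulary):
    """Assumed rule: map a simple plural onto a singular that is in the vocabulary."""
    if token not in vocabulary and token.endswith("s") and token[:-1] in vocabulary:
        return token[:-1]
    return token


def bag_of_words(text, vocabulary=VOCABULARY):
    """Return a binary presence vector (1 = word occurs) over the vocabulary."""
    present = {normalize(tok, vocabulary) for tok in tokenize(text)}
    return [1 if word in present else 0 for word in vocabulary]


reviews = [
    "Helpful course and materials.",
    "Boring.",
    "Don\u2019t waste time in this.",
    "Useful materials and content.",
    "Helped a lot. Thanks",
]

for review in reviews:
    print(bag_of_words(review), review)
```

Each vector has one position per vocabulary word, so every review maps to a fixed-length feature vector regardless of its own length.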
DEVELOPMENT
The Naïve Bayes classifier is implemented in Python, which
provides many libraries for data pre-processing, NLP and
machine learning. The libraries used are listed below:
• Pandas
• NumPy
• Scikit-learn
• NLTK
• Regex
The end product is a web app developed with the Django framework.

DEVELOPED SYSTEM
Startup page
• Options to train dataset
for sentiment prediction
DEVELOPED SYSTEM CONTINUED..
Training page
• The training process is
carried out in
the backend.
Page after model is trained successfully
• Shows accuracy score
• Text box to enter
review for sentiment
prediction
Test of positive review
• Testing the working of the
system by entering a
positive review.
Result of positive review
• Result is displayed at the
bottom and the
output is as expected.
Test of negative review
• Testing the working of the
system by entering a
negative review.
Result of negative review
• Result is displayed at the
bottom and the
output is as expected.
Floating navigation
button for visualization
• Option to open
visualization page.
Visualization page
• Bar diagram showing
total reviews made on
12 test courses.
Visualization page
• Total positive,
negative and neutral
reviews. (Extracted
from data set)
PSEUDO CODE
IMPORT NECESSARY LIBRARIES (PANDAS, SKLEARN, NLTK TOOLS)
COLLECT LABELED TRAINING DATASETS
READ DATASET AND SEPARATE SENTIMENT TEXT AND ITS SENTIMENT LABEL.
DATAFRAME = PANDAS.READ_CSV(“TRAINING DATA”)
X = DATAFRAME.SENTIMENTTEXT
Y = DATAFRAME.SENTIMENTLABEL
SPLIT X AND Y INTO TRAINING AND TESTING SET
X_TRAIN,X_TEST,Y_TRAIN,Y_TEST=TRAIN_TEST_SPLIT(X,Y,TEST_SIZE=0.2,RANDOM_STATE=1)
PERFORM DATA PRE-PROCESSING USING COUNTVECTORIZER.
REMOVE STOPWORDS.
TOKENIZATION.
IGNORING CASE AND PUNCTUATION
STRIP WHITE SPACE.
REMOVE NUMBERS AND OTHER CHARACTERS
VECTORIZER=COUNTVECTORIZER()
X_TRAIN_VECTORS=VECTORIZER.FIT_TRANSFORM(X_TRAIN)
TRAIN THE MODEL ON TRAINING SET
MODEL=NAIVE_BAYES.MULTINOMIALNB()
MODEL.FIT(X_TRAIN_VECTORS,Y_TRAIN)
MAKE THE PREDICTION ON TESTING SET
MY_TEST_DATA=['THIS IS REALLY GOOD','THIS WAS BAD']
MY_VECTORIZER=VECTORIZER.TRANSFORM(MY_TEST_DATA)
MODEL.PREDICT(MY_VECTORIZER)
COMPARE ACTUAL RESPONSE VALUE WITH THE PREDICTED RESPONSE VALUE.
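
The pseudo code above could be realized roughly as follows with pandas and scikit-learn. This is a sketch under stated assumptions, not the code of the developed system: the small in-memory data frame stands in for the training CSV, and the column names SentimentText and Sentiment, the label strings, and the use of the built-in English stop-word list are hypothetical choices made for illustration.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in for the labelled training CSV (pd.read_csv in practice).
dataframe = pd.DataFrame({
    "SentimentText": [
        "good course", "good material", "good content", "good teacher", "good examples",
        "bad course", "bad material", "bad content", "bad teacher", "bad examples",
    ],
    "Sentiment": ["positive"] * 5 + ["negative"] * 5,
})

# Separate the review text from its sentiment label.
X = dataframe["SentimentText"]
y = dataframe["Sentiment"]

# Hold out 20% of the rows for testing, as in the pseudo code.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

# CountVectorizer lowercases, tokenizes and drops punctuation by default;
# stop_words="english" additionally removes common English words.
vectorizer = CountVectorizer(stop_words="english")
X_train_counts = vectorizer.fit_transform(X_train)  # learn vocabulary on training text only
X_test_counts = vectorizer.transform(X_test)        # reuse the same vocabulary

# Train the multinomial Naive Bayes model on the word counts.
model = MultinomialNB()
model.fit(X_train_counts, y_train)

# Compare actual labels with predicted labels on the held-out set.
accuracy = accuracy_score(y_test, model.predict(X_test_counts))

# Predict the sentiment of new, unseen reviews.
my_test_data = ["This is really good", "This was bad"]
predictions = model.predict(vectorizer.transform(my_test_data))
print(accuracy, list(predictions))
```

The vectorizer is fitted on the training split only and then reused for the test split and for new reviews, so all three are encoded against the same vocabulary.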

FLOWCHART
THE END
