
SENTIMENT ANALYSIS

Sentiment Analysis Using Naïve Bayes Classifier


AI, ML, NLP & SENTIMENT ANALYSIS
 Artificial Intelligence is the ability of a machine or software to perceive its environment
and take actions that resemble human behavior and have a high chance of success.
 Machine learning is the subfield of Artificial Intelligence that allows software applications
to automatically learn and improve from experience without being explicitly programmed.
 Natural Language Processing (NLP) is a fundamental element of AI for communicating with an
intelligent system using natural language.
 Sentiment analysis is one of the applications of NLP. It is the process of determining
whether a piece of writing is positive, negative or neutral.
MACHINE LEARNING AND LEXICON BASED APPROACHES
 Rule-based systems
 Automatic systems
 Hybrid systems

RESEARCH
• Pang and Lee have described the existing techniques and approaches for
opinion-oriented information retrieval.
• In another study, the authors used web-blogs to construct corpora for sentiment
analysis and used emoticons assigned to blog posts as indicators of users’ mood,
using SVM and CRF.
• Alec Go and team performed a sentiment search by using Twitter to collect
training data. Various classifiers were used on a corpus constructed from positive
and negative samples identified by emoticons. Among the classifiers used, the
Naïve Bayes classifier obtained the best result, with accuracy up to 81% on their
test set, but this method performed poorly when used with three classes
(“negative”, “positive” and “neutral”).
APPLICATIONS OF SENTIMENT ANALYSIS

• Brand Monitoring
• Customer Support
• Customer Feedback
• Product Analytics
• Market Research and Analysis
• Workforce Analytics & Voice of the Employee
• Spam filtering
WHY NAÏVE BAYES CLASSIFIER?

• Highly practical method


• Frequently used for working with natural language text documents.
• Naive because of the strong independence assumption it makes
• Probabilistic model
• Fast, accurate and reliable
SOLUTION
SENTIMENT ANALYSIS AND NAÏVE BAYES CLASSIFIER

 Naïve Bayes is a probabilistic algorithm that takes advantage of probability theory and
Bayes’ theorem to predict sentiment of a text.

P(c|x) = P(x|c) · P(c) / P(x)

P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
P(c) is the prior probability of the class.
P(x|c) is the likelihood, which is the probability of the predictor given the class.
P(x) is the prior probability of the predictor.
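
As an illustration of how these four quantities combine, here is a minimal sketch in plain Python. The tiny training set, the whitespace tokenization and the add-one (Laplace) smoothing are assumptions made for this example and are not taken from the slides.

```python
from collections import Counter
import math

def train(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def posterior(text, class_counts, word_counts, vocab):
    """Return P(c|x) for every class c via Bayes' theorem."""
    total_docs = sum(class_counts.values())
    log_scores = {}
    for c in class_counts:
        # log P(c): prior probability of the class
        score = math.log(class_counts[c] / total_docs)
        total_words = sum(word_counts[c].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # log P(word|c): likelihood with add-one smoothing (an assumption)
            score += math.log((word_counts[c][word] + 1) / (total_words + len(vocab)))
        log_scores[c] = score
    # P(x) is the sum of P(x|c) * P(c) over all classes; dividing by it normalizes
    evidence = sum(math.exp(s) for s in log_scores.values())
    return {c: math.exp(s) / evidence for c, s in log_scores.items()}

# Hypothetical training reviews, labeled "+" (positive) or "-" (negative)
docs = ["helpful course", "useful content", "boring", "waste of time"]
labels = ["+", "-", "-", "-"]
labels[1] = "+"  # two positive and two negative reviews
model = train(docs, labels)
probs = posterior("helpful content", *model)
print(probs)
```

The words are treated as independent given the class, which is the "naive" assumption: the likelihood P(x|c) is simply the product of the per-word probabilities.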
BAG OF WORDS
Bag of Words (BOW) is a representation of text that describes the occurrence of
words in a document. It is a way of extracting features from documents and involves:
• A vocabulary of known words
• A measure of the presence of known words

Review                         helpful  course  and  material  boring  dont  waste  time  in  this  useful  content  helped  lot  thanks  a  Label
Helpful course and materials.  1        1       1    1         0       0     0      0     0   0     0       0        0       0    0       0  +
Boring.                        0        0       0    0         1       0     0      0     0   0     0       0        0       0    0       0  -
Don’t waste time in this.      0        0       0    0         0       1     1      1     1   1     0       0        0       0    0       0  -
Useful materials and content.  0        0       1    1         0       0     0      0     0   0     1       1        0       0    0       0  +
Helped a lot. Thanks           0        0       0    0         0       0     0      0     0   0     0       0        1       1    1       1  +
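
The two ingredients above (a vocabulary and a presence measure) can be sketched in a few lines of plain Python. This is only an illustration using the five example reviews. The `tokenize` helper is hypothetical, no stemming is applied (so the vocabulary contains "materials" rather than "material"), and words are ordered by first appearance.

```python
import re

reviews = [
    "Helpful course and materials.",
    "Boring.",
    "Don't waste time in this.",
    "Useful materials and content.",
    "Helped a lot. Thanks",
]

def tokenize(text):
    """Lowercase the text, drop apostrophes, and keep runs of letters."""
    return re.findall(r"[a-z]+", text.lower().replace("'", ""))

# Vocabulary of known words, in order of first appearance
vocabulary = []
for review in reviews:
    for word in tokenize(review):
        if word not in vocabulary:
            vocabulary.append(word)

# Presence (1) or absence (0) of each vocabulary word in each review
vectors = [
    [1 if word in tokenize(review) else 0 for word in vocabulary]
    for review in reviews
]

print(vocabulary)
for review, vector in zip(reviews, vectors):
    print(vector, review)
```

In the development stack listed later, scikit-learn's `CountVectorizer` plays this role and stores word counts rather than only presence flags.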
DEVELOPMENT

The Naïve Bayes classifier is implemented in Python, which provides many
libraries for data pre-processing, NLP and machine learning. The libraries
used are listed below:

• Pandas
• NumPy
• Scikit-learn
• NLTK
• Regex

The end product is a web app developed using the Django framework.
DEVELOPED SYSTEM

Startup page
• Options to train dataset
for sentiment prediction
DEVELOPED SYSTEM CONTINUED..

Training page
• The training process is carried out in the backend.
DEVELOPED SYSTEM CONTINUED..

Page after the model is trained successfully
• Shows accuracy score
• Text box to enter a review for sentiment prediction
DEVELOPED SYSTEM CONTINUED..

Test of positive review
• Testing the system by entering a positive review.
DEVELOPED SYSTEM CONTINUED..

Result of positive review
• The result is displayed at the bottom and the output is as expected.
DEVELOPED SYSTEM CONTINUED..

Test of negative review
• Testing the system by entering a negative review.
DEVELOPED SYSTEM CONTINUED..

Result of negative review
• The result is displayed at the bottom and the output is as expected.
DEVELOPED SYSTEM CONTINUED..

Floating navigation
button for visualization
• Option to open
visualization page.
DEVELOPED SYSTEM CONTINUED..

Visualization page
• Bar diagram showing
total reviews made on
12 test courses.
DEVELOPED SYSTEM CONTINUED..

Visualization page
• Total positive,
negative and neutral
reviews. (Extracted
from data set)
PSEUDO CODE
Import necessary libraries (pandas, sklearn, nltk tools)

Collect labeled training datasets

Read dataset and separate sentiment text and its sentiment label.

dataframe = pandas.read_csv("training data")

X = dataframe.sentimentText

y = dataframe.sentimentLabel

Split X and y into training and testing set

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

Perform data pre-processing using CountVectorizer.
• Remove stopwords.
• Tokenization.
• Ignore case and punctuation.
• Strip white space.
• Remove numbers and other characters.

vectorizer = CountVectorizer(stop_words='english')

X_train = vectorizer.fit_transform(X_train)

X_test = vectorizer.transform(X_test)

Train the model on training set

model = naive_bayes.MultinomialNB()

model.fit(X_train, y_train)

Make the prediction on testing set

my_test_data = ['This is really good', 'This was bad']

my_vectorizer = vectorizer.transform(my_test_data)

model.predict(my_vectorizer)

Compare actual response value with the predicted response value.

accuracy_score(y_test, model.predict(X_test))
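
The steps above can be put together as a small runnable sketch with scikit-learn. The in-memory review list below is a hypothetical stand-in for the labeled training CSV, and the `stratify` argument is an addition for this tiny sample. Neither comes from the original project.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical labeled data standing in for the training CSV
texts = [
    "Helpful course and good materials", "Really good and useful content",
    "Helped a lot, thanks", "Great teacher, good examples",
    "Good pace and clear lectures", "Boring and bad course",
    "Don't waste time in this", "Bad audio and poor content",
    "Very bad, not useful at all", "Poor examples, boring lectures",
]
labels = ["positive"] * 5 + ["negative"] * 5

# Split into training and testing sets (stratify keeps both classes in each part)
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=1, stratify=labels
)

# Pre-processing: lowercase, tokenize, drop punctuation and English stopwords
vectorizer = CountVectorizer(lowercase=True, stop_words="english")
X_train_counts = vectorizer.fit_transform(X_train)  # learn vocabulary on training data only
X_test_counts = vectorizer.transform(X_test)        # reuse the same vocabulary

# Train the multinomial Naive Bayes model
model = MultinomialNB()
model.fit(X_train_counts, y_train)

# Compare actual labels with predicted labels on the held-out set
accuracy = accuracy_score(y_test, model.predict(X_test_counts))
print("accuracy:", accuracy)

# Predict the sentiment of new reviews
my_test_data = ["This is really good", "This was bad"]
predictions = model.predict(vectorizer.transform(my_test_data))
print(list(zip(my_test_data, predictions)))
```

Fitting the vectorizer on the training split only keeps test-set words out of the vocabulary, so the accuracy score reflects how the model handles unseen reviews. With a sample this small the score itself is not meaningful.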


FLOWCHART
THE END
