TWITTER SENTIMENT ANALYSIS

ROBIN RAINA
ABOUT (TWITTER)

• An online social networking and microblogging service.
• Enables users to send and read “tweets”, which are text
messages limited to 140 characters and hence concise.
• 500 million tweets daily from 240+ million active users.
• The audience ranges from ordinary people to celebrities.
• Users discuss current affairs and share personal views.
INTRODUCTION ( SENTIMENT ANALYSIS )

• Sentiment analysis is the process of determining whether a
piece of writing is positive, negative or neutral.
• Mechanism to extract opinions, emotions and sentiments in
text.
SYSTEM FLOW DIAGRAM FOR TWITTER
SENTIMENT ANALYSIS
CHALLENGES

• Tweets are highly unstructured and often ungrammatical.
• Out-of-vocabulary words.
• Lexical variation.
DOCUMENT PREPROCESSING
TOKENIZATION

• Tokenization is the process of
breaking a stream of text up into
words, phrases, symbols, or other
meaningful elements called
tokens.
• The list of tokens becomes input for
further processing such as
parsing or text mining.
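The tokenization step described above can be sketched as follows. This is a minimal illustration using a regular expression, not the tokenizer used in the project; the pattern and the function name `tokenize` are assumptions chosen for the example.

```python
import re

# Alternatives are tried left to right, so URLs, mentions and hashtags
# are matched before plain words. The pattern is an illustrative assumption.
TOKEN_PATTERN = re.compile(
    r"https?://\S+"      # URLs
    r"|[@#]\w+"          # @mentions and #hashtags
    r"|\w+(?:'\w+)?"     # words, optionally with an apostrophe (e.g. don't)
    r"|[^\w\s]"          # any other single non-space symbol
)


def tokenize(tweet):
    """Split a tweet into a list of lowercase tokens."""
    return TOKEN_PATTERN.findall(tweet.lower())


print(tokenize("Loving the new phone! #happy @robin"))
# ['loving', 'the', 'new', 'phone', '!', '#happy', '@robin']
```

Keeping hashtags and mentions as single tokens is one possible design choice; they often carry sentiment or topic information that would be lost if the symbol were split off.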
LEMMATIZATION

• In linguistics, lemmatization is the
process of grouping
together the different inflected forms
of a word so they can be
analyzed as a single item.
• In computational linguistics,
lemmatization is the algorithmic
process of determining the lemma for
a given word.
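As a rough illustration of the idea, the sketch below maps inflected forms to a lemma with a small lookup table. The table `LEMMA_TABLE` is hypothetical and covers only a handful of words; a real lemmatizer relies on a full dictionary and part-of-speech information.

```python
# Hypothetical lookup table mapping inflected forms to their lemma.
LEMMA_TABLE = {
    "am": "be", "is": "be", "are": "be", "was": "be", "were": "be",
    "better": "good", "best": "good",
    "movies": "movie",
    "loved": "love", "loving": "love", "loves": "love",
}


def lemmatize(tokens):
    """Replace each token by its lemma; unknown tokens are left unchanged."""
    return [LEMMA_TABLE.get(token, token) for token in tokens]


print(lemmatize(["the", "movies", "were", "better"]))
# ['the', 'movie', 'be', 'good']
```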
NAIVE BAYES CLASSIFIERS

• Naive Bayes classifiers are a collection of classification algorithms based on Bayes’
Theorem.
• Bayes’ Theorem finds the probability of an event occurring given the probability of
another event that has already occurred.
• P(A|B) = \frac{P(B|A) P(A)}{P(B)}
• Basically, we are trying to find the probability of event A, given that event B is true. Event
B is also termed the evidence.
• It is a simple probabilistic classifier that calculates a set of probabilities by counting the
frequency and combinations of values in a given dataset.
• It is well suited to classifying tweets.
• It is known to achieve good precision and recall on text classification tasks.
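The counting approach described above can be sketched as a small multinomial Naive Bayes classifier. This is a minimal illustration under stated assumptions, not the project's implementation: the class name `NaiveBayes`, the add-one (Laplace) smoothing parameter `alpha`, and the toy training tweets are all invented for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength (assumed)
        self.class_counts = Counter()            # tweets seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocabulary = set()

    def fit(self, documents, labels):
        for tokens, label in zip(documents, labels):
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log P(class) + sum of log P(token | class)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_docs = [
    ["love", "this", "phone"], ["great", "movie"],
    ["hate", "this", "phone"], ["terrible", "movie"],
    ["watching", "a", "movie"], ["bought", "a", "phone"],
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayes()
model.fit(train_docs, train_labels)
print(model.predict(["love", "great"]))     # positive
print(model.predict(["hate", "terrible"]))  # negative
```

Working with log probabilities avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps a token that never occurred in a class from driving that class's probability to zero.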
PROBLEM STATEMENT
• People express opinions in complex ways.
• In opinion texts, lexical content alone can be misleading.
• Intra-textual and sub-sentential reversals, negation and topic changes are
common.
• Rhetorical devices and modes such as sarcasm, irony and implication.
• Tweets are unstructured and often ungrammatical.
• Lexical variation.
• Out-of-vocabulary words.
• Extensive usage of acronyms such as asap, lol and afaik.
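Two of the problems listed above, acronyms and negation, are often softened with a normalization pass before classification. The sketch below shows one common heuristic and is not taken from the project: the tables `ACRONYMS` and `NEGATIONS` and the `NOT_` prefix scheme are assumptions made for the example. Sarcasm and irony are not addressed by this kind of rule.

```python
# Hypothetical acronym table; a real system would need a much larger one.
ACRONYMS = {
    "asap": "as soon as possible",
    "lol": "laughing out loud",
    "afaik": "as far as i know",
}
NEGATIONS = {"not", "no", "never"}
CLAUSE_END = {".", ",", "!", "?"}


def normalize(tokens):
    """Expand acronyms, then mark tokens that follow a negation word."""
    expanded = []
    for token in tokens:
        expanded.extend(ACRONYMS.get(token, token).split())

    result = []
    negate = False
    for token in expanded:
        if token in NEGATIONS:
            negate = True            # start marking the following tokens
            result.append(token)
        elif token in CLAUSE_END:
            negate = False           # negation scope ends at punctuation
            result.append(token)
        else:
            result.append("NOT_" + token if negate else token)
    return result


print(normalize(["not", "good", ",", "lol"]))
# ['not', 'NOT_good', ',', 'laughing', 'out', 'loud']
```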
OBJECTIVE

The main objective is to
connect to Twitter, search
for tweets that contain a
particular keyword, and then
evaluate the polarity of those
tweets as positive, negative or
neutral.
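The flow stated in the objective can be sketched end to end as below. No real Twitter connection is made: an in-memory list stands in for retrieved tweets, and `placeholder_classify` is a trivial stand-in for the trained classifier. All function names here are hypothetical.

```python
from collections import Counter


def search_tweets(tweets, keyword):
    """Return the tweets that contain the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [tweet for tweet in tweets if keyword in tweet.lower()]


def polarity_summary(tweets, keyword, classify):
    """Count how many matching tweets fall into each polarity class."""
    matching = search_tweets(tweets, keyword)
    return Counter(classify(tweet) for tweet in matching)


def placeholder_classify(tweet):
    """Stand-in classifier; a trained model would replace this."""
    text = tweet.lower()
    if "love" in text:
        return "positive"
    if "hate" in text:
        return "negative"
    return "neutral"


# Sample tweets invented for illustration.
sample_tweets = [
    "I love the new phone",
    "I hate the new phone battery",
    "The phone arrives tomorrow",
    "Great weather today",
]

summary = polarity_summary(sample_tweets, "phone", placeholder_classify)
print(dict(summary))
```

Passing the classifier in as a function keeps the search and counting logic independent of which model is eventually used.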
SCOPE OF THE PROJECT
• This project will help political parties review public reaction to
programs they are planning or have already carried out.
• Companies can review reactions to their newly
released hardware or software products.
• Movie makers can gather reviews of their currently running
movies.
• By analyzing the tweets we can find out how positive,
negative or neutral people are about a topic.
REQUIREMENT ANALYSIS
1. FUNCTIONAL REQUIREMENTS -
Functional requirements are the functions or features that must be included in a system to satisfy
the business needs and be acceptable to the users.
-The system should be able to process new tweets stored in the database after retrieval.
-The system should be able to analyze the data and classify the polarity of each tweet.

2. NON-FUNCTIONAL REQUIREMENTS -
Non-functional requirements describe the features, characteristics and attributes of the system
as well as any constraints that may limit the boundaries of the proposed system.
-The system should be user friendly.
-The system should provide high accuracy.
-The system should perform with efficient throughput and response time.
DATASET
• A dataset is a collection of
data or related information
that is composed of separate
elements.

• The dataset used in this
project consists of tweets.
CONCLUSION
• In this study I was able to classify tweets as positive,
negative or neutral using the Naive Bayes algorithm for
sentiment analysis, and achieved a test
accuracy of 90%.

• Research results show that machine learning methods such as
SVM and Naive Bayes achieve the highest precision and can be
regarded as the standard learning techniques.
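For reference, the accuracy, precision and recall figures mentioned in this deck can be computed as in the sketch below. The labels used here are made up to show the arithmetic and are not the project's test results; the function name `evaluate` is hypothetical.

```python
def evaluate(y_true, y_pred, target_label):
    """Return (accuracy, precision, recall), with precision and recall
    computed for target_label against all other classes."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == target_label and p == target_label)
    false_pos = sum(1 for t, p in pairs if t != target_label and p == target_label)
    false_neg = sum(1 for t, p in pairs if t == target_label and p != target_label)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return accuracy, precision, recall


# Illustrative labels only, not measured data.
y_true = ["positive", "positive", "negative", "neutral", "negative"]
y_pred = ["positive", "negative", "negative", "neutral", "positive"]

print(evaluate(y_true, y_pred, "positive"))
# (0.6, 0.5, 0.5)
```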
THANK YOU!
