
Sentiment Analysis on Twitter Data

Aman Pratap Singh (2028003)
Amarnath Patro (2028004)
Arpita Kushwaha (2028009)
Harsh Rai (2028018)
Parag Dhama (2028025)
Sunidhi Mohapatra (2028037)
1. Acknowledgements
We would like to thank our supervisor, N Biraja Isac, for
bringing the weight of his considerable experience and
knowledge to this project and for providing us with all the
necessary resources. His high standards have made us better
at what we do.

We would also like to thank all our teammates for the work and
effort they put into this project; without them it would not
have been a success.

Our parents and friends encouraged us constantly throughout the
process, especially when we felt discouraged or frustrated,
because they knew how much work went into this venture. We
extend our thanks to them too.

Above all, we would like to thank the Great Almighty for
always blessing us.

2. Abstract
Social networking sites such as Facebook, Twitter and Instagram
have become key sources of information. By extracting and
analyzing data from these sites, a business can improve its
product marketing. Twitter is one of the most popular sites
where people express their feelings about, and reviews of, a
particular product. In our work, we use Twitter data to analyze
public views towards a product. We collect tweets from Twitter
and compare them against a predefined set of words, each of
which is mapped to the emotion it conveys. Our NLP-based program
analyzes each tweet against this word list, and the results for
the raw data are then plotted as a sentiment analysis graph.
3. Introduction
Twitter allows businesses to engage personally with consumers.
However, there’s so much data on Twitter that it can be hard for
brands to prioritize which tweets or mentions to respond to first.

That's why sentiment analysis has become a key instrument in
social media marketing strategies.

Sentiment analysis is a tool that automatically monitors
emotions in conversations on social media platforms.

Carefully listening to the voice of the customer on Twitter using
sentiment analysis allows companies to understand their
audience, keep on top of what’s being said about their brand –
and their competitors – and discover new trends in the industry.

This report explains how sentiment analysis tools can be used to
listen to customers on Twitter and describes how we performed
sentiment analysis in a few simple steps.

Sentiment analysis is the automated process of identifying and
classifying subjective information in text data. This might be an
opinion, a judgment, or a feeling about a particular topic or
product feature.

The most common type of sentiment analysis is ‘polarity
detection’ and involves classifying statements as Positive,
Negative or Neutral.

4. Basic Concepts
4.1 Cleaning the Text:
 The text is cleaned in two main steps: first, we convert all
characters to lowercase; second, we remove unwanted characters
such as punctuation.
 Lowercasing is needed because, in NLP, words such as “Data”
and “data” are otherwise treated as different by the computer.
 Unwanted characters such as !#@$%^& do not convey any
sentiment and are therefore removed.
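
The two cleaning steps can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name clean_text is hypothetical, and it assumes that "unwanted characters" means the ASCII punctuation listed in string.punctuation.

```python
import string


def clean_text(text: str) -> str:
    """Lowercase the text, then delete every ASCII punctuation character."""
    lowered = text.lower()
    # The third argument of str.maketrans lists characters to delete.
    delete_punctuation = str.maketrans("", "", string.punctuation)
    return lowered.translate(delete_punctuation)


cleaned = clean_text("Data is GREAT!!! #happy @user")
print(cleaned)  # data is great happy user
```

After cleaning, "Data" and "data" become the same string, which is what the later word lookup relies on.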

4.2 Tokenization:
 The sentence is split into words, which are stored in a word
list.
 Stop-words such as I, am, with, me and myself are then removed
from this word list (these words do not add any sentiment to a
sentence) to form the final word list.
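
A possible sketch of this step is shown below. The names tokenize and remove_stop_words are hypothetical, and the stop-word set is only an illustrative subset: it contains the five words named above plus a few common English ones added as an assumption. A real run would use a much longer list.

```python
# Illustrative subset only; "the", "a", "is", "to" are assumed additions.
STOP_WORDS = {"i", "am", "with", "me", "myself", "the", "a", "is", "to"}


def tokenize(cleaned_text: str) -> list[str]:
    """Split already-cleaned (lowercase, punctuation-free) text on whitespace."""
    return cleaned_text.split()


def remove_stop_words(tokens: list[str], stop_words: set[str] = STOP_WORDS) -> list[str]:
    """Keep only the tokens that are not stop-words, preserving their order."""
    return [token for token in tokens if token not in stop_words]


tokens = tokenize("i am happy with myself today")
final_words = remove_stop_words(tokens)
print(final_words)  # ['happy', 'today']
```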

4.3 Natural Language Processing Emotion Algorithm:
 For each word in the final word list, check whether it is
also present in the Emotion.txt file.
 Each matching emotion is added to an initially empty emotion
list, and the emotions are counted using Counter from Python's
collections package.
 Steps:
1. Open the Emotion.txt file
2. Loop through each line and clean it
3. Extract the word and emotion using the split() method
4. If the word is present in the final word list, add the
emotion to emotion_list
5. Finally, count each emotion in the emotion list
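
The steps above can be sketched as follows. The layout of the emotion file is not given in this report, so the sketch assumes one entry per line in the form 'word': 'emotion', (quotes and trailing comma optional). The function name collect_emotions and the sample lines are hypothetical.

```python
from collections import Counter


def collect_emotions(lines, final_words):
    """Return the emotion of every file entry whose word is in final_words."""
    wanted = set(final_words)
    emotion_list = []
    for line in lines:
        # Step 2: strip the newline, quotes and trailing comma from the line.
        clear_line = line.replace("\n", "").replace(",", "").replace("'", "").strip()
        if ":" not in clear_line:
            continue  # skip blank or malformed lines
        # Step 3: split the entry into its word and its emotion.
        word, emotion = (part.strip() for part in clear_line.split(":", 1))
        # Step 4: keep the emotion only if the word occurs in the text.
        if word in wanted:
            emotion_list.append(emotion)
    return emotion_list


# Assumed file content, inlined here so the example runs without a file.
sample_lines = [
    "'happy': 'happy',\n",
    "'furious': 'angry',\n",
    "'miserable': 'sad',\n",
    "'cheerful': 'happy',\n",
]
emotion_list = collect_emotions(sample_lines, ["cheerful", "happy", "furious", "today"])
counts = Counter(emotion_list)  # Step 5: count each emotion
print(counts)  # Counter({'happy': 2, 'angry': 1})
```

With a real file, the same function can be given the open file object, for example inside `with open("Emotion.txt") as f:`, because iterating over a file yields its lines. Note that this loop records each file entry at most once, so a word repeated in the text does not add to the count.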

4.4 Plotting the Sentiments:
 The emotion counts are displayed in a bar graph using
matplotlib.pyplot.
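
One way to draw such a bar graph is sketched below, assuming the emotion counts are held in a collections.Counter. The helper names bar_data and plot_emotions and the output file name graph.png are hypothetical.

```python
from collections import Counter


def bar_data(counts: Counter):
    """Return (labels, heights) ordered from most to least frequent emotion."""
    ordered = counts.most_common()
    labels = [emotion for emotion, _ in ordered]
    heights = [n for _, n in ordered]
    return labels, heights


def plot_emotions(counts: Counter, out_path: str = "graph.png") -> None:
    """Save a bar chart with one bar per emotion to out_path."""
    # Imported here so bar_data stays usable without Matplotlib installed.
    import matplotlib.pyplot as plt

    labels, heights = bar_data(counts)
    fig, ax = plt.subplots()
    ax.bar(labels, heights)
    ax.set_xlabel("Emotion")
    ax.set_ylabel("Count")
    fig.autofmt_xdate()  # rotate tick labels so long emotion names fit
    fig.savefig(out_path)
    plt.close(fig)
```

Calling plot_emotions(Counter({"happy": 2, "angry": 1})) would write a two-bar chart to graph.png in the current directory.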

5. Problem Statement / Requirement Specifications

5.1 Project Planning:
 The objective of the project is to analyze public sentiment
on a topic on Twitter.
 Sentiment analysis focuses on the polarity of a text
(positive, negative, neutral), but it can also go beyond these
three polarities to detect specific feelings and emotions (such
as angry, happy, sad), urgency (urgent, not urgent) and even
intentions (interested, not interested).
 In its simplest form, NLP-based sentiment analysis compares
how often positive words or phrases are used relative to
negative ones. The result is a numeric value, usually expressed
as a percentage, obtained without asking people to complete a
survey. In addition to this single result, sentiment analysis
can be applied continuously to reveal trends in the metric.
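
As a rough illustration of how word counts become a percentage, the sketch below reports the share of positive words among all sentiment-bearing words. The word sets and the function name positive_share are hypothetical, and this particular formula is one simple choice among several, not necessarily the metric used in the project.

```python
# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "poor", "hate", "sad"}


def positive_share(words):
    """Return the percentage of positive words among positive and negative
    words, or None when the text contains neither kind."""
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    total = positive + negative
    if total == 0:
        return None  # no sentiment-bearing words, so no meaningful score
    return 100.0 * positive / total


score = positive_share(["great", "phone", "love", "it", "bad", "battery"])
print(round(score, 1))  # 66.7
```

Computing this value for each day's tweets and plotting it over time would give the kind of trend mentioned above.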

5.2 Project Analysis:
1. Sorting Data at Scale:
Thousands of tweets, customer support conversations and
surveys amount to far too much business data to sort through
manually. Sentiment analysis helps businesses process huge
amounts of unstructured data in an efficient and
cost-effective way.

2. Real-Time Analysis:
Sentiment analysis can identify critical issues in real time,
for example:
Is an election crisis on social media escalating?
Is an angry customer about to churn?
Sentiment analysis models can help us identify these kinds of
situations immediately, so that we can take action right away.

3. Consistent Criteria:
It’s estimated that people only agree around 60-65% of the
time when determining the sentiment of a particular text.
Tagging text by sentiment is highly subjective, influenced
by personal experiences, thoughts, and beliefs. An automated
sentiment analysis model applies the same criteria to every
text, which makes the results more consistent.
6. Implementation
Methodology
To achieve this, we follow these steps:

 A thorough study of existing approaches and techniques in the
field of sentiment analysis
 Collection of related data from Twitter with the help of the
Twitter API
 Preprocessing of the collected Twitter data so that it is fit
for mining
 Building classifiers based on different supervised machine
learning techniques
 Training and testing the built classifiers using large datasets
 Computing the results of the different classifiers on the
dataset collected from Twitter
 Comparing the results of each classifier and plotting a graph
that shows the trend of positive and negative sentiment for
different political parties
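
The report does not name the supervised techniques used, so the sketch below picks one common choice as an assumption: a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python to keep it self-contained. The class name and the tiny labelled data set are hypothetical; a real run would train on a large labelled corpus and repeat the train-and-test step for each classifier being compared.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.word_counts = defaultdict(Counter)      # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)    # log prior
            n_words = sum(self.word_counts[label].values())
            for word in doc.split():
                if word not in self.vocab:
                    continue                         # ignore unseen words
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already-cleaned training and test tweets.
train_docs = ["love this phone", "great battery life",
              "hate the screen", "terrible battery"]
train_labels = ["positive", "positive", "negative", "negative"]
test_docs = ["love the battery life", "terrible screen"]
test_labels = ["positive", "negative"]

model = NaiveBayesSentiment().fit(train_docs, train_labels)
predictions = [model.predict(doc) for doc in test_docs]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

Computing the same accuracy figure for each candidate classifier on the same held-out tweets gives a like-for-like basis for the comparison step.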

Testing
 First, for testing purposes, we created two text files,
emotions.txt and read.txt. read.txt contains random texts or
paragraphs for testing, and emotions.txt contains the emotion
keywords relevant to our project (such as sad, angry, happy and
bored) that a person can feel and that our model can identify.
 After running our algorithm over a set of statements, we get
a graph as output that classifies the words present in the
statements. From it we can tell whether a statement contains
more negative or more positive words, and use that to predict
the sentiment of the person.
 We first test on a small data set to verify that the model is
working properly. The sentiments are classified and the
graphical output is presented.
INPUT:

OUTPUT:

Result Analysis
INPUT: A large data set.
OUTPUT:
As this example shows, after running our algorithm over a
statement we can clearly see the classification of the words it
contains. The statement mostly contains happy words, followed
by equal numbers of hatred and entitled words.

7. Conclusion and Future Scope


Studies in SA approaches have existed for more than a decade
and now are exploited by enterprises as an important tool for
strategic marketing planning and manoeuvring. This move is
also due to the advancement in data storage, access and
analytics enabled through big data frameworks. However, the
big data frameworks regard SA as just another possible
application that can benefit from their advanced data
analytics. Although several studies examine the challenges of
SA in big data frameworks, such as the volume, velocity and
variety issues, the value, veracity and volatility issues have
not been explored as much, even though taming the data is key
for big data analytics. This report discusses SA
approaches and their suitability for the big data framework. The
ratio of standard SA approaches to the SA approaches in big
data platform is still huge. Implementation and evaluation of the
effectiveness of close monitoring of social customer relationship
management is also still scarce, although adoption of big data
technologies is healthy. Gaps in the existing approaches and
possible future work are suggested according to each of the big
data issues. It is predicted that studies and skills development
on SA on big data platforms for brand monitoring and customer
relationship management will get increasing attention, and that
their growth will be energised by high demand and the promise of
higher revenues for companies. This prediction is supported by
an analysis of current marketing reports, surveys and summits on
SA-based big data analytics for application in customer
behaviour understanding and social network comment analysis
for consumer sentiment. Furthermore, brand management
approaches through SA are expanding and creating a marketing
tsunami in many organisations, which has led companies to shift
focus towards personalisation and consumer-centric
engagement.
The era of getting valuable insights from surveys and social
media has peaked due to the advancement of technology.
Therefore, it is time for businesses to be in touch with the pulse
of what the customers are feeling. Companies are using
intelligent classifiers like contextual semantic search and
sentiment analysis to leverage the power of data and get the
deepest insights.
With SA, we can formulate business strategies, exceed customer
expectations, generate leads, build marketing campaigns, and
open up new avenues for growth through natural language
processing solutions.

8. References
https://docs.python.org/3/
https://numpy.org/doc/
https://pandas.pydata.org/docs/
https://scikit-learn.org/

9. Individual Contribution

Abstract and Introduction by Aman Pratap Singh
Basic concepts by Amarnath Patro
Testing by Parag Dhama
Implementation by Harsh Rai
Conclusion by Sunidhi Mohapatra
Future scope by Arpita Kushwaha
