Twitter Sentiment Analysis

Mirza Babar Barlas
Department of Master in Data Science
Shaheed Zulfikar Ali Bhutto Institute of Science and Technology
Karachi, Pakistan
mbhbarlas@gmail.com

Salman Sadruddin
Department of Master in Data Science
Shaheed Zulfikar Ali Bhutto Institute of Science and Technology
Karachi, Pakistan
salmanvadsarya@gmail.com

Abstract— Sentiment analysis is the process of determining whether the sentiment expressed in a text is positive, negative or neutral. It is also called opinion mining: inferring the sentiment or frame of mind of a user, or the polarity of the material or opinions. The growth of social media platforms has engaged a huge number of users. On social networking sites like Twitter, millions of people share their thoughts every day as tweets. Because tweets are limited to a small number of characters, they are comparatively easy to analyze for sentiment. Approximately 550 million tweets are posted on Twitter daily. Twitter also represents people of all age groups and offers a fair representation of gender. Therefore, sentiment analysis of Twitter data reflects, to some extent, the general sentiment of society. This paper aims to build a model that performs sentiment analysis of people's opinions related to an ongoing trend. Machine learning tools and techniques are used to analyze people's sentiments. Each tweet is classified as positive, negative, or neutral using TextBlob in Python, based on the polarity of its sentiment. Sentiment polarity reflects the emotions of the user, such as anger, sadness, happiness and joy. The proposed mechanism has been implemented in Python. This paper also presents the sentiment analysis types and techniques used to extract sentiment from tweets, and visualizes the results in Tableau.

Keywords— Social media analytics, Sentiment analysis, Twitter, Tableau

I. INTRODUCTION

Microblogging and social media sites like WhatsApp, Instagram, Twitter, Facebook and YouTube are a rich source of varied kinds of information. They are places where people usually post opinions, feedback and comments based on their experience, which can be negative, positive or neutral. Many upcoming organizations require feedback on their products in order to improve them or to build new features or add-ons. Most of the time, organizations analyze the feedback they receive on social media and respond to it, so the challenge is to analyze or detect people's sentiments globally and then act on that feedback. Twitter is one of the most famous platforms for users to communicate with people. Sentiment analysis, also known as opinion mining, classifies specific words as positive, negative or neutral.

Most of the data available on social networks is unstructured. Such unstructured data accounts for almost 80% of the data in the world. This makes it difficult to analyze and to gain valuable insight from such data. Sentiment analysis, or opinion mining, is an important technique that helps in detecting people's opinions in social media data. Visualizing this data is the most difficult task in mining unstructured data.

The main focus of this project is to build a classifier, or an analysis, based on tweets from the live APIs available for Twitter. In this paper, sentiments are classified into three categories (positive, negative and neutral) based on their polarity. The polarity of each tweet is calculated using TextBlob, a powerful Python library for calculating sentiment polarity and analyzing the sentiment of tweets. This research was carried out to determine which hashtags receive negative, positive and neutral feedback from people around the globe.

Based on the data source and the output of the machine learning algorithm, we visualize the data in Tableau and Power BI, two powerful data visualization tools. All of this is important for understanding which topics are currently trending around the world and what people are discussing most.

II. LITERATURE REVIEW

Our project covers sentiment analysis of Twitter data, and its visualization, based on the real-time Twitter API. Before developing this project, we reviewed papers on the analysis of Twitter posts using different algorithms and libraries.

Most authors and experts in this field state that the opinions and attitudes expressed on social media platforms are growing in abundance, and that people make many of their decisions based on the views of fellow users. This leads to the creation of an enormous amount of data. Gleaning information from such large stores of data is a big challenge for companies today, which motivates the analysis of data from popular platforms such as Twitter [1].

In paper [2], the authors discuss systems designed to retrieve information from Twitter data and then classify it based on the semantics of the knowledge it contains. Lokmanyathilak Govindan Sankar Selvan and Teng-Sheng Moh developed a framework in paper [3] which makes use of real-time Twitter data
stream, which is cleaned and analyzed, after which fast feedback is obtained through opinion mining.

In paper [4], a web-based tool named SWAB (Social Web Analysis Buddy), which integrates qualitative analysis with large-scale data mining techniques, is proposed. A prototype of this tool is demonstrated by analyzing content posted by students on Twitter.

In paper [5], the authors describe the use of the Stanford CoreNLP and twitter4j libraries to build an application that obtains data in the form of tweets and performs sentiment analysis to display positive or negative tweets on any particular topic using its associated hashtag.

M. Trupthi and others focus on short sentences and entity-level sentiment analysis in paper [6]. The streamed tweets are collected through the Twitter API, stored with their scores and timestamps, and classified as positive, neutral or negative using a standard classifier.

The authors of [7] used TextBlob for pre-processing, polarity, and polarity confidence calculation, and validated the results with SVM and Naïve Bayes using Weka; they reported the highest accuracy for Naïve Bayes at 65.2%, which was 5.1% higher than the SVM accuracy.

Kaur and Sharma [8] analyzed people's sentiments regarding coronavirus disease (COVID-19). For this purpose, the Twitter API was used to collect tweets related to the coronavirus, and positive, negative and neutral emotions were analyzed using machine learning approaches and tools. In addition, the NLTK library was used to preprocess the fetched tweets and TextBlob was used to analyze them, after which the results were shown as positive, negative and neutral sentiments through different visualizations.

As a comparison, the results of the study are also compared with TextBlob sentiment analysis, a sentiment analyzer built on the Natural Language Toolkit (NLTK) and Pattern [9]. TextBlob can also be used for text mining, text processing in Python, and text analysis. TextBlob also provides simple APIs for common Natural Language Processing (NLP) tasks such as part-of-speech tagging, sentence tokenization, noun phrase extraction, sentiment analysis, classification and translation [10].

Laksono et al. [11] extracted customer reviews from TripAdvisor and used Naive Bayes and TextBlob to determine the sentiment of each comment. The experiment showed that Naive Bayes achieved better accuracy, with 72.06%, while TextBlob obtained an accuracy of only 69.12%.

Hasan [12] and Clavel [12] used machine learning techniques for sentiment analysis. Jianqiang [13] compared sentiment analysis and data mining techniques. Abraham and Putra [14] [15] analyzed people's reviews of mobile applications using various machine learning techniques. Neethu et al. [16] and Groot [17] used machine learning techniques for sentiment analysis.

Prabhsimran Singh, Ravinder Singh Sawhney and Karanjeet Singh Kahlon [18] examined the government's demonetization policy from the ordinary person's viewpoint using sentiment analysis of Twitter data. Tweets were collected using a specific hashtag (#demonetization). The analysis was based on geolocation (tweets were collected state-wise). They used the sentiment analysis API from MeaningCloud and classified the states into six categories: happy, sad, very sad, very happy, neutral, and no data.

III. METHODOLOGY
In this paper, live Twitter data is classified based on its sentiment. This is done in 7 phases, as shown in Figure 1.1.

Figure 1.1

Tweet Mining:
In the first phase, tweets are collected by taking hashtags as input. The number of tweets to be considered is specified in the code and restricted to a given range. The tweets are collected online from Twitter's developer APIs using a consumer key. In Python, the tweepy library allows us to collect information from the live API.

Data Cleaning:
In the second phase, we performed the cleaning process on the collected data. For example, null or empty locations are filled with NaN so that they can be handled in the next phase.

Tweet Processing:
In the 3rd phase, to reach the ultimate goal, the individual tweets needed to be cleaned up. To make this easy, we created a Python function "cleantext" and applied it to the "Tweets" to produce the desired results. This user-defined function removes punctuation, links, emojis, and stop words from the tweets in a single run. Additionally, we used an NLP concept known as "Tokenization", a method of splitting a sentence into smaller units called "tokens" so that unnecessary elements can be removed. Another technique worth mentioning is "Lemmatization", the process of returning words to their "base" form. A simple illustration is shown below.
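
The "cleantext" function itself is not reproduced in this paper, so the following is only a minimal sketch of what such a function might look like, using the Python standard library alone. The stop-word list, the regular expressions and the emoji ranges are assumptions made for illustration and are not the original implementation. Lemmatization is left out here; it would typically need an external library such as NLTK.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list (assumption); a real run
# would use a fuller list, for example the English stop words shipped with NLTK.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "this"}

# Links starting with http(s):// or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Rough emoji ranges (assumption): covers the main pictograph blocks, not every emoji.
EMOJI_PATTERN = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]+"
)


def cleantext(tweet: str) -> str:
    """Remove links, emojis, punctuation and stop words from one tweet."""
    text = URL_PATTERN.sub(" ", tweet)      # drop links first, before punctuation goes
    text = EMOJI_PATTERN.sub(" ", text)     # drop emojis
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.lower().split()           # simple whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(cleantext("Loving the new phone!!! 😀 https://t.co/abc #happy"))
```

Links are removed before punctuation so that the URL is still recognizable when the pattern is applied; reversing the order would leave URL fragments in the text.
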
Data Exploration:
In the 4th phase, we explore the data using the WordCloud and Matplotlib libraries to see which words are most frequently used in the tweets.
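
A word cloud is essentially a picture of word frequencies, so the counting step behind it can be sketched with the standard library. The function name and the sample tweets below are hypothetical and serve only as an illustration of the idea, not as the code used in this project.

```python
from collections import Counter


def top_words(cleaned_tweets, n=10):
    """Return the n most frequent words across already-cleaned tweets."""
    counts = Counter()
    for tweet in cleaned_tweets:
        counts.update(tweet.split())  # one count per whitespace-separated token
    return counts.most_common(n)


if __name__ == "__main__":
    sample = ["good phone", "good battery", "bad phone"]  # made-up example data
    for word, count in top_words(sample, n=3):
        print(word, count)
```

The resulting frequencies could then be passed to a plotting or word cloud library for the visual step.
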

Location Geocoding:
In the 5th phase, we wanted to add a map to the final dashboard showing the number of tweets per country. To do that, Tableau needs basic geographic information that it can recognize. We therefore used the HERE Developer API to return the longitude, latitude and country name for each tweet.
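
The paper does not state which HERE endpoint was called, so the sketch below is an assumption: it targets the HERE Geocoding and Search API (v1 `geocode` endpoint) and assumes the response carries an `items` list whose entries have `position.lat`, `position.lng` and `address.countryName`. The function names and the API key placeholder are hypothetical. The parsing step is kept separate from the network call so that it can be checked without credentials.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the HERE Geocoding and Search API (not confirmed by the paper).
GEOCODE_URL = "https://geocode.search.hereapi.com/v1/geocode"


def build_geocode_url(location, api_key):
    """Build the request URL for a free-text location such as 'Karachi, Pakistan'."""
    return GEOCODE_URL + "?" + urlencode({"q": location, "apiKey": api_key})


def parse_geocode_response(payload):
    """Pick latitude, longitude and country from the first match, or None if no match."""
    items = payload.get("items") or []
    if not items:
        return None
    first = items[0]
    position = first.get("position", {})
    address = first.get("address", {})
    return {
        "latitude": position.get("lat"),
        "longitude": position.get("lng"),
        "country": address.get("countryName"),
    }


def geocode(location, api_key):
    """Call the service over HTTPS (needs network access and a valid key)."""
    with urlopen(build_geocode_url(location, api_key), timeout=10) as response:
        return parse_geocode_response(json.load(response))
```

Tweets whose location cannot be resolved return `None`, which fits the earlier cleaning phase where missing locations are kept as NaN.
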

Sentiments Analysis:
In the 6th phase, the most important stage, we extract the sentiments from the processed tweet data by applying the TextBlob model.
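
The labeling rule used in this paper (polarity equal to zero is neutral, greater than zero is positive, otherwise negative) can be sketched as follows. The function names are hypothetical. The second function assumes the third-party `textblob` package is installed and uses its `sentiment` property, which returns a polarity in the range -1 to 1 and a subjectivity in the range 0 to 1.

```python
def label_from_polarity(polarity: float) -> str:
    """Map a polarity score to the three classes used in this paper."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def tweet_sentiment(cleaned_tweet: str):
    """Return (polarity, subjectivity, label) for one cleaned tweet.

    Assumes the third-party 'textblob' package is available; it is imported
    here so that label_from_polarity can be used without it.
    """
    from textblob import TextBlob

    sentiment = TextBlob(cleaned_tweet).sentiment
    return sentiment.polarity, sentiment.subjectivity, label_from_polarity(sentiment.polarity)


if __name__ == "__main__":
    for score in (0.5, 0.0, -0.25):
        print(score, label_from_polarity(score))
```
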
Dashboard Design:
In the last phase, we designed the dashboard in Tableau and Power BI based on the analyzed dataset.

Figure 1.2 shows the data collected from tweets for the hashtag given in the program. It shows the columns extracted from the live Twitter APIs.

Figure 1.2

Figure 1.3 shows the location geocoding based on the location mentioned in the collected data. Here we used the API for the location geocoding.

Figure 1.3

Figure 1.4 shows the word cloud of the text data for the given hashtag. The figure shows the most frequent words extracted from people's opinions.

Figure 1.4

Figure 1.5 shows the polarity of the processed (cleaned) tweets obtained with the TextBlob model. It shows how subjective each tweet is and what its polarity is.

Figure 1.5

Figure 1.6 shows the counts of positive, negative and neutral tweets based on polarity.

Figure 1.6

Figure 1.7 shows the final dashboard based on the processed tweets and the sentiments extracted from the Twitter data.

Figure 1.7

IV. CONCLUSION
In this paper, we have analyzed the sentiment of tweets and classified it as negative, positive or neutral. The paper also visualizes the data in Tableau to gain insights from it. In the proposed algorithm, the Twitter dataset is fetched from the Twitter API or data source in order to analyze the sentiments and emotions of different users. We then check the sentiment polarity of each tweet. Sentiment polarity reflects users' emotions such as joy, happiness, sadness and anger. If the sentiment polarity is equal to zero, the tweet is neutral; if the polarity is greater than zero, the tweet is positive; otherwise, the tweet is negative. In this manner, the proposed algorithm distinguishes tweets based on the sentiment polarity of each user's tweet.

Future work: In future, we intend to continue this work and propose a new sentiment analysis method for analyzing the sentiment of images and videos in Twitter data, based on different ML and Python algorithms.

REFERENCES

[1] Prakruthi V, Sindhu D and Dr S Anupama Kumar, "Real Time Sentiment Analysis of Twitter Post", 3rd IEEE International Conference on Computational Systems and Information Technology for Sustainable Solutions, 2018.

[2] Asad Masood Khattak, Rabia Batool, Jahanzeb Maqbool and Sungyoung Lee, "Precise Tweet Classification and Sentiment Analysis", Computer and Information Science (ICIS), 2013 IEEE 12th International Conference, 16-20 June 2013, Page: 461 – 466.

[3] Lokmanyathilak Govindan Sankar Selvan, Teng-Sheng Moh, "A Framework for Fast-Feedback Opinion Mining on Twitter Data Streams", Collaboration Technologies and Systems (CTS), 2015 International Conference, 1-5 June 2015, Page: 314 – 318.

[4] Xin Chen, Krishna Madhavan, Mihaela Vorvoreanu, "A Web-Based Tool for Collaborative Social Media Data Analysis", Cloud and Green Computing (CGC), Third International Conference, 30 Sept.-2 Oct. 2013, Germany, Page: 383 – 388.

[5] Hase Sudeep Kisan, Hase Anand Kisan, Aher Priyanka Suresh, "Collective Intelligence & Sentimental Analysis of Twitter Data by Using StandfordNLP Libraries with Software as a Service (SaaS)", Computational Intelligence and Computing Research (ICCIC), IEEE International Conference, 15-17 December 2016.

[6] M. Trupthi, Suresh Pabboju, G. Narasimha, "Sentiment Analysis on Twitter using Streaming API", Advance Computing Conference (IACC), 2017 IEEE 7th International Conference, Jan 2017, Page: 915 – 919.

[7] S. Saha, J. Yadav, and P. Ranjan, "Proposed approach for sarcasm detection in twitter," Indian J. Sci. Technol., vol. 10, 2017.

[8] C. Kaur and A. Sharma, "Twitter Sentiment Analysis on Coronavirus using Textblob," EasyChair 2516-2314, 2020.

[9] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, "Sentiment analysis using common-sense and context information," Computational Intelligence and Neuroscience, vol. 2015, p. 30, 2015.

[10] S. Vijayarani, R. Janani, and others, "Text mining: open source tokenization tools-an analysis," Advanced Computational Intelligence: An International Journal (ACII), vol. 3, no. 1, pp. 37-47, 2016.

[11] R. A. Laksono, K. R. Sungkono, R. Sarno, and C. S. Wahyuni, "Sentiment Analysis of Restaurant Customer Reviews on TripAdvisor using Naive Bayes," in 2019 12th International Conference on Information & Communication Technology and System (ICTS), Surabaya, Indonesia, Jul. 2019, pp. 49-54, doi: 10.1109/ICTS.2019.8850982.

[11] A. Hasan, S. Moin, A. Karim, and S. Shamshirband, "Machine Learning-Based Sentiment Analysis for Twitter Accounts," Math. Comput. Appl., vol. 23, no. 1, p. 11, 2018.

[12] C. Clavel and Z. Callejas, "Sentiment Analysis: From Opinion Mining to Human-Agent Interaction," IEEE Trans. Affect. Comput., vol. 7, no. 1, pp. 74–93, 2016, doi: 10.1109/TAFFC.2015.2444846.

[13] Z. Jianqiang and G. Xiaolin, "Comparison research on text pre-processing methods on twitter sentiment analysis," IEEE Access, vol. 5, pp. 2870–2879, 2017.

[14] M. P. Abraham and K. R. Udaya Kumar Reddy, "Feature-based sentiment analysis of mobile product reviews using machine learning techniques," Int. J. Adv. Trends Comput. Sci. Eng., vol. 9, no. 2, pp. 10.30534/ijatcse/2020/210922020.

[15] R. R. Putra, M. E. Johan, and E. R. Kaburuan, "A Naïve Bayes Sentiment Analysis for Fintech Mobile Application User Review in Indonesia," Int. J. Adv. Trends Comput. Sci. Eng., vol. 8, no. 5, pp. 1856–1860, 2019.

[16] Neethu M S, Rajasree R, "Sentiment Analysis in Twitter using Machine Learning Techniques," 4th Int. Conf. Comput. Commun. Netw. Technol., 2013.

[17] R. De Groot, "Data Mining for Tweet Sentiment Classification," pp. 1-63, 2012.

[18] P. Singh, R. S. Sawhney, and K. S. Kahlon, "Sentiment analysis of demonetization of 500- & 1000-rupee banknotes by Indian government," ICT Express, 2017.
