
Survey on Sentiment Analysis of Tweets Data using Deep Learning and Big Data Approach

Vivek Pandey1, Javed Khan2, Nuzhat Ansari3, Hamza Ansari4

1 Assistant Professor; 2, 3, 4 UG Scholar
Department of Computer Engineering
ARMIET, University of Mumbai, India

Abstract— Social sites like Twitter help millions of people share their thoughts and feelings
about a particular topic. A tweet is a short and simple form of expression. In this review
paper we focus on sentiment analysis of Twitter data. Sentiment analysis is regarded as a
subfield of text data mining and Natural Language Processing, and research on Twitter data
can approach it from several angles. We describe the different levels of sentiment analysis
and the techniques used to extract and process the data, and we present a comparative study
of different approaches and techniques of sentiment analysis that use Twitter as a data
source.

Keywords – Sentiment Analysis, Data Mining, Social Sites, Twitter Dataset, Deep
Learning, Big Data.

Introduction
Social sites such as Twitter, Google+, Instagram, Facebook, and YouTube have gained great
popularity. Sentiment analysis, also known as Opinion Mining, falls under computational
linguistics and data mining. With the growth of social sites, researchers have begun to
analyse public data to study sentiment in areas such as politics, sociology, economy,
entertainment and finance. Sentiment analysis mainly aims to detect the public's mood,
behaviour, sentiments, thoughts, and opinions from the text provided.
Most of the data available on social sites, almost 80%, is unstructured, which makes it more
difficult to analyse and to draw conclusions from.
Making a decision often requires the opinions of many people, especially when the decision
involves valuable resources. People now have new tools to share their ideas through the
WWW. Sentiment analysis concentrates only on detecting polarity, i.e. whether a text is
positive, negative, or neutral. Twitter is a microblogging site that allows people to express
and share their ideas in a large number of short messages, and it is also used for marketing
and networking. For example, film producers may be eager to know the public's opinions about
their movies. Nowadays, gathering opinions and drawing conclusions about people's likes and
dislikes has become an important objective.

I. Level of Analysis

Sentiment Analysis is the area of study that interprets people's thoughts or opinions about
a specific topic. Sentiment Analysis, Opinion Extraction, Opinion Mining, Sentiment Mining,
Affect Analysis, Review Mining, etc. are related names with slightly different tasks. [1]
Levels of Analysis:
Generally, Sentiment Analysis is performed at three main levels:
A. Document Analysis
In this level, we classify whether the complete document gives a positive, negative
or neutral sentiment.
B. Sentence Analysis
This level decides, sentence by sentence, whether each sentence expresses a positive, negative
or neutral opinion. A sentence that does not express any opinion is a neutral sentence. This
level is related to subjectivity classification, which distinguishes sentences that express
factual information from sentences that express subjective views and opinions, e.g. good-bad
terms.
C. Entity/Aspect Analysis
Neither of the above levels reveals exactly what people like and dislike. This level gives a
more thorough analysis. It was earlier called feature level. Rather than examining language
constructs such as documents or sentences, this level directs attention to the opinion or
sentiment itself.
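
The difference between document-level and sentence-level analysis can be illustrated with a
minimal sketch. The word lists below are a hypothetical toy lexicon, and counting lexicon hits
is only a stand-in for whatever classifier a real system would use; the point is that a single
document label can hide mixed opinions that sentence-level labels expose.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}


def polarity(text):
    """Label text positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def document_level(document):
    """One label for the complete document."""
    return polarity(document)


def sentence_level(document):
    """One label per sentence; sentences without opinion words come out neutral."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [(s, polarity(s)) for s in sentences]


review = "The camera is great. The battery is terrible. It ships in a box."
print(document_level(review))
for sentence, label in sentence_level(review):
    print(label, "|", sentence)
```

Here the document as a whole is labelled neutral because the positive and negative words cancel
out, while the sentence-level pass reports one positive, one negative, and one neutral sentence.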

II. Literature Survey

Singh, Prabhsimran, Ravinder Singh Sawhney, and Karanjeet Singh Kahlon. "Sentiment
analysis of demonetization of 500 & 1000-rupee banknotes by Indian government." ICT
Express (2017).[2]
In this paper, the authors examine the Indian government's demonetization policy from the
citizens' point of view. They approach this through Sentiment Analysis of a Twitter data set.
Tweets are collected state-wise, using geo-location, for the analysis. The Sentiment Analysis
is used to classify the country into the categories happy, sad, very sad, neutral, and no
affect. Tweets are collected based on keywords and hashtags such as #demonetization.
Gautam, Geetika, and Divakar Yadav. "Sentiment analysis of twitter data using machine
learning approaches and semantic analysis." Contemporary computing (IC3), 2014 seventh
international conference on. IEEE, 2014.[3]
This paper applies Sentiment Analysis to customer review classification. The authors use
three supervised machine learning methods, Naive Bayes, Maximum Entropy and SVM, followed by
semantic analysis, which was used to calculate similarity alongside all three methods. They
used Python and the Natural Language Toolkit to train the models and classify the data. The
Naive Bayes approach gives a better result than Maximum Entropy and SVM.
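
As an illustration of the kind of supervised classifier discussed here, the following is a
minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is
written from scratch with the standard library and is not the implementation used in the cited
paper; the class name, the whitespace tokenization, and the four training sentences are all
assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny made-up training set, for illustration only.
train_texts = [
    "great phone love it",
    "excellent screen great battery",
    "terrible phone hate it",
    "poor battery bad screen",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("love the great screen"))
print(model.predict("bad phone"))
```

With real data the same interface would be trained on thousands of labelled reviews or tweets,
and the crude `split()` tokenization would be replaced by proper pre-processing.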
Fang, Xing, and Justin Zhan. "Sentiment analysis using product review data." Journal of Big
Data 2.1 (2015).[4]
This paper addresses sentiment polarity categorization, one of the basic problems of
Sentiment Analysis. Online product reviews collected from Amazon.com are used as the data.
Experiments are carried out for both sentence-level and review-level categorization. Naïve
Bayes, Random Forest and SVM are the classification techniques used. The study uses
scikit-learn, an open source machine learning package for Python.
Amolik, Akshay, et al. "Twitter sentiment analysis of movie reviews using machine learning
techniques." International Journal of Engineering and Technology 7.6 (2016). [5]
The authors propose an improved model for Sentiment Analysis of Twitter data about reviews of
upcoming Bollywood and Hollywood movies. Naive Bayes and SVM are used to classify the tweets.
Naive Bayes is better than SVM in precision but slightly lower in accuracy and recall. The
accuracy can be increased by increasing the training data.
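
Because the comparison above rests on accuracy, precision, and recall, a short sketch of how
these metrics are computed may help. The function name and the example labels are hypothetical
and do not reproduce any result from the cited papers; they only show how a cautious classifier
can score high on precision while scoring lower on recall.

```python
def evaluate(y_true, y_pred, positive="positive"):
    """Return (accuracy, precision, recall) with respect to one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Made-up labels: the classifier predicts "positive" only once, and correctly.
y_true = ["positive", "positive", "positive", "negative", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative", "negative", "negative"]

accuracy, precision, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

In this example every predicted positive is correct, so precision is 1.0, but two of the three
true positives are missed, so recall is only one third.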
III. Twitter

Performing Sentiment Analysis on tweets is challenging. Researchers have come up with various
methods to train a model and then test it to check its effectiveness. The aim is to classify
tweets into the correct sentiment classes accurately. A tweet is limited to 280 characters,
which generally results in a sparse set of features.
The words used often differ from standard English dictionary words, and the constantly
evolving use of slang can make an approach outdated.
Twitter also permits user references, URLs, emoticons, and hashtags. These require
different processing than other words.
All of the above are problems faced in the pre-processing stage of the system.
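
One common way to handle user references, URLs, emoticons, and hashtags is to normalize them
into placeholder tokens before feature extraction. The sketch below is an assumption about how
this could be done, not a method taken from the surveyed papers; the placeholder names (USER,
URL, EMO_POS, EMO_NEG) and the small emoticon table are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")

# Small hypothetical emoticon table; a real system would need a fuller list.
EMOTICONS = {
    ":)": "EMO_POS", ":-)": "EMO_POS", ":D": "EMO_POS",
    ":(": "EMO_NEG", ":-(": "EMO_NEG",
}


def normalize_tweet(tweet):
    """Replace Twitter-specific tokens with generic placeholders."""
    tweet = URL_RE.sub("URL", tweet)      # URLs first, so '#' or '@' inside them is not touched
    tweet = MENTION_RE.sub("USER", tweet)  # @alice -> USER
    tweet = HASHTAG_RE.sub(r"\1", tweet)   # #topic -> topic (keep the word, drop the '#')
    tokens = [EMOTICONS.get(tok, tok) for tok in tweet.split()]
    return " ".join(tokens)


print(normalize_tweet("@alice loved #demonetization coverage :) http://example.com/x"))
```

Keeping emoticons as sentiment-bearing placeholders, rather than deleting them as special
characters, preserves a signal that is often useful in such short texts.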

IV. Sentiment Analysis on Dataset

Figure: Sentiment Analysis on Dataset

Data Collection (Input – Keywords):


Choose a subject, collect data related to its keywords, and perform sentiment analysis on
that data.
Retrieval of tweets data:
The data can be of different types: structured, semi-structured and unstructured. R or
Python can be used to collect data from Twitter.

Data Pre-Processing:

Pre-processing filters the data by removing incomplete and noisy records.
The following tasks are involved in pre-processing:
• Removal of retweets
• Removal of special characters and numbers
• Stemming
• Tokenization
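
The pre-processing tasks listed above can be sketched as a small pipeline. This is a minimal
illustration under several assumptions: retweets are detected by the conventional "RT @"
prefix, tokenization is a simple whitespace split, and the `stem` function is a crude suffix
stripper standing in for a real stemmer such as the Porter stemmer. Tokenization is applied
before stemming here because the stemmer works on individual words.

```python
import re


def is_retweet(tweet):
    """Treat tweets that start with the conventional 'RT @' prefix as retweets."""
    return tweet.startswith("RT @")


def clean(tweet):
    """Replace everything except letters and whitespace with a space."""
    return re.sub(r"[^A-Za-z\s]", " ", tweet)


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweets):
    """Drop retweets, then clean, tokenize, and stem the remaining tweets."""
    result = []
    for tweet in tweets:
        if is_retweet(tweet):
            continue
        result.append([stem(tok) for tok in tokenize(clean(tweet))])
    return result


tweets = [
    "RT @news: Banks closed today!!",
    "Waiting 2 hours in queues, loved it... not",
]
print(preprocess(tweets))
```

The naive stemmer produces non-words such as "queu" and "lov"; that is acceptable for matching
word variants against each other, but a lexicon lookup would need stems produced the same way.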
Detection of Sentiment:
The fundamental task in Sentiment Analysis is to classify the polarity of the given tweets.
Polarity identification is done using different lexicons. The polarity is of three types:
Positive, Negative or Neutral.
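
Lexicon-based polarity identification can be sketched as summing per-word scores and mapping
the total to one of the three classes. The lexicon below is a hypothetical miniature example
with made-up weights; published sentiment lexicons are far larger, and the zero threshold is
an assumption.

```python
# Hypothetical miniature lexicon mapping words to polarity weights.
LEXICON = {
    "good": 1, "great": 2, "love": 2, "happy": 1,
    "bad": -1, "terrible": -2, "hate": -2, "sad": -1,
}


def tweet_polarity(tokens, lexicon=LEXICON):
    """Sum the lexicon weights of the tokens and map the total to a class."""
    score = sum(lexicon.get(tok, 0) for tok in tokens)  # unknown words score 0
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


print(tweet_polarity(["love", "the", "new", "notes"]))
print(tweet_polarity(["terrible", "queues", "but", "good", "idea"]))
print(tweet_polarity(["banks", "open", "at", "ten"]))
```

A purely additive score like this ignores negation and sarcasm, which is one reason the
surveyed work also considers trained classifiers.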
Algorithm of Classification:

Output Analysis
After the analysis is done, the result will be in a graphical format.

V. Conclusion
Because of the large number of real-world applications, discovering people's opinions is
important for better decision making, and there is therefore exciting new research in the
field of sentiment analysis. People increasingly express their opinions on the web, which
increases the need to analyse opinionated online content for various real-world
applications. There is considerable scope for improving the existing sentiment analysis
models. In this paper, we have discussed the importance of social network analysis. We
have implemented a Python program to perform sentiment analysis. Support vector machine
was found to be the best data classification technique; applying it to tweets is no
different from applying it to other genres, which can be explored in the future. Our
proposed approach classifies a tweet as positive, negative, or neutral after it has passed
through the pre-processing and classification stages. In this approach, POS tagging and
tweet features give the best result using SVM. Several other machine learning algorithms
may also be useful for solving this type of problem. In future, this technique can be
extended to richer linguistic analysis such as topic modelling and sentiment analysis.

VI. References

[1] Liu, Bing. "Sentiment analysis and opinion mining." Synthesis lectures on human language
technologies 5.1 (2012).
[2] Singh, Prabhsimran, Ravinder Singh Sawhney, and Karanjeet Singh Kahlon. "Sentiment
analysis of demonetization of 500 & 1000-rupee banknotes by Indian government." ICT
Express (2017).

[3] Gautam, Geetika, and Divakar Yadav. "Sentiment analysis of twitter data using machine
learning approaches and semantic analysis." Contemporary computing (IC3), 2014 seventh
international conference on. IEEE, 2014.

[4] Fang, Xing, and Justin Zhan. "Sentiment analysis using product review data." Journal of
Big Data 2.1 (2015).

[5] Amolik, Akshay, et al. "Twitter sentiment analysis of movie reviews using machine
learning techniques." International Journal of Engineering and Technology 7.6 (2016).
