Social Media Sentiment Analysis
Twitter tweets
Saif Ullah
UCP, saifullah27@gmail.com
Abstract - In the past decade, the rapid growth of online social platforms has let people build a global space for communication through different social media applications on the internet. Massive amounts of information are generated on these platforms on a daily basis, and the heavy use of social media is influencing the digital market. Social media monitoring is a new way to learn customers' sentiment, loyalty, attitude and behavior towards a particular brand. The objective of this study is to perform sentiment analysis of the available mobile operating systems in order to learn customers' sentiments and attitudes towards these brands. The following brands are selected: iOS, Android, Windows, BlackBerry and Symbian. However, the content available on social media is unstructured, and analyzing unstructured data is still a challenging task.

Index Terms - twitter, R language, sentiment analysis, machine learning

PROBLEM DEFINITION
Social media consumers and groups generate different sorts of information on social media platforms, and this information should be monitored in a systematic way to measure individuals' sentiments about different brands and products. Text mining is an approach that helps to build valuable business insights through sentiment analysis of content. The content may take different forms, such as documents, tweets, Facebook posts, comments, videos or images.

Several mobile operating systems are available in the market, each with its own functionality, but the following mobile OS are selected for the current study: 1) iOS, 2) Android (Samsung), 3) BlackBerry, 4) Windows and 5) Symbian. iOS comes with a limited set of devices such as the iPod, iPad and iPhone, while Android runs on a huge number of devices from different companies, of which we focus on Samsung. BlackBerry comes with the KeyOne, Passport, etc., and Windows comes with a limited set of mobile phones such as the Lumia. Symbian is no longer a well-established OS today, but we consider it in order to analyze why it failed. All of these operating systems have existed for a long time and are supported by a large number of devices. To measure customers' behavior and sentiments towards these mobile operating systems, this paper presents a sentiment analysis study.

INTRODUCTION
In the last decade, social media platforms grew rapidly and built global communication through different social applications. Massive amounts of information are generated on online social platforms on a daily basis; on Twitter, for example, there are more than 500 million posts per day [1]. Online social networks affect the fields of business, promotion, and web-based business because they reveal customer conduct and response regarding specific business products and services. Organizations are affected by people's opinions and purchase choices, and they now use the content of online social networks to analyze users' behavior before entering the actual market. For social media analytics, posts, comments and feedback are required to draw conclusions. The process of retrieving information from online social networks and analyzing it for business decisions is called social media analytics. Generally, social media analytics techniques are used to discover customer sentiment with the end goal of supporting marketing and customer service activities. Online networks are used naturally and flexibly by organizations and people to learn the current trends in the market. They help organizations to learn clients' perspectives and their remarks on product quality and services, and so to take better decisions for a successful business. The usual goals include increasing revenue, decreasing the cost of customer services, gathering customer feedback on products, and improving customer opinions regarding a specific product [1], [2].

To understand online network analytics, the problem needs to be viewed from two perspectives: the business perspective and the technical perspective. From the business perspective, an organization needs to know current market trends in order to compete and to target customers. In short, an organization must know the right time to introduce its products and services in the market. Moreover, organizations must be aware of the state of the market, that is, whether a similar product already exists or not. If similar products exist, it is essential to know what positive and negative feedback these products have received, so that the organization can take action to improve its own products and services. This process enables the organization to gain an advantage over its competitors.

After releasing a product or service in the market, organizations are interested in validating customers' feedback to learn the customer response towards the products or
services. This includes the number of followers, replies, retweets and reactions to the released product. In the end, it enables organizations to know and understand customer behavior and opinion. The technical perspective covers the difficulties faced during information retrieval from online networks and when filtering positive and negative sentiments based on certain criteria. In addition, the analysis of online networks requires internet access and a large amount of storage to hold the gathered information for further preprocessing and analysis. It also involves techniques for data cleaning and for transforming data from unstructured to structured form.

Determining customers' opinions is not as easy a task as it seems. Identifying customers' behavior towards a particular event or topic of interest requires sentiment analysis. Sentiment analysis determines the polarity of a user's attitude and the emotion expressed in a given sentence. To accomplish this, machine learning and natural language processing (NLP) approaches are used, and developers or analysts face difficulties while applying these approaches to unstructured content. In the last few years, there has been an emerging interest in using these techniques on social media content for sentiment analysis, opinion analysis, observing cohesion within communities, and advertising. Nowadays social media plays an important role in shaping public opinion, so it is essential to use large volumes of data in an efficient way. The primary phase is to read, annotate and limit the dataset to a size that can be easily analyzed.

Section II presents the scope of the study and the research questions. Section III presents the tool used for sentiment analysis. Section IV gives a brief explanation of sentiment analysis. Section V presents the literature review and Section VI the research methodology. Section VII presents the results. The conclusion is given in Section VIII.

SCOPE
Five smartphone operating systems have been selected for the current study. Although Android is used by multiple smartphone companies, we have limited it to Samsung through hashtags. iOS is offered by a single company, Apple Inc., so the iPhone series is selected for iOS. For BlackBerry, the following two models are selected: Passport and KeyOne. For Windows-based mobiles, the Nokia Lumia is selected.

RESEARCH QUESTION
Two major research questions are considered for the selected case study:
1. Explore how user response towards the different OS changes over time.
2. Analyze and conclude which OS attracts more positive or negative sentiment.

TOOL
To extract tweets from Twitter and perform sentiment analysis on them, the tool sentiment viz [20] has been used. The tool is open source and available online. It provides the functionality to extract tweets based on particular keywords and generates multiple results based on a sentiment dictionary. The tool classifies the tweets based on Russell's model of emotional affect [20]. Further, to estimate the sentiment score, the tool applies a machine learning approach, the Naïve Bayes model. Its authors built a sentiment dictionary containing more than ten thousand English terms. Every sentiment term is rated on a scale ranging from 1 to 9.

SENTIMENT ANALYSIS
The accessibility of the Internet, the Web and cell phones has made it possible for an immense number of individuals to connect and communicate with each other through their online network accounts from any place and at any time. Rather than being passive consumers, individuals have begun to become active ones by means of their online networking accounts. Cell phones have only increased this use. A survey conducted in 2015 reported that cell phone ownership had reached 86% among Americans aged 18 to 29 [9]. Social networking sites have long understood this, and everybody can see the consequences. Every online network is available as an application on the Play Store and the Apple App Store, and online networks have become a significant part of our daily lives. Currently, Facebook is the world's largest social network, with more than 1.44 billion users as of 2015 (around 20% of the world's population), followed by Instagram (300 million) and Twitter (284 million). Facebook's users around the globe spend an average of more than 20 minutes per day on the network liking, commenting, and scrolling through status updates, which represents almost 20% of all time spent on the web [8, 9].

This implies that, with this much data accessible online, we can easily track the "trends", "popular thoughts" and "opinions" of individuals. The public nature of user content, for example the number of 'follows', 'likes' and 'comments' on social networks, can give insight into the topics attracting user interest. People have also begun turning to social networks as sources of real-time news and opinions. To support this, platform providers such as Google, Twitter, and Facebook have made it possible to search the enormous volume of public status updates. They have also offered the possibility of accessing public posts through their search APIs, resulting in a burst of business and research efforts to gather information through analysis of the shared content. This has motivated research into content analysis and the application of existing information retrieval and trend detection methods to social media, in order to benefit from the knowledge contained in user-generated content. This analysis offers significant insights into the topics
that attract the attention of a large share of social network users. The general opinions obtained from this analysis are important not only for the public but also for journalists, customer behavior tracking organizations, election outcome prediction, economic prediction and more.

In this study, sentiment analysis is performed on information gathered from Twitter tweets. The term sentiment analysis refers to the process of using a set of linguistic, statistical, and machine learning techniques to structure the information content of textual sources for insight, exploratory data analysis, research, or investigation. Moreover, sentiment analysis is the process of identifying the emotional tone in a sequence of words, which in turn is used to gain an understanding of the attitudes, opinions and emotions expressed in an online mention. This paper applies sentiment analysis across ten categories (eight emotions and two polarities), namely anger, anticipation, disgust, fear, joy, negative, positive, sadness, surprise and trust.

Along with the different techniques of machine learning, sentiment techniques play an essential role in identifying the sentiment of massive volumes of content. Today, several analysis tools exist that perform particular tasks: identifying individuals' excitement about forthcoming movies, relating individuals' emotions towards a political party to positive and negative attitudes, and measuring individual sentiment based on rating criteria, for example to learn the good and bad aspects of hotel or restaurant services and facilities. Because of the massive amount of information shared on social media platforms (blogs, forums, etc.), it is not possible to analyze the data manually.

LITERATURE REVIEW
Social media analysis is gaining attention day by day as a way to learn what individuals feel and think about a product or topic. Generally, sentiment approaches fall into two groups: 1) lexicon-based approaches [15] and 2) machine learning approaches [13]. The lexicon-based approach depends on a corpus (a collection of terms, information or content). Along with the sentiment lexicon method, machine learning techniques use different types of methods such as syntactic, linguistic and hybrid techniques. One study [14] established the polarity of reviews by recognizing the polarity of the adjectives that appear in them. The study reported ten times better accuracy compared with pure machine learning approaches. Unfortunately, these approaches fail when applied to a new domain because they are not flexible with ambiguous sentiment terms: adjectives can change the meaning of a sentence in a sentiment lexicon [16].

In 2011, Zhang et al. [17] applied hybrid techniques to tweets for sentiment analysis. The hybrid technique was a combination of supervised learning and sentiment-bearing terms. The sentiment-bearing terms were fetched from the Whissell [18] dictionary. The tweets were cleaned in multiple stages: tokenization, expansion of abbreviations into their original words, and removal of links. After data cleaning, the tweets were separated into pleasant and unpleasant sentiments using a supervised learning approach with n-grams. In the same year, Jiang et al. [19] classified tweets with an SVM approach into three classes (pleasant, unpleasant and neutral).

In 2009, Go et al. [11] conducted a study on the analysis of tweet polarity. The researchers applied a supervised classification approach to emotions in tweets. Two emotions, happy and sad, were considered as the positive and negative classes. In 2005, Read applied a supervised classification technique to build a corpus of texts based on positive and negative sentiments [10]. The researchers applied different techniques such as SVM, Naïve Bayes and Maximum Entropy and stated that the straightforward use of unigrams gave good results, but that further improvement is possible by applying unigrams and bigrams together. In 2010, Pak and Paroubek built a tweet corpus based on positive and negative emotions, compared different learning approaches, and claimed better results than earlier studies using Naïve Bayes with unigrams and part-of-speech tags [12].

METHODOLOGY
Generic Dataset

The sentiment analysis is conducted on the leading operating systems in the market: iOS, BlackBerry, Windows, Symbian and Android. As many tweets as possible are fetched from Twitter through the sentiment viz tool. The tool itself builds a corpus from the tweets using lexicon methods and compares each token or emotion with the sentiment dictionary to decide whether it carries positive or negative sentiment. Further, to compute the sentiment score, a sentiment lexicon and appropriate techniques are applied. A sentiment lexicon maps words to sentiment scores. Results are presented using graphs. The overall process is divided into four phases:

Phase-1
• Gather tweets from Twitter through the Twitter API (viz tool)
Phase-2
• Perform data normalization on the gathered content
• Perform feature reduction so that sentiment analysis techniques can be applied
Phase-3
• Conduct sentiment analysis
• Polarity classification
• Generate visualization of results
Phase-4
• Determine sentiment score based on emotions
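
The normalization and lexicon lookup described under METHODOLOGY can be illustrated with a minimal sketch. It is not the implementation used by the sentiment viz tool: the lexicon entries, the abbreviation table, the signed score scale and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical miniature lexicon: word -> sentiment score
# (positive above zero, negative below zero). Not the tool's dictionary.
LEXICON = {"love": 2.0, "great": 1.5, "good": 1.0,
           "slow": -1.0, "bad": -1.5, "hate": -2.0}

# Hypothetical abbreviation table used during normalization.
ABBREVIATIONS = {"gr8": "great", "luv": "love"}

URL_PATTERN = re.compile(r"https?://\S+")
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def normalize(tweet):
    """Lower-case the tweet, remove links, tokenize, expand abbreviations."""
    text = URL_PATTERN.sub(" ", tweet.lower())
    tokens = TOKEN_PATTERN.findall(text)
    return [ABBREVIATIONS.get(token, token) for token in tokens]


def sentiment_score(tokens):
    """Mean lexicon score of the tokens found in the lexicon (0.0 if none)."""
    hits = [LEXICON[token] for token in tokens if token in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def polarity(score):
    """Map a signed score to a polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["I luv my #iPhone, gr8 camera! http://t.co/abc",
                  "Battery is bad and slow"]:
        tokens = normalize(tweet)
        print(tokens, polarity(sentiment_score(tokens)))
```

Averaging over matched tokens only keeps long tweets with few sentiment words from being pulled towards neutral; other aggregation choices are equally possible.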
Phase 1: information gathering
In Phase 1, the tweets are gathered through sentiment viz. Thousands of tweets covering one year are gathered to perform the sentiment analysis and to improve the accuracy of the results.
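
As a rough illustration of the selection step in Phase 1, the sketch below filters already-downloaded tweet records by hashtag and by a one-year window. The record layout (`text`, `created_at`), the hashtag sets and the function name are assumptions made for this example; the study's actual hashtags and the tool's internal format are not given here.

```python
from datetime import datetime, timedelta

# Hypothetical hashtag sets per operating system (illustration only).
OS_HASHTAGS = {
    "iOS": {"#iphone", "#ios"},
    "Android": {"#samsung", "#android"},
}


def select_tweets(tweets, os_name, end, days=365):
    """Keep tweets from the `days` before `end` that carry an OS hashtag."""
    start = end - timedelta(days=days)
    wanted = OS_HASHTAGS[os_name]
    selected = []
    for tweet in tweets:
        # Collect hashtags, ignoring case and trailing punctuation.
        tags = {word.lower().rstrip(".,!?;:")
                for word in tweet["text"].split() if word.startswith("#")}
        if start <= tweet["created_at"] <= end and tags & wanted:
            selected.append(tweet)
    return selected


if __name__ == "__main__":
    sample = [
        {"text": "Loving my new #iPhone!", "created_at": datetime(2017, 6, 1)},
        {"text": "#Samsung battery drains fast", "created_at": datetime(2017, 6, 2)},
        {"text": "Old #iOS post", "created_at": datetime(2015, 1, 1)},
    ]
    print(select_tweets(sample, "iOS", end=datetime(2017, 12, 31)))
```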
Compared with Android (Samsung), far fewer individuals lie in the unpleasant quadrant. The majority of people lie in the pleasant quadrant.

Figure 10 shows the hashtags that are used for BlackBerry OS. Figure 11 shows the sentiment score for BlackBerry OS. A large number of individuals lie in the pleasant quadrant.
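
The quadrant statements above can be made concrete with a small sketch. It assumes that each tweet has already been given a (pleasure, arousal) pair on the 1-9 scale and that the midpoint 5 separates the quadrants of Russell's model; the midpoint, the quadrant labels and the function names are assumptions for illustration, not details taken from the tool.

```python
MIDPOINT = 5.0  # assumed centre of the 1-9 rating scale


def quadrant(pleasure, arousal):
    """Name the quadrant of a (pleasure, arousal) pair relative to the midpoint."""
    horizontal = "pleasant" if pleasure >= MIDPOINT else "unpleasant"
    vertical = "active" if arousal >= MIDPOINT else "subdued"
    return f"{horizontal}-{vertical}"


def pleasant_share(ratings):
    """Fraction of (pleasure, arousal) pairs that fall in a pleasant quadrant."""
    if not ratings:
        return 0.0
    pleasant = sum(1 for pleasure, _ in ratings if pleasure >= MIDPOINT)
    return pleasant / len(ratings)


if __name__ == "__main__":
    # Made-up ratings, not data from the study.
    ratings = [(7.2, 6.1), (3.0, 2.5), (6.0, 4.0), (8.0, 5.5)]
    print([quadrant(p, a) for p, a in ratings])
    print(pleasant_share(ratings))
```

Comparing the pleasant share between two operating systems gives a simple numeric counterpart to statements such as "the majority of people lie in the pleasant quadrant".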