
2015 IEEE International Conference on Systems, Man, and Cybernetics

Sentiment Analysis Over Social Networks: An Overview
Khaled Ahmed
Faculty of Computers and Information, Cairo University, Cairo, Egypt
khaled.elahmed@gmail.com

Neamat El Tazi
Faculty of Computers and Information, Cairo University, Cairo, Egypt
n.eltazi@fci-cu.edu.eg

Ahmad Hany Hossny
Centre for Intelligent Systems Research, Deakin University, Geelong, Australia
ahmad.hossny@deakin.edu.au

978-1-4799-8697-2/15 $31.00 © 2015 IEEE
DOI 10.1109/SMC.2015.380

Abstract: The rapid increase in data on social media creates a need for mining such data to get valuable insights. The data can be unstructured and come in large volumes. Sentiment analysis addresses this need by detecting opinions or emotions in social media text. Sentiment analysis can be performed in various domains such as social, medical and industrial applications. This paper presents a survey of sentiment analysis, addressing the different concepts in this area, its problems and their solutions, the available APIs and tools, and a list of open challenges in this area.

Index Terms: Social Media, Sentiment Analysis, Feature Selection, Recommendation, Spam Detection, Sentiment Lexicons and Emotion Detection

I. INTRODUCTION

Most of the data that exists in social networks is unstructured [1]. Such unstructured data makes up approximately 80% of the data all over the world, which makes it difficult to analyze and to gain valuable insights from it. Sentiment analysis and opinion mining are two important techniques that help in detecting emotions and opinions in social media data. They can help in solving many problems and provide indicators for elections, public opinion, advertisement, health care and public satisfaction.

Discovering hidden patterns by applying analysis and data mining techniques to data can reveal many indicators [2], [3], [4], [5], [29]. Sentiment analysis helps in bioinformatics, for example in cancer detection, as well as in predicting future stock market trends by analyzing and mining social media posts. Applying mining techniques and sentiment analysis to unstructured data is considered a big challenge in the sentiment analysis research area.

Sentiment analysis can be applied at four levels: sentence, document, aspect and user level. It can be performed using machine learning (clustering or classification), lexicon, NLP, ontology or hybrid techniques. Several methods enhance sentiment analysis results, such as feature selection, data integration, data cleaning and crowdsourcing.

Feature selection is used for choosing suitable features from text that enhance sentiment analysis results. Multiple feature selection techniques exist [6], which are applied for choosing the best features for better sentiment results. Crowdsourcing [7], which uses the experience of the public crowd in labeling, is another technique for enhancing sentiment analysis results, both through crowd labeling and through feedback given on sentiment classes. Data cleaning [8] and data integration [9], [10] also serve as enhancement techniques towards better sentiment analysis results.

Spam or fake sentiment detection in reviews or posts is an important application of sentiment analysis [11], [8]. Sentiment analysis can also be used to define trust over a social network for a brand or a service [12], or to build recommendation systems [13], [4] that recommend a service, a place or a product to a user.

Fig. 1. The different phases of sentiment mining, including the steps to build the model and the steps to use it later in medical and social applications.

The rest of the paper is organized as follows: Section II describes sentiment analysis levels. Section III presents the
state of the art sentiment analysis techniques. Section IV describes sentiment analysis enhancement techniques. Section V presents recent applications and models. Section VI presents a list of available sentiment analysis tools, APIs and lexicons. Section VII presents a list of accuracy measures used for evaluating the sentiment analysis techniques. Section VIII presents open challenges in this research area, and the conclusion of this paper is presented in Section IX.

II. SENTIMENT ANALYSIS LEVELS

Applying sentiment analysis over big data [14] leads to a lot of insights and business benefits. Sentiment analysis, opinion mining or emotion detection is the process of extracting sentiment from text, and is commonly applied to online unstructured text such as micro-blog data and social media data streams [15].

Sentiment analysis can be applied on four different levels [16]. The first level is the sentence level, which detects positive, negative or neutral sentiment for each sentence. The second level is the document level, which detects the sentiment of the whole document as one unit or entity: positive, negative or neutral. The third level is the aspect level, which is used when attributes are available inside the entity, post or input text. Each attribute can hold a sentiment of its own. For example, a customer review of a mobile phone has the attributes battery life, screen light and others, and each attribute can have a different sentiment.

Consider the following example on the sentence level: "happy to meet you" is considered a positive sentence, while "My phone is very interesting but need enhancement in some issues" is considered positive on the document level if the whole text is considered as one entity.

The aspect level can lead to better analysis and results if taken into consideration. Consider the following example on the aspect level: "My phone is really nice but I have a bad battery. It contains slow applications but I am happy with its screen." The aspect here is the phone, while the attributes are battery, applications and screen. Sentiment detection can lead to the following results: (battery, negative), (applications, negative), (screen, positive).

Some sentiment analysis techniques apply grouping on the aspect level, where all attributes having the same sentiment result are grouped together. Grouping the previous example leads to the following result: (battery and applications, negative) and (screen, positive).

The fourth level is the user level, which handles the social relationships between different users using graph theory [16]. Consider the following example: A is a user who has a friend B connected to him. User B is always mentioned in user A's posts and always gains likes and shares from user A. User A might have the same opinion or sentiment as user B. This can be the result of the influence of user B on user A and of how much user B can affect user A's opinion. The user level takes such influence into consideration.

III. SENTIMENT ANALYSIS METHODS

A. Sentiment Analysis Lexicons

Sentiment analysis is the process of defining positive, negative or neutral feeling through text [6]. There are three solutions for defining the text sentiment. The first is to label text manually, which takes a lot of effort. The second is using an NLP, lexicon or machine learning solution. The third is hybrid, which uses human experts or crowdsourcing to give feedback on sentiment analysis results or to label training data sets.

There are two types of lexicons [17]. The first type is the corpus lexicon, which is divided into two types (semantic oriented and statistical oriented). A corpus lexicon such as SenticNet [12] can achieve more accurate sentiment results because it is oriented towards context rather than towards word similarity. An example of semantic oriented lexicons can be found in [18], where the authors dealt with the meaning of words based on a concept net lexicon. On the other hand, in [19] the authors presented a statistical method for defining sentiments.

The second type of lexicons is dictionary based. In [20], two dictionaries were presented. The first is a word dictionary in which words are associated with human emotions. The second is a topic modeling or topic oriented dictionary, which is helpful in aspect sentiment analysis. Some researchers [21] tackle dictionary based lexicons by integrating existing seeds or dictionaries to build more valuable multi-domain dictionaries.

B. Sentiment Analysis and NLP

Using natural language processing (NLP), one can achieve accurate sentiment results by resolving the challenges of word context and of the implicit or indirect meaning of words. The Stanford Sentiment Treebank [22] provides a solution for these challenges on the sentence level. The authors in [23] showed that using NLP techniques such as lemmatization, n-grams (unigrams, bigrams or trigrams), negation handling, valence shifters and stemming as a preprocessing phase enhances sentiment results. Hashtags can also help in indicating the polarity or objectivity of tweets or posts [15]. Text analysis can also be used to identify the author through his writing-print, or the stylometry of the tweets [24].

C. Sentiment Analysis and Machine Learning

Machine learning solutions are supervised (using labeled training data), unsupervised (without labeled training data) or semi-supervised (using a mix of labeled and unlabeled data). Supervised learning [6] has different classifiers to handle the classification process based on the training data. Classifiers include, but are not limited to, the decision tree classifier, linear classifiers (support vector machine, neural network), the rule based classifier and probabilistic classifiers (Bayesian network, maximum entropy, naive Bayes). The decision tree classifier presents a hierarchical division of the training data based on a condition.

The support vector machine (SVM) linear classifier linearly separates the training data based on the maximum margin and the lowest generalization error in the classification process. The linear equation is:

$$ Y = BX + A \tag{1} $$

where the point $(X, Y)$ has the two-dimensional values $X$ and $Y$, and $A$ is a constant value. Equation 2 is used to check if a point with value $X$ can be classified in a certain class.

$$ W = \sum_{j=0}^{n} \alpha_j y_j x_j \tag{2} $$

The neural network linear classifier iterates over the data until the data is classified. The result of each iteration is taken as feedback to the next iteration to return a better classification with the smallest error values.

The naive Bayes probabilistic classifier calculates the word distribution in a document and uses it to forecast the suitable class or label for a feature or word. This classifier is based on the assumption of independent features, as presented in Equation 3.

$$ P(label \mid features) = \frac{P(label) \, P(features \mid label)}{P(features)} \tag{3} $$

On the other hand, the Bayesian network probabilistic classifier [25] is based on the assumption of dependent features. It is an acyclic graph of nodes with a set of variables and dependency edges, as presented in Equation 4, where $a$, $b$, $c$ and $d$ are features.

$$ P(a, b, c, d) = P(a \mid b, c, d) \, P(b, c, d) \tag{4} $$

The maximum entropy probabilistic classifier [6] encodes the features into a vector space to calculate the weight of each feature for a labeled class, as presented in Equation 5, where $fs$ is the feature set and $d$ is a dot product.

$$ P(features \mid label) = \frac{d(weights, encoded(fs, label))}{\sum_{l \in labels} d(weights, encoded(fs, l))} \tag{5} $$

The rule based classifier [26] is based on a set of rules. Decision trees and sequential algorithms are useful for if-then rules using FOIL pruning (FP) or rule pruning.

On the other hand, unsupervised learning deals with unlabeled data and performs clustering, for example using LDA and the HowNet lexicon [27]. In [28], the authors presented clustering based on document similarity, while in [1] a framework was presented that automates the detection of hotspots in online shared data (online forum data) using K-means and an SVM classifier. An SVM classifier was also used in [9] to define public users' opinion towards products, and naive Bayes and SVM classifiers were used in [29] on data from several medical forums. Meanwhile, semi-supervised learning deals with labeled and unlabeled data, as presented in [6]. Ensemble learning techniques were presented in [30] to resolve the language ambiguity problem and produce more accurate polarity prediction with a combination of classifiers.

IV. SENTIMENT ANALYSIS ENHANCEMENT METHODS

A. Sentiment Analysis Data Cleaning

Data cleaning [31], [8] is an important preprocessing phase which enhances sentiment analysis results. Data cleaning operations include tokenizing, stemming and filtering. Data cleaning can be applied in two phases [31]: data transformation and data filtering.

Data transformation [31], [8] operations involve, but are not limited to, removing useless spaces, handling abbreviations and negations, stemming and removing stop words. Data filtering [31] is related to selecting features which are suitable for sentiment analysis.

B. Dimension Reduction

Dimension reduction [28] is the process of reducing high dimensions using one of two methods: feature selection or feature extraction. Feature extraction is a transformative method which applies a transformation to the data to project it into a new feature space with lower dimensions.

Feature selection [28] is the process of selecting features from the original data set based on specific selection criteria, such that the resulting subset has the smallest classification error without losing the meaning of the content.

Feature selection algorithms include chi-square, latent semantic indexing and point-wise mutual information (PMI). The chi-square algorithm is used to define features. Suppose $F(w)$ is the global fraction of documents that contain the word $w$ and $n$ is the number of data source files or documents. $P_j$ is the global fraction of documents that carry class label $j$, while $p_j(w)$ is the conditional probability of class label $j$ for documents that contain the word $w$. $\chi^2$ represents the goodness of fit between a set of values and the expected value, and is used to select the most suitable feature from a set of candidate features to represent the classes. The chi-square equation is presented in Equation 6.

$$ \chi_j^2 = \frac{n \, F(w)^2 \, (p_j(w) - P_j)^2}{F(w) \, (1 - F(w)) \, P_j \, (1 - P_j)} \tag{6} $$

Point-wise mutual information (PMI) can help in defining the mutual information between classes and features. As presented in Equation 7, $M_j(w)$ is the mutual information between class $j$ and word $w$, with $p_j(w)$ and $P_j$ as defined for Equation 6.

$$ M_j(w) = \log\left(\frac{p_j(w)}{P_j}\right) \tag{7} $$

Applying feature frequency (FF), feature presence (FP) and term frequency inverse document frequency (TF-IDF) as in [31] leads to better feature selection. Feature frequency selects the words or features which are most frequent in a class or a document, as presented in Equation 8, where $w$ is a word, $j$ is the class, $W_o$ is the word or feature occurrence and $i$ indexes the features or words.

$$ FF = \max\left(\sum_{i,j} W_o\right) \tag{8} $$
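As a rough illustration of the feature selection scores in Equations 6 to 8, the following Python sketch computes chi-square, PMI and feature frequency on a tiny corpus. The corpus `DOCS`, the function names and the handling of degenerate cases (a zero denominator or a zero probability) are assumptions made for this example only; they are not taken from the surveyed works. Feature frequency is read here as the most frequent word within a class.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (tokens, class label) pairs, for illustration only.
DOCS = [
    (["good", "battery"], "pos"),
    (["good", "screen"], "pos"),
    (["bad", "battery"], "neg"),
    (["slow", "bad", "apps"], "neg"),
]


def fractions(word, label, docs=DOCS):
    """Return n, F(w), P_j and p_j(w) in the notation of Equation 6."""
    n = len(docs)
    labels_with_word = [lab for tokens, lab in docs if word in tokens]
    f_w = len(labels_with_word) / n                  # F(w): share of documents containing w
    p_j = sum(lab == label for _, lab in docs) / n   # P_j: share of documents labeled j
    if labels_with_word:
        # p_j(w): share of the documents containing w that are labeled j
        p_j_w = sum(lab == label for lab in labels_with_word) / len(labels_with_word)
    else:
        p_j_w = 0.0
    return n, f_w, p_j, p_j_w


def chi_square(word, label, docs=DOCS):
    """Chi-square score of a word for a class (Equation 6)."""
    n, f_w, p_j, p_j_w = fractions(word, label, docs)
    denominator = f_w * (1 - f_w) * p_j * (1 - p_j)
    if denominator == 0:
        return 0.0  # assumption: degenerate cases are scored as 0
    return n * f_w ** 2 * (p_j_w - p_j) ** 2 / denominator


def pmi(word, label, docs=DOCS):
    """Point-wise mutual information between a word and a class (Equation 7)."""
    _, _, p_j, p_j_w = fractions(word, label, docs)
    if p_j == 0 or p_j_w == 0:
        return float("-inf")  # assumption: log of zero is reported as -inf
    return math.log(p_j_w / p_j)


def feature_frequency(label, docs=DOCS):
    """Most frequent word in a class and its occurrence count (Equation 8)."""
    counts = Counter(tok for tokens, lab in docs if lab == label for tok in tokens)
    return counts.most_common(1)[0]
```

On this toy corpus, "good" appears only in positive documents, so it receives a high chi-square score and a positive PMI for the class "pos", while "battery" appears equally in both classes and scores zero on both measures.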

Feature presence, on the other hand, is concerned with the presence or absence of a feature inside a document. Term frequency inverse document frequency (TF-IDF) merges the two concepts of term frequency and inverse document frequency. It presents a composite weight for each feature or word within a document, as shown in Equation 9, where $N$ is the number of documents, $DF$ is the document frequency, which is the number of documents that contain the feature, and $FF$ is the feature frequency.

$$ TF\text{-}IDF = FF \cdot \log(N / DF) \tag{9} $$

C. Sentiment Analysis and Data Integration

Data from different sentiment lexicons is integrated for sentiment analysis classification. This integration is performed by combining, filtering and deleting the duplicated data from individual dictionaries. Available dictionaries include AFINN, General Inquirer, Micro-WNOp, Opinion Lexicon, SenticNet, SentiSense, SentiWordNet, SO-CAL, Subjectivity Lexicon and WordNet-Affect [9].

Using data-driven or data integration lexicons builds a high quality sentiment lexicon, which enhances the sentiment detection results. Other approaches [10] involve integrating user reviews with the users' profiles by presenting the integration in a multi-dimensional model for opinion mining. A similar approach, EMOTube [10], integrated YouTube movie reviews using mashups for a better classification of reviews. Moreover, a semantic integration [32] was performed on different data sources from different social and medical domains as a way to enhance the prediction of diseases.

D. Sentiment Analysis and Crowdsourcing

Crowdsourcing is the science of resolving a problem or task with the help of the crowd [5]. Crowdsourcing can help in providing more accurate sentiment analysis results. The crowd can help in assigning labels to a training data set or in giving feedback about sentiment classification results, which can enhance the prediction and classification models.

Crowdsourcing has been used extensively in enhancing sentiment analysis results [5], [7]. It was used in [5] to predict and measure depression over social media through Twitter tweets using an SVM classifier. In [7], the authors showed that the use of crowdsourcing resulted in more accurate sentiment detection for topic and sentiment classification in social media data.

E. Sentiment Analysis and Ontologies

Ontology based sentiment analysis systems produce more efficient classifiers and present a more detailed analysis of the results. Ontologies were used in building a visual ontology system called Sentibank [33], [14] over the images attached to posts. This helped in detecting emotions in the images. The same model was also trained to predict emotions in new images. Another approach [34] presented a fuzzy ontology model for opinion mining based on HowNet. This approach was built over micro-blogging data where the emotions were divided into hate, surprise, anxiety, sorrow, anger, expectation, joy and love. In [33], the authors built a sentiment analysis ontology framework using OpenDover to define aspects that are related to tweet topics. Ontologies [35] proved to enhance the results, as they provide the semantics rather than only syntactic matching.

F. Sentiment Analysis and Spam Detection

Fake and spam sentiments on social media lead to inaccurate sentiment detection results. In [36], [11] the authors extracted geographic user characteristics and content-based features of tweets to discover spam sentiment within text. Characteristics that define fake sentiment include, but are not limited to, the speed of publishing a tweet or post, the tweet or post location, a fake writing identity, the number of mentions, the number of hashtags, emotions, URLs in the tweet or post, and the number of tweets or posts per day. Building user profiles [36] over social networks or defining online identity can help in detecting fake or spam sentiment content and fake or spam users.

G. Sentiment Analysis and User Profiling

Online user identity and user profiling help in making sentiment analysis results more accurate, as polarity is measured based on the user profile in addition to the post polarity. In [37], the authors showed that there is a strong relationship between online user identity and the users' contribution in blogs. They divided the online users into classes based on their online features (such as kindness, social skills and creativity). Others presented profiling models [37], [38] to predict either the political interests of users or their data publishing interests [38].

H. Sentiment Analysis and Text Summarization

Sentiment summarization is the process of summarizing sentiment according to a specific domain or topic, also called target based summarization. A hybrid system for target or aspect oriented summarization was presented in [38], where the authors defined features of the aspect (product or service) and applied sentiment analysis to classify the available sentiments. Another attempt at summarization was carried out in [17], where Arabic tweets were summarized to generate specific topics so that not all the tweets have to be read.

V. SENTIMENT ANALYSIS APPLICATIONS

A. Social Applications

Sentiment analysis has been used extensively in social applications. Some existing applications include monitoring violence by detecting violence polarity in tweets [30], predicting election results and public attention [39], determining satisfaction with places and recommending those places accordingly [13], and monitoring and tracking students' opinions in education [2]. Another application of sentiment analysis is to enhance machine translation quality by detecting the implicit emotion of the text that may change the meaning, such as sarcasm [40].

B. Medical and Health Applications

Applying sentiment analysis on medical data can help in defining and predicting suicide rates and depression rates [5], monitoring and tracking healthy and unhealthy areas according to tweets, as well as ranking doctors according to patients' satisfaction (from posts) and experience levels [41].

C. Industrial Applications

In industry, sentiment analysis has been used in brand monitoring [42], stock market prediction [43], predicting box office results according to users' tweets [3] and measuring user satisfaction levels [44].

VI. TOOLS, LEXICONS AND APIS

A. Tools and APIs

Available sentiment analysis tools include, but are not limited to, Weka [23], R [4], NLTK [13], QDA Miner [4], iFeel [45] and OntoGen [33].

B. Lexicons

Commonly used lexicons [9] are AFINN, General Inquirer, Micro-WNOp, Opinion Lexicon, SenticNet, SentiSense, SentiWordNet, SO-CAL, WordNet-Affect, NRC Emotion, NRC Hashtag, Sentiment140, SentiStrength, Liu and OpinionFinder.

VII. SENTIMENT ANALYSIS EVALUATION

Sentiment analysis results are evaluated according to several measures. As presented in Table I, a correct classification of positive data is named A, an incorrect classification of negative data is named B, an incorrect classification of positive data is named C and a correct classification of negative data is named D. Using these notations, one can calculate the different measures that can be used to evaluate the results: Recall, Precision, F-measure and Accuracy, as presented in Equations 10 to 13.

TABLE I
ACCURACY MEASURES FACTORS

                     Positive data   Negative data
Predicted positive   A               B
Predicted negative   C               D

$$ Recall = \frac{A}{A + C} \tag{10} $$

$$ Precision = \frac{A}{A + B} \tag{11} $$

$$ F\text{-}Measure = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision} \tag{12} $$

$$ Accuracy = \frac{A + D}{A + B + C + D} \tag{13} $$

Another measure is the correlation coefficient, which is used to measure the similarity of the predicted value to the original one, as indicated in Figure 2.

Fig. 2. Correlation coefficient degree

Moreover, the relative error and relative error percentage measures are used to measure the error or error rate value. These errors can exist in any classification, clustering or prediction process. Equations 14 and 15 calculate both measures, considering the actual correct value as $A$ and the predicted value as $P$.

$$ Relative\ error = \frac{|P - A|}{A} \tag{14} $$

$$ Relative\ error\ percentage = \frac{|P - A|}{A} \cdot 100 \tag{15} $$

VIII. OPEN PROBLEMS AND RESEARCH GAPS

Sentiment analysis is still a hot topic with several open challenges and research gaps. These challenges include building multilingual classifiers, building a common user profile by integrating the same user's data from different social media applications, and enhancing the Stanford Treebank by adding the ability to be applied at the aspect level or document level instead of the sentence level. Further challenges include handling implicit word meaning and indirect text, building domain independent lexicons or classifiers, and building real time sentiment analysis systems which can dynamically capture new data and enhance results according to feedback. Moreover, one can also investigate multi-labeling and clustering, using unsupervised dynamic clustering and multi-label feature selection to enhance the sentiment analysis results.

IX. CONCLUSION AND FUTURE WORKS

Sentiment analysis, opinion mining or emotion detection is the process of defining feeling or emotion through text. Sentiment analysis is a very important process, as it provides many valuable indicators in different domains such as the medical, social and industrial domains. This survey presented sentiment analysis levels, techniques, enhancement methods, applications, a list of APIs, lexicons and tools, and existing research gaps. Future work will consider comparing the state of the art techniques presented in this paper using the same data set across all the different techniques, to be able to evaluate the best techniques used.
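As a closing illustration of the evaluation measures listed in Section VII, the following Python sketch computes them from the four counts of Table I. It assumes the table is read as A = true positives, B = false positives, C = false negatives and D = true negatives, so that recall is A / (A + C) and precision is A / (A + B). It also assumes that the percentage form of the relative error multiplies by 100 and that the actual value is positive. The function names and the sample counts are hypothetical.

```python
def evaluation_measures(a, b, c, d):
    """Recall, precision, F-measure and accuracy from the Table I counts.

    a: positive data predicted positive, b: negative data predicted positive,
    c: positive data predicted negative, d: negative data predicted negative.
    """
    recall = a / (a + c)
    precision = a / (a + b)
    f_measure = 2 * recall * precision / (recall + precision)
    accuracy = (a + d) / (a + b + c + d)
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


def relative_error(predicted, actual):
    """Relative error |P - A| / A, assuming a positive actual value."""
    return abs(predicted - actual) / actual


def relative_error_percentage(predicted, actual):
    """Relative error expressed as a percentage (assumed factor of 100)."""
    return relative_error(predicted, actual) * 100


# Hypothetical counts: 40 correct positives, 10 false positives,
# 20 missed positives and 30 correct negatives.
measures = evaluation_measures(40, 10, 20, 30)
```

With these sample counts, precision (0.8) is higher than recall (about 0.67), which reflects a classifier that misses some positive items but rarely mislabels negative ones.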

REFERENCES

[1] Y. Ko and J. Seo, “Automatic text categorization by unsupervised learning,” in Proceedings of the 18th conference on Computational linguistics-Volume 1. Association for Computational Linguistics, 2000, pp. 453–459.
[2] A. Ortigosa, J. M. Martín, and R. M. Carro, “Sentiment analysis in facebook and its application to e-learning,” Computers in Human Behavior, vol. 31, pp. 527–541, 2014.
[3] J. Du, H. Xu, and X. Huang, “Box office prediction based on microblog,” Expert Systems with Applications, vol. 41, no. 4, pp. 1680–1689, 2014.
[4] M. M. Mostafa, “More than words: Social networks text mining for consumer brand sentiments,” Expert Systems with Applications, vol. 40, no. 10, pp. 4241–4251, 2013.
[5] M. De Choudhury, M. Gamon, S. Counts, and E. Horvitz, “Predicting depression via social media.” in ICWSM, 2013.
[6] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Engineering Journal, vol. 5, no. 4, pp. 1093–1113, 2014.
[7] R. Machedon, W. Rand, and Y. Joshi, “Automatic crowdsourcing-based classification of marketing messaging on twitter,” in Social Computing (SocialCom), 2013 International Conference on. IEEE, 2013, pp. 975–978.
[8] A. K. Uysal and S. Gunal, “The impact of preprocessing on text classification,” Information Processing & Management, vol. 50, no. 1, pp. 104–112, 2014.
[9] H. Cho, S. Kim, J. Lee, and J.-S. Lee, “Data-driven integration of multiple sentiment dictionaries for lexicon-based sentiment classification of product reviews,” Knowledge-Based Systems, vol. 71, pp. 61–71, 2014.
[10] E. Polymerou, D. Chatzakou, and A. Vakali, “Emotube: A sentiment analysis integrated environment for social web content,” in Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14). ACM, 2014, p. 20.
[11] D. Guo and C. Chen, “Detecting non-personal and spam users on geo-tagged twitter network,” Transactions in GIS, vol. 18, no. 3, pp. 370–384, 2014.
[12] E. Cambria, A. Livingstone, and A. Hussain, “The hourglass of emotions,” in Cognitive behavioural systems. Springer, 2012, pp. 144–157.
[13] D. Yang, D. Zhang, Z. Yu, and Z. Wang, “A sentiment-enhanced personalized location recommendation system,” in Proceedings of the 24th ACM Conference on Hypertext and Social Media. ACM, 2013, pp. 119–128.
[14] D. Borth, R. Ji, T. Chen, T. Breuel, and S.-F. Chang, “Large-scale visual sentiment ontology and detectors using adjective noun pairs,” in Proceedings of the 21st ACM international conference on Multimedia. ACM, 2013, pp. 223–232.
[15] K. Rajan, “Materials informatics,” Materials Today, vol. 15, no. 11, pp. 470 –, 2012. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1369702112702043
[16] C. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, and P. Li, “User-level sentiment analysis incorporating social networks,” in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011, pp. 1397–1405.
[17] N. El-Fishawy, A. Hamouda, G. M. Attiya, and M. Atef, “Arabic summarization in twitter social network,” Ain Shams Engineering Journal, vol. 5, no. 2, pp. 411–420, 2014.
[18] H. Saif, M. Fernandez, Y. He, and H. Alani, “Senticircles for contextual and conceptual semantic sentiment analysis of twitter,” in The Semantic Web: Trends and Challenges. Springer, 2014, pp. 83–98.
[19] A. Hogenboom, F. Boon, and F. Frasincar, “A statistical approach to star rating classification of sentiment,” in Management Intelligent Systems. Springer, 2012, pp. 251–260.
[20] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-based methods for sentiment analysis,” Computational linguistics, vol. 37, no. 2, pp. 267–307, 2011.
[21] A. C.-R. Tsai, C.-E. Wu, R. T.-H. Tsai, and J. Y.-j. Hsu, “Building a concept-level sentiment dictionary based on commonsense knowledge,” IEEE Intelligent Systems, no. 2, pp. 22–30, 2013.
[22] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, and C. Potts, “Recursive deep models for semantic compositionality over a sentiment treebank,” in Proceedings of the conference on empirical methods in natural language processing (EMNLP), vol. 1631. Citeseer, 2013, p. 1642.
[23] A. F. Anta, L. N. Chiroque, P. Morere, and A. Santos, “Sentiment analysis and topic detection of spanish tweets: A comparative study of nlp techniques,” Procesamiento del lenguaje natural, vol. 50, pp. 45–52, 2013.
[24] S. Keretna, A. Hossny, and D. Creighton, “Recognising user identity in twitter social networks via text mining,” in Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on. IEEE, 2013, pp. 3079–3082.
[25] C. C. Aggarwal and C. Zhai, Mining text data. Springer Science & Business Media, 2012.
[26] H. Y. Abu Mansour, “Rule pruning and prediction methods for associative classification approach in data mining,” Ph.D. dissertation, University of Huddersfield, 2012.
[27] F. Xianghua, L. Guo, G. Yanyan, and W. Zhiqiang, “Multi-aspect sentiment analysis for chinese online social reviews based on topic modeling and hownet lexicon,” Knowledge-Based Systems, vol. 37, pp. 186–195, 2013.
[28] G. Chandrashekar and F. Sahin, “A survey on feature selection methods,” Computers & Electrical Engineering, vol. 40, no. 1, pp. 16–28, 2014.
[29] T. Ali, D. Schramm, M. Sokolova, and D. Inkpen, “Can i hear you? sentiment analysis on medical forums,” in Proceedings of the Sixth International Joint Conference on Natural Language Processing. Asian Federation of Natural Language Processing, Nagoya, Japan, 2013, pp. 667–673.
[30] J. Ko, H. Kwon, H. Kim, K. Lee, and M. Choi, “Model for twitter dynamics: Public attention and time series of tweeting,” Physica A: Statistical Mechanics and its Applications, vol. 404, pp. 142–149, 2014.
[31] E. Haddi, X. Liu, and Y. Shi, “The role of text pre-processing in sentiment analysis,” Procedia Computer Science, vol. 17, pp. 26–32, 2013.
[32] X. Ji, “Social data integration and analytics for health intelligence,” Management, 2013.
[33] E. Kontopoulos, C. Berberidis, T. Dergiades, and N. Bassiliades, “Ontology-based sentiment analysis of twitter posts,” Expert systems with applications, vol. 40, no. 10, pp. 4065–4074, 2013.
[34] W. Shi, H. Wang, and S. He, “Sentiment analysis of chinese microblogging based on sentiment ontology: a case study of 7.23 wenzhou train collision,” Connection Science, vol. 25, no. 4, pp. 161–178, 2013.
[35] L.-z. Liu, H. Liu, H.-s. Wang, W. Song, and X.-l. Zhao, “Generating domain-specific affective ontology from chinese reviews for sentiment analysis,” Journal of Shanghai Jiaotong University (Science), vol. 20, pp. 32–37, 2015.
[36] X. Hu, J. Tang, and H. Liu, “Online social spammer detection,” in Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.
[37] H.-W. Kim, J. R. Zheng, and S. Gupta, “Examining knowledge contribution from the perspective of an online identity in blogging communities,” Computers in Human Behavior, vol. 27, no. 5, pp. 1760–1770, 2011.
[38] S.-A. Bahrainian and A. Dengel, “Sentiment analysis and summarization of twitter data,” in Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on. IEEE, 2013, pp. 227–234.
[39] H. Hodson, “Twitter hashtags predict rising tension in egypt,” New Scientist, vol. 219, no. 2931, p. 22, 2013.
[40] A. Hossny, K. Shaalan, and A. Fahmy, “Machine translation model using inductive logic programming,” in Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on. IEEE, 2009, pp. 1–8.
[41] A. López, A. Detz, N. Ratanawongsa, and U. Sarkar, “What patients say about their doctors online: a qualitative content analysis,” Journal of general internal medicine, vol. 27, no. 6, pp. 685–692, 2012.
[42] K. Ikeda, G. Hattori, C. Ono, H. Asoh, and T. Higashino, “Twitter user profiling based on text and community mining for market analysis,” Knowledge-Based Systems, vol. 51, pp. 35–47, 2013.
[43] X. Zhang, H. Fuehres, and P. A. Gloor, “Predicting stock market indicators through twitter i hope it is not as bad as i fear,” Procedia-Social and Behavioral Sciences, vol. 26, pp. 55–62, 2011.
[44] S. O. Orimaye, S. M. Alhashmi, and E.-G. Siew, “Buy it-dont buy it: sentiment classification on amazon reviews using sentence polarity shift,” in PRICAI 2012: Trends in Artificial Intelligence. Springer, 2012, pp. 386–399.
[45] M. Araújo, P. Gonçalves, M. Cha, and F. Benevenuto, “ifeel: A system that compares and combines sentiment analysis methods,” in Proceedings of the companion publication of the 23rd international conference on World wide web companion. International World Wide Web Conferences Steering Committee, 2014, pp. 75–78.

