13, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3028260
ABSTRACT Determining the polarities of words in a given context has been studied since the inception
of computational linguistics, text mining, and sentiment analysis. Due to its fundamental role in determining
the overall semantic orientation of natural language expressions, it is considered one of the most challenging
issues facing these areas of research. This paper introduces a new implementation of the lexicon-based word
polarity identification method on several customer reviews datasets. Herein, we use a variation of a lexicon-
based word polarity identification method that operates by computing the semantic relatedness between
the context expansion set of the target word and a synonym expansion set comprising the synonyms of
all words surrounding the target word within the original text fragment. The polarity of the target word
is determined as that for which the semantic relatedness between these two meaningful sets is the highest.
Unlike most existing lexicon-based multi-polarity word identification methods, the used method is not based
on estimating pairwise relatedness at term-level, but instead, it is based on measuring semantic relatedness at
the fragment-level. This enables the exploration and capture of a higher degree of semantic and sentimental
information, and is more consistent with people’s understanding through the consideration of the larger
context in which the word appears. Its performance can be further improved by incorporating an initial
step in which the relative negation scope of words in the given text fragment is managed while determining
their sentiment orientation. The implementation results demonstrate that the used variation of the lexicon-
based word polarity identification method performs favourably against compared methods, as evaluated on
numerous benchmark datasets through stand-alone and end-to-end evaluation models.
INDEX TERMS Semantic similarity, sentiment scores, semantic orientation, multi-polarity words, negation
scope.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 8, 2020 179955
K. Abdalgader, A. Al Shibli: Experimental Results on Customer Reviews Using Lexicon-Based Word Polarity Identification Method
approaches: machine learning-based methods [21]–[25] and lexicon-based methods [26]–[32]. Machine learning-based methods use a supervised learning mechanism to train a classifier from a textual collection of training data consisting of a set of manual sentiment labels. These labels indicate the polarity information (i.e., positive, negative, and neutral) and an assigned numeric value that scores how positive or negative a given word is [15], [16]. Once a classifier has been established by predicting the syntactic and semantic features, it can then be used to extract the polarity of the target word in a given textual context. The lexicon-based methods, however, are usually unsupervised and utilize thesauri, dictionaries, or lexicons to determine the polarity score of each word, and they do not require any such training corpus [33]. These methods determine the overall sentiment orientation of a given context by comparing the total of the positive scores of its words with the total of their negative scores. The whole textual context is classified as negative if the summation of the negative scores is higher, and vice versa [26], [27], [34]. In this study, we exclusively focus on lexicon-based methods.
Many approaches to lexicon-based polarity identification rely on predetermined lexicon words that are often expressed subjectively in the specific domain [7], [15], [16]. Naturally, some words change their polarity across different domains or contexts, and most of these approaches do not appropriately identify the actual polarity with which words are being used within the target context [26], [27], [35], [36]. Overlooking the actual polarity of each word according to its context could lead to poor accuracy in the overall performance of the polarity identification process. For example, the word “small” has a negative polarity in the sentence “The train’s seats are very small,” and it could be understood as positive when appearing in a different context, as in “The trolley is small and can easily fit in my luggage.” This occurs even in closely related domains. For example, “unpredictable” is a negative aspect for a car’s steering but a positive one for a car’s shape. This situation is different from polysemy (i.e., a word having multiple meanings), as a word can carry the same meaning across domains while having different polarities. A limitation of these approaches is that they determine the words’ polarities individually without considering the polarity assigned to neighbouring words. In practice, their performance usually suffers from the lack of utilizing the semantic orientation presented in the target context.
Another challenging phenomenon in lexicon-based word polarity identification methods is dealing with words that are affected by either morphological (i.e., prefixes and suffixes) or syntactic (i.e., no, not, rather, etc.) negations [37]–[43]. It is therefore important to predict which words are affected by negation-terms before starting to identify their polarities. The scope of negation can affect a single word that appears directly after a negation-term or may even extend to the whole context. For example, in the sentence “This book is not readable, but it looks nice” the negation scope only affects the word that follows the negation-term. In contrast, the scope of negation in the sentence “We are not waiting for you since a long time” is extended to the end of the sentence; thus, the negation affects the whole sentence. To overcome such negation challenges, some methods [39], [40], [43] invert the polarity of negated words to handle the negations. Furthermore, these methods operate stand-alone, and they do not incorporate the negation handling process while determining the polarity of words. It is therefore evident that predicting the negation scope occurring within the context will improve the overall performance of the word polarity identification task.
Considering these issues, the contributions of this study are as follows. First, we introduce a new implementation of the word polarity identification task on several customer reviews datasets. This implementation uses a new variation of the lexicon-based method that determines the polarity of a target word by calculating the semantic similarity between its WordNet [44] synonyms, gloss, and usage example (referred to as the context expansion), and the synonyms context provided by all synonym-words of all words surrounding the target word within the given text fragment (referred to as the synonyms expansion). This allows it to explore and capture closely related sentimental information to determine the correct polarity of the word. The method also identifies the polarity of the words simultaneously by progressively incorporating these polarity-assigned words in the synonyms expansion context that contains all words surrounding the target word. This again enables our variation of the method to exploit more related semantic and sentiment information. Second, we present a simplified process that improves the performance of polarity identification by handling the relative negation scope of words while identifying their polarities. Finally, this is believed to be the first study evaluating lexicon-based multi-polarity identification using two different evaluation models. Empirical results demonstrate that the used method achieves better performance than the compared lexicon-based methods, as evaluated on different standard customer review datasets.
The remainder of the paper is organized as follows. Section II describes relevant work in the area of lexicon-based sentiment analysis. Section III describes the used word polarity identification method. Section IV presents a walk-through example illustrating how the used method works. Experimental results are presented in Section V, and Section VI concludes the paper.
II. RELATED WORK
Various approaches for lexicon-based word polarity identification have been developed in recent years to analyze emotions, attitudes, and opinions about particular objects or events [30]–[32]. These methods can be generally classified into two groups: corpus-based [47]–[50] and dictionary-based [34], [45], [46], [51], [52]. In corpus-based methods, the polarity of each word is determined based on term-frequency (i.e., word co-occurrence) in the classified and more dynamic seed words sets [26], [27], whereas in the
dictionary-based methods, the polarity of each word is computed by utilizing fixed lexical and lexicon resources such as WordNet [44] and SentiWordNet [52], respectively. Both approaches basically operate by extracting the opinion and emotion targets, known as sentiment attributes, from the unstructured text. Following this, sentimental information is labeled or rank scored to demonstrate how positive, negative, or neutral a word is. These targets are identified based on the lexical/lexicon or corpus resources that have been applied. Finally, the highest value resulting from the summation or average is considered to determine the overall semantic orientation for the given text fragment. However, apart from sentiment scores, the sentential context surrounding a word, such as senses, negation, or intensification, is not exploited in the majority of these approaches.
Most of the lexicon-based methods are inspired by the assumption that word polarity is usually organized in the form of an ordinal scale, where this scale may consist of numerical values (e.g., from 0 for “very negative” to 1 for “very positive”) or annotated labels (e.g., positive, negative, neutral) [26], [27], [34], [46], [51]. The first word polarity identification method that capitalized on this assumption is attributed to Hatzivassiloglou and McKeown [26]. This method determines the polarity of adjective terms by exploiting the pairs of their conjoined linguistic features (and, or, but, etc.) extracted from a free-annotated textual collection. The method adopts the linguistic distributional hypothesis that the conjunction but usually links adjectives that have opposite orientation and the conjunctions and/or link adjectives that have equal orientation to expand the seed document sets. A graph is created to represent the extracted terms (adjectives) as nodes connected by edges that indicate their relationships (e.g., equal-orientation or opposite-orientation). Finally, a non-hierarchical clustering is applied to classify a group of nodes (adjectives) into a Positive class and a Negative class, based on the similarity relationships induced by the edges.
While Hatzivassiloglou and McKeown’s method and many of its variants [27]–[31] are based on the notion of word polarities being identified individually at term-level (i.e., without considering the other parts of speech and the orientation assigned to the neighbouring words in the context), Kanayama and Nasukawa [53], followed by Ding et al. [30], Qiu et al. [54] and Zhang and Liu [55], opine that involving only a single part of speech is just one of many possible polarity identification methods and propose methods based on the idea of coherency (i.e., at clause-level). They deal with the three main concepts of part of speech type, feature context, and dependency relations. Their approaches are motivated by the belief that words in a context must be related in meaning for the text-fragment to be coherent. This means that words that have the same semantic orientation usually occur together in the context. The methods are evaluated by checking their performance at classifying words using precompiled sets of positive and negative words. However, these methods are limited by the fact that they can only identify the polarity of words that appear within some limited distance from each other in the given context (i.e., a fixed-sized window surrounding the target word).
The family of dictionary-based methods [34], [45], [46], [51], [52], [56], which have become immensely popular in recent years, is based on utilizing lexicon resources such as SentiWordNet [52] and WordNet-Affect [57], which are built using WordNet [44] lexical information. They have been developed to annotate their synsets with sentiment scores (positive, negative, and objective) ranging from 0 to 1 for each sentimental class. However, these lexicon resources face the challenge that their synsets are ordinarily organized based on the common senses of the words that are extracted from WordNet. Therefore, incorrect identification of the actual meaning (sense) from the possible senses of the target word would impact the performance of word polarity identification. For example, if the words room, hotel, and unexpected appear near the word small, we can simply recognize that the intended sentiment of small is of negative polarity, and if it appears near the words fit, easily, and bucket, it has a positive polarity. Thereby, the actual polarity of a word needs to be identified in the textual context in which it occurs, and this presents yet more linguistic challenges, such as context limitation, negation, and intensity handling.
The majority of lexicon-based approaches for word polarity identification do not handle the issue of language ambiguity, wherein word senses can carry different polarities depending on the context in which the target words are used. Little work has been done to consider the actual polarity of the word senses in the sentiment analysis task [58]–[65]. The method proposed by Esuli and Sebastiani [45] determines the semantic orientation of word senses (i.e., polarity of subjective words) based on quantitative analysis of the glosses associated with WordNet’s synsets. They manually create a lexicon resource [52] in which each WordNet synset (i.e., word-sense) is assigned three scores demonstrating how positive, negative, or neutral the senses contained in the synset are. The labelling was based on applying a PageRank method [51] for ranking the synsets according to their semantic property. Wiebe and Mihalcea [63] proposed another method that relied on exploring large labelled opinion datasets to annotate WordNet’s synsets as objective or subjective in order to identify the semantic orientation. Interestingly, Rentoumi et al. [66] have tackled the polarity identification issue by disambiguating all target words first, and then associating the sense-assigned words to models of sentimental information using graphical representations. This was done to produce better related contextual and sub-word information [67].
With few exceptions, the typical lexicon-based approaches for identifying word polarity use some functions of lexical relations matching through utilizing the lexicon (dictionary) resources and produce a relative numerical score based only on the original constituent word individually [26], [27], [45], [51], [60], [62], [63], [65]. However, these relationships may not be absolutely real as not all word-senses may be
coherent in meaning. The research described in this paper, however, is inspired by the belief that the performance of such identification could be improved through analysing the original context and then expanding it to simultaneously explore and identify the accurate polarity of the target word. In the following section, a lexicon-based word polarity identification method is modified so that it encompasses a wide range of lexical relations and outperforms the other compared approaches.
III. POLARITY IDENTIFICATION METHOD
In this section, a new variation of the lexicon-based word polarity identification method is presented in detail. Unlike existing methods, which rely on either pairwise word measuring [27], [35], [58]–[60], [62], [63] or contextual overlapping [45], [53], [54], [64], the modified method determines the polarity of a target word by calculating the semantic relatedness between its context expansion and synonyms expansion sets, which are provided by all related WordNet associated semantic information (i.e., synset). This calculation is performed at text-fragment level. The actual polarity of the target word is assigned as the WordNet synset (with its sentimental information retrieved from SentiWordNet) for which the semantic similarity between the context expansion set and the synonyms expansion set is greatest. Significantly, the method also identifies the polarity of the target words simultaneously by progressively incorporating these polarity-assigned words in the synonyms expansion context, so that they can be used to identify the polarity of other words that have not yet had their polarity determined. Performance is further improved by handling the relative negation scope of words while identifying their polarities and other related issues.
A. MODIFIED LEXICON-BASED METHOD
The target text comprising the words whose semantic orientation (polarity) is to be identified is first represented as the set of words of which it consists: $Target\_Text = \{word_i \mid i = 1 \ldots N\}$, where $N$ is the total number of words in the original text fragment (i.e., given text). Now, assume that $word_i$ is the word whose polarity we need to identify. Below, Context Expansion represents the set of WordNet’s associated information (i.e., synonyms, gloss, and usage example) corresponding to the available synsets of $word_i$:
$Context\_Expansion = \{ a_k^{word_i} \mid k = 1 \ldots N_{word_i} \}$
where $N_{word_i}$ is the total number of the possible synsets for $word_i$ in WordNet and $a_k^{word_i}$ is the union set of non-stopwords in the $k$th WordNet associated information of $word_i$ (its synonyms, gloss, and usage example). Here $N_1$, $N_2$ and $N_3$ are the numbers of non-stopwords in WordNet’s synonyms, gloss, and usage example, respectively, and the three sets can be expressed as:
$synonyms_k^{word_i} = \{ word_{1i} \mid i = 1 \ldots N_1 \}$
$gloss_k^{word_i} = \{ word_{2i} \mid i = 1 \ldots N_2 \}$
$example_k^{word_i} = \{ word_{3i} \mid i = 1 \ldots N_3 \}$
Synonyms Expansion is the set containing all available synonym-words of the words in the Target Text set except the target word $word_i$:
$Synonyms\_Expansion = \{ s\_words_s^{word_j} \mid s\_words_s^{word_j} \in word_j\_all\_synonyms,\ \text{and } j \neq i \}$
where $s$ is the synset of the word in WordNet, of which the first one is considered for a word whose polarity has not yet been identified during the identification process. The polarity of the target word $word_i$ is determined as the score of the synset $k$ for which the context expansion set is semantically most related to the synonyms expansion set (see Subsection 3.4). The pseudocode that demonstrates how the proposed method operates is described in Algorithm 1.
It is important to note that all words in both expanded sets (line 7 of the algorithm) are treated with their first WordNet synset, since synsets are usually organized from most frequent to least frequent. This means that considering the first synset is likely (although not guaranteed) to expand the context enough to capture sufficient semantic information between the words being compared. While most WordNet semantic relations do not extend across parts of speech, context expansion improves this lack of connectivity by enabling a greater relational connectivity between words compared with the current approaches, which only consider the constituent words. For example, the semantic connection between the noun container and the adjective fleshy does not exist, although they are usually considered to be topically related. Therefore, we may discover some words that can improve the connectivity between these semantically related words by looking at their synonyms.
The semantic orientation between the context expansion and synonyms expansion sets is calculated after performing a pre-processing step on the WordNet gloss and usage example in the context expansion set. This pre-processing function takes a sequence of words as input and returns their morphological base forms. Since the gloss and usage example contain a description of the word meaning, it is better to utilize as much of their meaning as possible. This requires the removal of stopwords, prefixes, suffixes, and other linguistic features in order to keep words in their original form. However,
represents the semantic similarity between $word_i$ and $word_j$, and can be found in the $i$th row (synonyms expansion set) and $j$th column (context expansion set). Consequently, the size of the matrix is dynamic and defined by the number of words in the sets. More detailed information on the matrix representation is given in the next section.
D. EXPANDED CONTEXTS SEMANTIC CALCULATION
To calculate the semantic relatedness between the synonyms expansion and context expansion sets as shown in line 7 of Algorithm 1, they are represented in a matrix structure, where the rows list words in the synonyms expansion set and the columns list words in the context expansion set. All words in both sets are organized in the order in which they appear in their original context. Let Synonyms_Expansion and Context_Expansion be the word sets of the two contexts whose semantic similarity is to be computed. Suppose that Synonyms_Expansion is the set that contains all non-stopwords (synonym-words) of the given text-fragment except the target word whose polarity we aim to identify, and Context_Expansion is the union set that encompasses all non-stopwords and morphed (stemmed) words from the associated WordNet synonyms, gloss, and usage examples corresponding to the synset $k$ for the target word:
$Synonyms\_Expansion = \{ word_{1i} \mid i = 1 \ldots N_1,\ i \neq j \}$;
$Context\_Expansion = a_k^{word_{1j}} = \{ word_{2i} \mid j = 1 \ldots N_2 \}$
For each word in the Synonyms_Expansion set, the semantic similarity value is measured at the cross point with all words in the Context_Expansion set. The word-to-word shortest path [68] similarity measure is applied to estimate the semantic similarity between words across the compared sets. Following this, the semantic orientation list is constructed, where each element of this list corresponds to a word in the Synonyms_Expansion set; its size therefore depends on the total number of the remaining words in this set ($N_1$). The value of each element of the list is obtained by picking the highest cross-point similarity score in each column belonging to the word in the Synonyms_Expansion set. Once the semantic orientation list is formed for each possible synset $k$ of the target word, the Context_Expansion set that has the highest similarity score is used to retrieve its sentimental information from SentiWordNet to determine the polarity of the target word. This only determines the polarity of each single word in the given context. The following section presents how the overall polarity (orientation) of the given context can be determined.
E. TOTAL SENTIMENT SCORE DETERMINATION
After performing word polarity identification using the modified method and retrieving the words’ sentiment information from SentiWordNet, a list (i.e., $polarity = \{P_i \mid i = 1 \ldots N\}$) was obtained that comprised two scores (positive and negative) corresponding to each polarity-assigned word. Finally, the overall polarity of the given text was determined by separately aggregating the scores of all polarity-identified words in the list. There were two cases to consider depending on whether the overall sentiment orientation was positive, negative, or neutral:
Case 1: hard classification, where the given text could belong to a single class (positive, negative, or neutral) for which the aggregated sentiment orientation score was highest.
Case 2: soft classification, where the given text could belong to all classes (positive, negative, and neutral) with different degrees of membership. In this case, all the retrieved sentiment orientation scores have been considered.
Therefore, in the hard case the output is the class membership value $polarity_i^c$ with the highest score, which shows the membership of a particular given text in class $c$; i.e., $Max = \{polarity_i^c\}$. In the event that a soft classification was required, this was achieved by assigning a particular given text to all classes with different degrees of membership. This meant that the overall sentiment orientation of the given text could belong to all classes of positive, negative, and neutral at the same time but with different aggregated scores. The experimental testing in Sections IV and V demonstrates these two cases.
F. COMPUTATIONAL COMPLEXITY
The used word polarity identification method utilizes the semantic and sentimental information provided by all words in the given text to identify their polarities. Importantly, this utilization does not lead to a significant increase in the calculations required for running it. To demonstrate, suppose that $T$ is the total number of words in Target_Text, $t$ is the number of words in each Synonyms_Expansion set, $a$ the number of words in each Context_Expansion set, and $k$ the possible number of WordNet synsets for each word in the given context (Target_Text). Each word whose polarity needs to be identified will require a calculation of semantic orientation between each of its Context_Expansion sets and the Synonyms_Expansion set (synonym-words of its surrounding original words). The maximum number of calculations needed to identify the polarity of all words in the given text is therefore $O\left(\sum_{K=0}^{T} K = \mathrm{sum}(k)\right)$. Each of these semantic orientation calculations needs $O(t \cdot a)$ similarity computations between words, which leads to $O\left(\sum_{sum=0}^{K} sum = (t \cdot a)\right)$ as a maximum overall word-to-word similarity computation. For the purpose of clarification, let $T$ be equal to 3 and $K$ equal 5 (i.e., 5, 5, 4, 3, 3 words in each synonyms expansion set and 6, 7, 5, 4, 7 words in the context expansion set for each target word, respectively). The method proposed would therefore require a maximum of $5 \cdot 6 + 5 \cdot 7 + 4 \cdot 5 + 3 \cdot 4 + 3 \cdot 7 = 118$ computations to estimate the semantic similarity between words.
Interestingly, these computational requirements will not usually be more expensive compared to the current approaches that utilize a limited context provided within a size-fixed window. In cases dealing with long textual
contexts, however, it is of course possible to limit the given context by using only the context provided within the determined window of words.
TABLE 1. Word polarity progressively assigned.
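The matrix scoring of Section III-D and the cost count of Section III-F can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_similarity` is a hypothetical stand-in for the WordNet shortest-path measure [68], the synset labels are invented for illustration, and reducing the semantic orientation list to its mean is an assumption, since the text does not state how the list is collapsed into a single score.

```python
from typing import Callable, Dict, List, Tuple

Similarity = Callable[[str, str], float]


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for the WordNet shortest-path measure [68]:
    Jaccard overlap of the two words' character sets, in [0, 1]."""
    chars_a, chars_b = set(a), set(b)
    union = chars_a | chars_b
    return len(chars_a & chars_b) / len(union) if union else 0.0


def relatedness(synonyms_expansion: List[str],
                context_expansion: List[str],
                sim: Similarity = toy_similarity) -> Tuple[float, int]:
    """Score one context expansion set against the synonyms expansion set.

    Rows are synonyms-expansion words, columns are context-expansion words.
    Returns (score, number of word-to-word similarity computations).
    """
    if not synonyms_expansion or not context_expansion:
        return 0.0, 0
    matrix = [[sim(s, c) for c in context_expansion] for s in synonyms_expansion]
    # Semantic orientation list: best cross-point score per synonyms-expansion word.
    orientation = [max(row) for row in matrix]
    # Assumption: the list is reduced to one score by taking its mean.
    score = sum(orientation) / len(orientation)
    return score, len(synonyms_expansion) * len(context_expansion)


def best_synset(synonyms_expansion: List[str],
                context_expansions: Dict[str, List[str]],
                sim: Similarity = toy_similarity) -> Tuple[str, float, int]:
    """Pick the candidate synset whose context expansion is most related."""
    best_key, best_score, comparisons = "", -1.0, 0
    for key, context in context_expansions.items():
        score, n = relatedness(synonyms_expansion, context, sim)
        comparisons += n
        if score > best_score:
            best_key, best_score = key, score
    return best_key, best_score, comparisons


# Illustrative data only; the synset labels and word lists are made up.
surrounding = ["tiny", "little"]
candidates = {"small.a.01": ["little", "minor"], "small.n.01": ["garment", "size"]}
print(best_synset(surrounding, candidates))

# Cost of the clarifying example: one t-by-a matrix per (t, a) pair.
t_sizes = [5, 5, 4, 3, 3]
a_sizes = [6, 7, 5, 4, 7]
print(sum(t * a for t, a in zip(t_sizes, a_sizes)))  # 118
```

Swapping `toy_similarity` for a real WordNet-based measure would leave the control flow unchanged; only the `sim` argument differs.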
B. END-TO-END EXPERIMENTS
A more intuitive appreciation of the modified method results
can be obtained when applying it within two sentence-level
sentiment analysis tasks: hotel review sentiment analysis and
Sohar fort review sentiment analysis.
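Both sentence-level tasks rely on the aggregation step of Section III-E. The following Python sketch shows one way the hard and soft cases could be computed from per-word (positive, negative) scores. It is an illustration under stated assumptions, not the authors' code: the neutral share of a word is taken as 1 - (positive + negative), mirroring SentiWordNet's objective score, and soft memberships are obtained by normalising the three totals so that they sum to 1.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, float]  # (positive, negative) for one polarity-assigned word


def aggregate(word_scores: List[Scores]) -> Dict[str, float]:
    """Sum positive and negative scores separately and derive a neutral total.

    Assumption: each word's neutral share is 1 - (pos + neg).
    """
    pos = sum(p for p, _ in word_scores)
    neg = sum(n for _, n in word_scores)
    neu = sum(max(0.0, 1.0 - p - n) for p, n in word_scores)
    return {"positive": pos, "negative": neg, "neutral": neu}


def hard_classification(word_scores: List[Scores]) -> str:
    """Case 1: the single class with the highest aggregated score."""
    totals = aggregate(word_scores)
    return max(totals, key=totals.get)


def soft_classification(word_scores: List[Scores]) -> Dict[str, float]:
    """Case 2: a degree of membership for every class.

    Assumption: memberships are the aggregated totals normalised to sum to 1.
    """
    totals = aggregate(word_scores)
    denom = sum(totals.values())
    if denom == 0:
        return {label: 0.0 for label in totals}
    return {label: value / denom for label, value in totals.items()}


# Made-up scores for three polarity-assigned words of one sentence.
sentence = [(0.75, 0.0), (0.625, 0.125), (0.5, 0.25)]
print(hard_classification(sentence))
print(soft_classification(sentence))
```

The hard case discards everything but the winning class, whereas the soft case keeps the full distribution, which is useful when a review mixes praise and criticism.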
TABLE 5. Performance (%) on hotels review dataset.
TABLE 6. Sohar fort dataset.
[6] F. Ali, D. Kwak, P. Khan, S. El-Sappagh, A. Ali, S. Ullah, K. H. Kim, and K.-S. Kwak, “Transportation sentiment analysis using word embedding and ontology-based topic modeling,” Knowl.-Based Syst., vol. 174, pp. 27–42, Jun. 2019.
[7] E. Riloff, J. Wiebe, and W. Phillips, “Exploiting subjectivity classification to improve information extraction,” in Proc. 20th Nat. Conf. Artif. Intell., Pittsburgh, PA, USA, 2005, pp. 1106–1111.
[8] A.-M. Popescu and O. Etzioni, “Extracting product features and opinions from reviews,” in Natural Language Processing and Text Mining. London, U.K.: Springer, 2007, pp. 9–28.
[9] H. Takamura, T. Inui, and M. Okumura, “Extracting emotional polarity of words using spin model,” in Proc. 43rd Annu. Meeting Assoc. Comput. Linguistics. Ann Arbor, MI, USA: The Association for Computational Linguistics, 2005, pp. 133–140.
[10] S. Mukund, R. Srihari, and E. Peterson, “An information-extraction system for urdu—A resource-poor language,” ACM Trans. Asian Lang. Inf. Process., vol. 9, no. 4, pp. 1–43, Dec. 2010.
[11] L. Abberley, N. Gould, K. Crockett, and J. Cheng, “Modelling road congestion using ontologies for big data analytics in smart cities,” in Proc. Int. Smart Cities Conf. (ISC), Wuxi, China, Sep. 2017, pp. 1–6.
[12] J. F. F. Pereira, “Social media text processing and semantic analysis for smart cities,” 2017, arXiv:1709.03406. [Online]. Available: https://arxiv.org/abs/1709.03406
[13] C. Musto, G. Semeraro, M. de Gemmis, and P. Lops, “Developing smart cities services through semantic analysis of social streams,” in Proc. 24th Int. Conf. World Wide Web, Florence, Italy, 2015, pp. 18–22.
[14] A. Al Nuaimi, A. Al Shamsi, A. Al Shamsi, and E. Badidi, “Social media analytics for sentiment analysis and event detection in smart cities,” in Proc. 4th Int. Conf. Natural Lang. Comput. (NATL), Apr. 2018, pp. 57–64.
[15] S. Singhal, S. Maheshwari, and M. Meena, “Survey of challenges in sentiment analysis,” in Recent Findings in Intelligent Computing Techniques (Advances in Intelligent Systems and Computing), vol. 709, P. Sa, S. Bakshi, I. Hatzilygeroudis, and M. Sahoo, Eds. Singapore: Springer, 2018.
[16] E. Cambria, “Affective computing and sentiment analysis,” IEEE Intell. Syst., vol. 31, no. 2, pp. 102–107, Mar./Apr. 2016, doi: 10.1109/MIS.2016.31.
[17] S. Behdenna, F. Barigou, and G. Belalem, “Sentiment analysis at document level,” in Smart Trends in Information Technology and Computer Communications (Communications in Computer and Information Science), vol. 628, A. Unal, M. Nayak, D. Mishra, D. Singh, and A. Joshi, Eds. Singapore: Springer, 2016, pp. 159–168.
[18] A. Abdi, S. M. Shamsuddin, S. Hasan, and J. Piran, “Automatic sentiment-oriented summarization of multi-documents using soft computing,” Soft Comput., vol. 23, pp. 10551–10568, Dec. 2018, doi: 10.1007/S00500-018-3653-4.
[19] R. Arulmurugan, K. R. Sabarmathi, and H. Anandakumar, “Classification of sentence level sentiment analysis using cloud machine learning techniques,” Cluster Comput., vol. 22, pp. 1199–1209, Sep. 2017, doi: 10.1007/s10586-017-1200-1.
[20] T. Wu, D. S. Weld, and J. Heer, “Local decision pitfalls in interactive machine learning: An investigation into feature selection in sentiment analysis,” ACM Trans. Comput.-Hum. Interact., vol. 26, no. 4, pp. 1–27, Jul. 2019.
[21] A. Alarifi, A. Tolba, Z. Al-Makhadmeh, and W. Said, “A big data approach to sentiment analysis using greedy feature selection with cat swarm
[26] V. Hatzivassiloglou and K. R. McKeown, “Predicting the semantic orientation of adjectives,” in Proc. 35th Annu. Meeting Assoc. Comput. Linguistics. Madrid, Spain: The Association for Computational Linguistics, 1997, pp. 174–181.
[27] P. D. Turney and M. L. Littman, “Measuring praise and criticism: Inference of semantic orientation from association,” ACM Trans. Inf. Syst., vol. 21, no. 4, pp. 315–346, Oct. 2003.
[28] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, “Lexicon-based methods for sentiment analysis,” Comput. Linguistics, vol. 37, no. 2, pp. 267–307, Jun. 2011.
[29] B. Agarwal and N. Mittal, “Semantic orientation-based approach for sentiment analysis,” in Prominent Feature Extraction for Sentiment Analysis. Cham, Switzerland: Springer, 2016, pp. 77–88.
[30] A. Nawaz, S. Asghar, and S. H. A. Naqvi, “A segregational approach for determining aspect sentiments in social media analysis,” J. Supercomput., vol. 75, no. 5, pp. 2584–2602, May 2019, doi: 10.1007/s11227-018-2664-3.
[31] S. Bandari and V. V. Bulusu, “Survey on ontology-based sentiment analysis of customer reviews for products and services,” in Data Engineering and Communication Technology (Advances in Intelligent Systems and Computing), vol. 1079, K. Raju, R. Senkerik, S. Lanka, and V. Rajagopal, Eds. Singapore: Springer, 2020.
[32] M. E. Mowlaei, M. S. Abadeh, and H. Keshavarz, “Aspect-based sentiment analysis using adaptive aspect-based lexicons,” Expert Syst. Appl., vol. 148, Jun. 2020, Art. no. 113234.
[33] C. Hung and H.-K. Lin, “Using objective words in SentiWordNet to improve word-of-mouth sentiment classification,” IEEE Intell. Syst., vol. 28, no. 2, pp. 47–54, Mar. 2013.
[34] L. Gatti, M. Guerini, and M. Turchi, “SentiWords: Deriving a high precision and high coverage lexicon for sentiment analysis,” IEEE Trans. Affect. Comput., vol. 7, no. 4, pp. 409–421, Oct. 2016.
[35] C. Song, X.-K. Wang, P.-F. Cheng, J.-Q. Wang, and L. Li, “SACPC: A framework based on probabilistic linguistic terms for short text sentiment analysis,” Knowl.-Based Syst., vol. 194, Apr. 2020, Art. no. 105572.
[36] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and applications: A survey,” Ain Shams Eng. J., vol. 5, no. 4, pp. 1093–1113, Dec. 2014.
[37] K. Ravi and V. Ravi, “A survey on opinion mining and sentiment analysis: Tasks, approaches and applications,” Knowl.-Based Syst., vol. 89, pp. 14–46, Nov. 2015.
[38] M. Giatsoglou, M. G. Vozalis, K. Diamantaras, A. Vakali, G. Sarigiannidis, and K. C. Chatzisavvas, “Sentiment analysis leveraging emotions and word embeddings,” Expert Syst. Appl., vol. 69, pp. 214–224, Mar. 2017.
[39] A. Hogenboom, P. van Iterson, B. Heerschop, F. Frasincar, and U. Kaymak, “Determining negation scope and strength in sentiment analysis,” in Proc. IEEE Int. Conf. Syst., Man, Cybern., Oct. 2011, pp. 2589–2594.
[40] U. Farooq, H. Mansoor, A. Nongaillard, Y. Ouzrout, and M. A. Qadir, “Negation handling in sentiment analysis at sentence level,” J. Comput., vol. 12, no. 5, pp. 470–478, 2017.
[41] D. Gautam, N. Maharjan, R. Banjade, L. J. Tamang, and V. Rus, “Long short term memory based models for negation handling in tutorial dialogues,” in Proc. 31st Int. Flairs Conf. (FLAIRS), 2018, pp. 1–6.
[42] R. Banjade, N. B. Niraula, and V. Rus, “Towards detecting intra- and inter-sentential negation scope and focus in dialogue,” in Proc. FLAIRS Conf., 2016, pp. 198–203.
[43] J. Barnes, E. Velldal, and L. Øvrelid, “Improving sentiment analysis with
optimization-based long short-term memory neural networks,’’ J. Super- multi-task learning of negation,’’ Natural Lang. Eng., vol. 1, no. 1,
comput., vol. 76, no. 6, pp. 4414–4429, Jun. 2020, doi: 10.1007/s11227- pp. 1–25, 2019.
018-2398-2. [44] C. E. Fellbaum, WordNet: An Electronic Lexical Database. Cambridge,
[22] M. T. Al-Sharuee, F. Liu, and M. Pratama, ‘‘Sentiment analysis: Dynamic MA, USA: MIT Press, 1998.
and temporal clustering of product reviews,’’ Appl. Intell., Mar. 2020, doi: [45] A. Esuli and F. Sebastiani, ‘‘Determining the semantic orientation of terms
10.1007/s10489-020-01668-6. through gloss classification,’’ in Proc. 14th ACM Int. Conf. Inf. Knowl.
[23] N. K. Singh, D. S. Tomar, and A. K. Sangaiah, ‘‘Sentiment analysis: Manage. (CIKM), Bremen, Germany, 2005, pp. 617–624.
A review and comparative analysis over social media,’’ J. Ambient Intell. [46] D. T. Santosh, K. S. Babu, S. D. V. Prasad, and A. Vivekananda, ‘‘Opinion
Hum. Comput., vol. 11, no. 1, pp. 97–117, Jan. 2020, doi: 10.1007/s12652- mining of online product reviews from traditional LDA topic clusters using
018-0862-8. feature ontology tree and SentiWordNet,’’ Int. J. Educ. Manage. Eng.,
[24] A. Krouska, C. Troussas, and M. Virvou, ‘‘Comparative evaluation of vol. 6, no. 6, pp. 34–44, 2016.
algorithms for sentiment analysis over social networking services,’’ J. UCS, [47] M. E. Moussa, E. H. Mohamed, and M. H. Haggag, ‘‘A generic lexicon-
vol. 23, no. 8, pp. 755–768, 2017. based framework for sentiment analysis,’’ Int. J. Comput. Appl., vol. 42,
[25] J. Song, K. T. Kim, B. Lee, S. Kim, and H. Y. Youn, ‘‘A novel classifi- pp. 463–473, Jun. 2018, doi: 10.1080/1206212X.2018.1483813.
cation approach based on Naïve Bayes for Twitter sentiment analysis,’’ [48] A. Jurek, M. D. Mulvenna, and Y. Bi, ‘‘Improved lexicon-based senti-
KSII Trans. Internet Inf. Syst., vol. 11, no. 6, pp. 2996–3012, 2017, doi: ment analysis for social media analytics,’’ Secur. Informat., vol. 4, no. 1,
10.3837/tiis.2017.06.011. pp. 1–13, Dec. 2015.
[49] A. Pak and P. Paroubek, ‘‘Twitter as a corpus for sentiment analysis and opinion mining,’’ in Proc. 7th Conf. Int. Lang. Resour. Eval., 2010, pp. 1320–1326.
[50] D. Rice and C. Zorn, ‘‘Corpus-based dictionaries for sentiment analysis of specialized vocabularies,’’ in Proc. New Directions Analyzing Text Data Workshop (NDATAD), Sep. 2013, pp. 1–16.
[51] A. Esuli and F. Sebastiani, ‘‘Pageranking wordnet synsets: An application to opinion mining,’’ in Proc. 45th Annu. Meeting Assoc. Comput. Linguistics (ACL), 2007, pp. 424–431.
[52] A. Esuli and F. Sebastiani, ‘‘SentiWordNet: A publicly available lexical resource for opinion mining,’’ in Proc. 5th Conf. Lang. Resour. Eval. (LREC), vol. 6, 2006, pp. 417–422.
[53] H. Kanayama and T. Nasukawa, ‘‘Fully automatic lexicon expansion for domain-oriented sentiment analysis,’’ in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), Sydney, NSW, Australia, 2006, pp. 22–23.
[54] G. Qiu, B. Liu, J. Bu, and C. Chen, ‘‘Expanding domain sentiment lexicon through double propagation,’’ in Proc. 21st Int. Joint Conf. Artif. Intell., Pasadena, CA, USA, 2009, pp. 1199–1204.
[55] L. Zhang and B. Liu, ‘‘Identifying noun product features that imply opinions,’’ in Proc. Annu. Meeting Assoc. Comput. Linguistics (ACL), Portland, OR, USA, 2011, pp. 575–580.
[56] S. Taj, B. B. Shaikh, and A. F. Meghji, ‘‘Sentiment analysis of news articles: A lexicon based approach,’’ in Proc. 2nd Int. Conf. Comput., Math. Eng. Technol. (iCoMET), Sukkur, Pakistan, Jan. 2019, pp. 1–5.
[57] C. Strapparava and A. Valitutti, ‘‘WordNet-affect: An affective extension of WordNet,’’ in Proc. 4th Int. Conf. Lang. Resour. Eval. (LREC), Lisbon, Portugal, vol. 4, 2004, pp. 1083–1086.
[58] C. Akkaya, J. Wiebe, A. Conrad, and R. Mihalcea, ‘‘Improving the impact of subjectivity word sense disambiguation on contextual opinion analysis,’’ in Proc. CoNLL, 2011, pp. 87–96.
[59] C. Hung and S.-J. Chen, ‘‘Word sense disambiguation based sentiment lexicons for sentiment classification,’’ Knowl.-Based Syst., vol. 110, pp. 224–232, Oct. 2016.
[60] T. Wilson, J. Wiebe, and P. Hoffmann, ‘‘Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis,’’ Comput. Linguistics, vol. 35, no. 3, pp. 339–433, 2009.
[61] M. Marchand, R. Besançon, O. Mesnard, and A. Vilnat, ‘‘Domain adaptation for opinion mining: A study of multipolarity words,’’ J. Lang. Technol. Comput. Linguistics, vol. 29, no. 1, pp. 17–31, 2014.
[62] C. Akkaya, J. Wiebe, and R. Mihalcea, ‘‘Subjectivity word sense disambiguation,’’ in Proc. Conf. Empirical Methods Natural Lang. Process. Singapore: The Association for Computational Linguistics, 2009, pp. 190–199.
[63] J. Wiebe and R. Mihalcea, ‘‘Word sense and subjectivity,’’ in Proc. 21st Int. Conf. Comput. Linguistics, 44th Annu. Meeting Assoc. Comput. Linguistics. Sydney, NSW, Australia: The Association for Computational Linguistics, 2006, pp. 1065–1072.
[64] B. R. Razon and D. L. Salle, ‘‘Word sense disambiguation of opinionated words using extended gloss overlap,’’ in Proc. 8th Nat. Natural Lang. Process. Res. Symp., Manila, Philippines, 2011, pp. 1–5.
[65] M. Tamara, B. Alexandra, and M. Andres, ‘‘Word sense disambiguation in opinion mining: Pros and cons,’’ J. Res. Comput. Sci., vol. 46, pp. 119–130, 2010.
[66] V. Rentoumi, G. Giannakopoulos, V. Karkaletsis, and G. A. Vouros, ‘‘Sentiment analysis of figurative language using a word sense disambiguation approach,’’ in Proc. Int. Conf. RANLP, Borovets, Bulgaria, 2009, pp. 370–375.
[67] K. Abdalgader and A. Skabar, ‘‘Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance,’’ ACM Trans. Speech Lang. Process., vol. 9, no. 1, pp. 1–21, May 2012.
[68] R. Rada, H. Mili, E. Bicknell, and M. Blettner, ‘‘Development and application of a metric on semantic nets,’’ IEEE Trans. Syst., Man, Cybern., vol. 19, no. 1, pp. 17–30, Jan./Feb. 1989.
[69] S. Rosenthal, P. Nakov, S. Kiritchenko, S. Mohammad, A. Ritter, and V. Stoyanov, ‘‘SemEval-2015 task 10: Sentiment analysis in Twitter,’’ in Proc. 9th Int. Workshop Semantic Eval. (SemEval), Denver, CO, USA, 2015, pp. 451–463.
[70] P. Nakov, A. Ritter, S. Rosenthal, F. Sebastiani, and V. Stoyanov, ‘‘SemEval-2016 task 4: Sentiment analysis in Twitter,’’ in Proc. 10th Int. Workshop Semantic Eval. (SemEval), San Diego, CA, USA, 2016, pp. 1–18.
[71] S. Rosenthal, N. Farra, and P. Nakov, ‘‘SemEval-2017 task 4: Sentiment analysis in Twitter,’’ in Proc. 11th Int. Workshop Semantic Eval. (SemEval), Vancouver, BC, Canada, 2017, pp. 502–518.
[72] A. Go, R. Bhayani, and L. Huang, ‘‘Twitter sentiment classification using distant supervision,’’ Stanford, CA, USA, CS224N Project Rep., 2009, pp. 1–12.
[73] M. Hu and B. Liu, ‘‘Mining and summarizing customer reviews,’’ in Proc. ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining (KDD), 2004, pp. 164–168.
[74] Q. Liu, Z. Gao, B. Liu, and Y. Zhang, ‘‘Automated rule selection for aspect extraction in opinion mining,’’ in Proc. 24th Int. Conf. Artif. Intell. (IJCAI), 2015, pp. 1291–1297.
[75] B. Pang and L. Lee, ‘‘A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts,’’ in Proc. Assoc. Comput. Linguistics (ACL), Barcelona, Spain, 2004, pp. 271–278.
[76] J. Blitzer, M. Dredze, and F. Pereira, ‘‘Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification,’’ in Proc. 45th Annu. Meeting Assoc. Comput. Linguistics (ACL), 2007, pp. 440–447.
[77] R. A. Johnson and G. K. Bhattacharyya, Statistics: Principles and Methods. Hoboken, NJ, USA: Wiley, 2014.

KHALED ABDALGADER received the B.Sc. degree in computer science from Sebha University, Libya, in 2001, the M.Sc. degree from University Utara Malaysia, in 2004, and the Ph.D. degree in natural language processing from La Trobe University, Australia, in 2012. He was a Lecturer with the Department of Computer Science, Sebha University, from 2004 to 2007. Since 2013, he has been an Assistant Professor with the Faculty of Computing and Information Technology, Sohar University, Oman. His research interests include natural language processing and understanding, particularly word sense disambiguation, semantic text similarity, text mining, sentiment analysis, and knowledge discovery from textual collections.

AYSHA AL SHIBLI received the B.Sc. degree in information technology and the M.Sc. degree in computer science from Sohar University, Oman, in 2010 and 2016, respectively. From 2010 to 2016, she was a Teaching Assistant with the Faculty of Computing and Information Technology, Sohar University, where she is currently a Lecturer. Her research interests include the fields of artificial intelligence, data mining, and smart systems.