Deep learning in Arabic Sentiment Analysis: An Overview
Amal Alharbi, Mounira Taileb and Manal Kalkatawi
© The Author(s) 2016
Reprints and permission: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/ToBeAssigned
www.sagepub.com/
SAGE, XX(X):1–10
Abstract
Sentiment analysis has become a highly active area in both academia and industry due to the exponential
increase in reviews and recommendations published online. Several techniques have been proposed to analyze
and classify these reviews and recommendations. Lately, deep neural networks have shown
promising results in sentiment analysis. The growing number of Arab users on the internet, along with the
increasing amount of published Arabic reviews and comments, has encouraged researchers to apply deep learning to
analyze them. This paper is a comprehensive overview of research works that have used deep learning approaches for
Arabic sentiment analysis.
Keywords
Sentiment Analysis, Deep Learning, Arabic Text.
1. Introduction
Sentiment analysis is the area of study concerned with analyzing opinions, emotions, evaluations or attitudes towards an
entity such as an event, product, service, news item, individual, organization or issue. This area is discussed in the literature
under different names, e.g., opinion mining, review mining, sentiment mining and subjectivity analysis [1]. Sentiment
analysis primarily aims at detecting sentiments expressed in text and classifying them as positive (favorable), negative
(unfavorable) or neutral opinions toward a topic or an issue. It is also concerned with sarcasm detection and
emotion analysis. Lately, organizations and companies have shown considerable interest in detecting and examining opinions
within text documents such as web pages, news articles, comments, reviews, and blogs instead of conducting surveys.
Thus, sentiment analysis offers enormous opportunities for applications such as decision making, risk
management, marketing analysis, and rumor detection [2].
This overview focuses on sentiments expressed in the Arabic language because of the growing population of internet users
who use Arabic, estimated at about 5% of worldwide users [3]. Moreover, over the last few years Arabic has been considered
one of the fastest-growing languages on the web. The morphological complexity and lexical ambiguity of Arabic
pose a challenge when working with it. Another difficulty is that Arabic has three varieties: Classical
Arabic (CA), Modern Standard Arabic (MSA), and Dialectal Arabic (DA). In this paper, we review
the use of deep learning approaches to analyze sentiments expressed in Arabic text. The remainder of the
paper is organized as follows: Section 2 gives a concise background on the techniques applied for sentiment analysis,
deep learning and word embedding. Section 3 explores the sentiment analysis task at its different levels and
summarizes the deep learning models that have been applied to Arabic sentiment analysis. Finally,
Section 4 concludes the paper.
Corresponding author:
Mounira Taileb, KSA.
Email: mtaileb@kau.edu.sa
2. Background
Identifying positive and negative opinions is not an easy task; it requires information retrieval, linguistic
knowledge, natural language processing, and a profound comprehension of the textual context [4]. Various techniques
have been used in the literature for sentiment analysis; they can be grouped into three classes: (i)
machine learning-based, (ii) lexicon-based, and (iii) hybrid techniques [5], as shown in Figure 1. Techniques based on
machine learning (ML) use either traditional machine learning algorithms or deep learning algorithms. Traditional
machine learning algorithms can be supervised, such as Naive Bayes, Bayesian Network, Neural Network,
decision tree, and Support Vector Machine (SVM) [6, 7, 8, 9], or unsupervised, such as clustering algorithms like
K-means [10]. Lexicon-based techniques follow two approaches: the dictionary-based approach [11, 12] and the
corpus-based approach, which uses either semantic [13] or statistical methods [14]. Hybrid techniques combine ML
algorithms and lexicon-based methods [15, 16, 17, 18].
vector space which enables deep learning models to map words that have similar semantic properties [20, 26]. All the
words that have a similar meaning are represented by vectors that are close to each other, as shown in Figure 2. Word2Vec
is widely used in the sentiment analysis task as shown in the next section. There are two model architectures of Word2Vec:
Skip-gram (SG) and the Continuous Bag of Words (CBOW). The CBOW model predicts the current (target) word
from its context, i.e., the surrounding words within a predefined window
(preceding and following words), as illustrated in Figure 3(a). The SG architecture does the opposite: given the
target word, it predicts the surrounding words, as shown in Figure 3(b).
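The difference between the two architectures is easiest to see in the training examples each one is built from. The following is a minimal sketch, not the Word2Vec implementation itself: it only enumerates the (input, output) pairs for a toy tokenized sentence. The function names, the example sentence, and the window size are illustrative assumptions.

```python
def context_window(tokens, i, window):
    """Return the words within `window` positions of index i, excluding tokens[i]."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def cbow_examples(tokens, window=2):
    """CBOW: the surrounding words are the input, the target word is the output."""
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


def skipgram_examples(tokens, window=2):
    """Skip-gram: the target word is the input, each surrounding word is an output."""
    return [(tokens[i], context_word)
            for i in range(len(tokens))
            for context_word in context_window(tokens, i, window)]


# Toy sentence (hypothetical); real models are trained on large corpora.
tokens = ["the", "hotel", "was", "very", "clean"]
cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)

for context, target in cbow:
    print("CBOW      ", context, "->", target)
for target, context_word in skipgram:
    print("Skip-gram ", target, "->", context_word)
```

With the same window, CBOW yields one example per position (many context words to one target), whereas Skip-gram yields one example per (target, context word) pair.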
3. Sentiment Analysis
Researchers have considered three main levels of granularity in sentiment analysis: document level, sentence level, and
entity/aspect level [27]. In the following sub-sections these different levels are presented.
Many previous works, however, did not explicitly state the sentiment analysis level they were using.
Since their models were trained or tested on microblogs or short-text datasets, we assume that they are more likely to
perform sentence-level analysis than document-level analysis.
Table 1. Summary of proposed deep learning-based models for Arabic sentiment analysis.
| Ref. | Analysis level | Text representation | Deep learning model | Classifier layer | Dataset | Dialect/MSA | Accuracy |
|------|----------------|---------------------|---------------------|------------------|---------|-------------|----------|
| [36] | Sentence | Bag of words | RAE | Softmax | LDC ATB dataset [50] | Dialect | 74% |
| [38] | Sentence | Word embedding (NLM [51]) | RAE | Softmax | ATB [52]; 8,868 tweets [53]; QALB [54] | Dialect and MSA | Average 80% |
| [39] | Sentence | Word2Vec (SG) | RAE | Softmax | 9000 tweets [55] | Dialect and MSA | 41% |
| [40] | Sentence | Word2Vec (CBOW) | RNTN | Softmax | QALB, about 550,000 comments [54] | Dialect | 80% |
| [42] | Sentence | Word2Vec (SG) | RNTN | Softmax | ASTD [43] | Dialect | 58% |
| [44] | Sentence | Word2Vec | CNN | - | 2026 tweets | Dialect | 90% |
| [45] | Sentence | Word2Vec | CNN | - | 2026 tweets | Dialect and MSA | 92% |
| [46] | Sentence | - | DNN | Softmax | Two datasets: 8635 stock tweets, 7440 football tweets | Dialect | Average 90% |
| [47] | Sentence | Word2Vec (CBOW) | CNN | Sigmoid | Nine different datasets (reviews + tweets) | Dialect and MSA | Average 86% |
| [49] | Sentence | Word2Vec (SG) | Combined LSTM | - | ASTD [43]; ArTwitter [56] | Dialect and MSA | Average 84% |
| [57] | Aspect | Polyglot embeddings [58] | H-LSTM | Softmax | 2,291 reviews [59] | - | 82% |
| [60] | Aspect | Word embedding | CNN | Softmax | 2,291 reviews [59] | - | 82% |
| [61] | Aspect | Word2Vec | RNN | Softmax | 2,291 reviews [59] | - | 87% |
| [62] | Aspect | Word2Vec | CNN + MLP + logistic regression | Softmax | 13,000 tweets | Dialect and MSA | 77% |
| [63] | Aspect | Word2Vec | CRNN + MLP | Argmax | 3300 tweets | Dialect and MSA | 73% |
Ruder et al. [57] addressed the aspect-based sentiment analysis task by developing a hierarchical bidirectional LSTM
model (H-LSTM). Their model consists of stacked bidirectional LSTMs where, at every time step, the outputs of the
bidirectional layers are concatenated and fed as input to the final layer along with the aspect vector. The final layer is
a softmax layer that gives the probability distribution for each sentence. They tested their model on datasets in
different languages, including Arabic, where they achieved an accuracy of 82%. Later the same team developed a
CNN-based model in [60] as a submission to SemEval-2016 for aspect-based sentiment analysis. The CNN layers take
the word embeddings of sentences together with the aspect vector as input. Again they tested their model on different
languages, including Arabic, for which their system achieved an accuracy of 82%.
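The final step of the H-LSTM described above (concatenating the two directional outputs with the aspect vector and applying a softmax layer) can be illustrated with a small sketch. It is not the implementation from [57]: the LSTM outputs are replaced by hand-picked toy vectors, and the weights, dimensions, and three-class label order are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def final_layer(forward_out, backward_out, aspect_vec, weights, bias):
    """Concatenate both directional outputs with the aspect vector,
    apply one linear layer, and return class probabilities."""
    features = forward_out + backward_out + aspect_vec  # list concatenation
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


# Toy stand-ins for the forward/backward LSTM outputs and the aspect vector.
forward_out = [0.2, -0.1]
backward_out = [0.4, 0.3]
aspect_vec = [1.0, 0.0]

# Hypothetical weights: one row per class (positive, negative, neutral).
weights = [
    [0.5, 0.1, -0.3, 0.2, 0.7, 0.0],
    [-0.2, 0.4, 0.1, -0.5, 0.0, 0.3],
    [0.0, 0.0, 0.0, 0.0, 0.1, 0.1],
]
bias = [0.0, 0.0, 0.0]

probs = final_layer(forward_out, backward_out, aspect_vec, weights, bias)
print(probs)  # three probabilities that sum to 1
```

Because the aspect vector is part of the concatenated input, the same sentence representation can yield different probability distributions for different aspects.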
Researchers in [61] addressed aspect-based sentiment analysis for Arabic hotel reviews. Their dataset consisted
of 2,291 Arabic reviews prepared for the 2016 Semantic Evaluation workshop [64]. The dataset was preprocessed using the
AraNLP [65] and MADAMIRA [66] tools, which were also used to extract semantic, syntactic and morphological
features that the authors believed would improve their results. They implemented an RNN using the Deeplearning4j
framework [67]. The network consisted of five hidden layers, and the results show that their
proposed system achieved an accuracy of 87%.
In SemEval-2017 Task 4, two systems that use deep learning were proposed [62, 63] for Arabic sentiment
analysis. In [62], for topic-based message classification, the authors implemented three independent classifiers: a CNN,
a Multilayer Perceptron (MLP), and logistic regression. The final classification of each tweet is determined by voting
among the three classifiers. In [63], the authors proposed a system that combines three Convolutional Recurrent
Neural Networks (CRNNs), each with a different input. The first CRNN takes out-of-domain
embeddings from Arabic Wikipedia, the second takes in-domain embeddings, e.g., from a dataset of tweets,
and the third takes word polarities. The outputs of the three CRNNs are then concatenated and
used as input to an MLP. Experimental results revealed that the former model outperformed the latter. Unfortunately,
neither work incorporates the topic (target) information in its model. Thus, they analyzed the text without considering
the position of the target in the context or the association between the target and its surrounding context.
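The voting step used to combine three independent classifiers, as in [62], can be sketched as a simple majority vote. This is a minimal illustration and not the authors' code: the label names are assumed, and the source does not say how ties are broken, so the tie rule here (fall back to the first classifier's label) is an assumption.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the classifiers' predictions.

    Ties are resolved in favor of the label seen first in `labels`
    (an assumed rule; Counter.most_common keeps first-seen order for ties).
    """
    return Counter(labels).most_common(1)[0][0]


# Hypothetical predictions for one tweet from the three classifiers,
# in the order CNN, MLP, logistic regression.
predictions = ["positive", "negative", "positive"]
print(majority_vote(predictions))
```

With three classifiers and three possible labels, a three-way disagreement is possible, which is why some explicit tie rule is needed.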
A summary of the deep learning models that have been proposed for Arabic sentiment analysis is presented in Table 1.
4. Conclusion
Recently, analyzing sentiments and opinions using deep learning has attracted the attention of many researchers. In this paper, the
models proposed for Arabic sentiment analysis using deep learning are presented. Research
on Arabic sentiment analysis using deep learning is still in its early stages compared with other languages such as
English. With the rapid advance in deep learning research, we expect a significant number of new
models to be proposed for Arabic sentiment analysis using different deep learning algorithms, since several directions can still be explored.
References
[1] Pang B, Lee L, et al. Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval.
2008;2(1–2):1–135.
[2] Nasukawa T, Yi J. Sentiment analysis: Capturing favorability using natural language processing. In: the Proceedings
of the 2nd international conference on Knowledge capture. ACM; 2003. p. 70–77.
[3] El-Masri M, Altrabsheh N, Mansour H. Successes and challenges of Arabic sentiment analysis research: a literature
review. Social Network Analysis and Mining. 2017;7(1):54.
[4] Aydoğan E, Akcayol MA. A comprehensive survey for sentiment analysis tasks using machine learning techniques.
In: INnovations in Intelligent SysTems and Applications (INISTA), 2016 International Symposium on. IEEE; 2016.
p. 1–7.
[5] Singh J, Singh G, Singh R. A review of sentiment analysis techniques for opinionated web text. CSI transactions on
ICT. 2016;4(2-4):241–247.
[6] Mullen T, Collier N. Sentiment analysis using support vector machines with diverse information sources. In:
Proceedings of the 2004 conference on empirical methods in natural language processing; 2004.
[7] Jia L, Yu C, Meng W. The effect of negation on sentiment analysis and retrieval effectiveness. In: Proceedings of the
18th ACM conference on Information and knowledge management. ACM; 2009. p. 1827–1830.
[8] Pak A, Paroubek P. Twitter as a corpus for sentiment analysis and opinion mining. In: LREC. vol. 10; 2010. p.
1320–1326.
[9] Shoukry A, Rafea A. Sentence-level Arabic sentiment analysis. In: Collaboration Technologies and Systems (CTS),
2012 International Conference on. IEEE; 2012. p. 546–550.
[10] Li G, Liu F. Application of a clustering method on sentiment analysis. Journal of Information Science.
2012;38(2):127–139.
[11] Kim SM, Hovy E. Determining the sentiment of opinions. In: the Proceedings of the 20th international conference
on Computational Linguistics. Association for Computational Linguistics; 2004. p. 1367.
[12] Hu M, Liu B. Mining and summarizing customer reviews. In: the Proceedings of the tenth ACM SIGKDD
international conference on Knowledge discovery and data mining. ACM; 2004. p. 168–177.
[13] Maks I, Vossen P. A lexicon model for deep sentiment analysis and opinion mining applications. Decision Support
Systems. 2012;53(4):680–688.
[14] Fahrni A, Klenner M. Old wine or warm beer: Target-specific sentiment analysis of adjectives. In: the Proceedings
of the Symposium on Affective Language in Human and Machine, AISB; 2008. p. 60–63.
[15] Appel O, Chiclana F, Carter J, Fujita H. A hybrid approach to sentiment analysis. IEEE; 2016.
[16] Malandrakis N, Kazemzadeh A, Potamianos A, Narayanan S. SAIL: A hybrid approach to sentiment analysis. In:
Second Joint Conference on Lexical and Computational Semantics (* SEM), Volume 2: Proceedings of the Seventh
International Workshop on Semantic Evaluation (SemEval 2013). vol. 2; 2013. p. 438–442.
[17] El-Halees A, et al. Arabic opinion mining using combined classification approach. 2011.
[18] Aldayel HK, Azmi AM. Arabic tweets sentiment analysis–a hybrid scheme. Journal of Information Science.
2016;42(6):782–797.
[19] Collobert R, Weston J, Bottou L, Karlen M, Kavukcuoglu K, Kuksa P. Natural language processing (almost) from
scratch. Journal of Machine Learning Research. 2011;12(Aug):2493–2537.
[20] Shirani-Mehr H. Applications of deep learning to sentiment analysis of movie reviews. In: Technical Report. Stanford
University; 2014.
[21] Schmidhuber J. Deep learning in neural networks: An overview. Neural networks. 2015;61:85–117.
[22] Rojas-Barahona LM. Deep learning for sentiment analysis. Language and Linguistics Compass. 2016;10(12):701–
719.
[23] Huang EH, Socher R, Manning CD, Ng AY. Improving word representations via global context and multiple word
prototypes. In: the Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long
Papers-Volume 1. Association for Computational Linguistics; 2012. p. 873–882.
[24] Pennington J, Socher R, Manning C. Glove: Global vectors for word representation. In: the Proceedings of the 2014
conference on empirical methods in natural language processing (EMNLP); 2014. p. 1532–1543.
[25] Mnih A, Kavukcuoglu K. Learning word embeddings efficiently with noise-contrastive estimation. In: the
Proceedings of Advances in neural information processing systems; 2013. p. 2265–2273.
[26] Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their
compositionality. In: the Proceedings of Advances in neural information processing systems; 2013. p. 3111–3119.
[27] Liu B. Sentiment analysis and opinion mining. Synthesis lectures on human language technologies. 2012;5(1):1–167.
[28] Pang B, Lee L, Vaithyanathan S. Thumbs up?: sentiment classification using machine learning techniques. In: the
Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10. Association
for Computational Linguistics; 2002. p. 79–86.
[29] Zhai S, Zhang ZM. Semisupervised Autoencoder for Sentiment Analysis. In: the Proceedings of AAAI; 2016. p.
1394–1400.
[30] Johnson R, Zhang T. Effective use of word order for text categorization with convolutional neural networks. arXiv
preprint arXiv:1412.1058. 2014.
[31] Tang D, Qin B, Liu T. Document modeling with gated recurrent neural network for sentiment classification. In: the
Proceedings of the 2015 conference on empirical methods in natural language processing; 2015. p. 1422–1432.
[32] Dou ZY. Capturing user and product Information for document level sentiment analysis with deep memory network.
In: the Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing; 2017. p. 521–
526.
[33] Xu J, Chen D, Qiu X, Huang X. Cached long short-term memory neural networks for document-level sentiment
classification. arXiv preprint arXiv:1610.04989. 2016.
[34] Yin Y, Song Y, Zhang M. Document-level multi-aspect sentiment classification as machine comprehension. In: the
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing; 2017. p. 2044–2054.
[35] Zhang L, Wang S, Liu B. Deep learning for sentiment analysis: A survey. Wiley Interdisciplinary Reviews: Data
Mining and Knowledge Discovery. 2018;p. e1253.
[36] Al Sallab A, Hajj H, Badaro G, Baly R, El Hajj W, Shaban KB. Deep learning models for sentiment analysis in
Arabic. In: the Proceedings of the Second Workshop on Arabic Natural Language Processing; 2015. p. 9–17.
[37] Badaro G, Baly R, Hajj H, Habash N, El-Hajj W. A large scale Arabic sentiment lexicon for Arabic opinion mining.
In: the Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP); 2014. p. 165–
173.
[38] Al-Sallab A, Baly R, Hajj H, Shaban KB, El-Hajj W, Badaro G. AROMA: a recursive deep learning model for
opinion mining in Arabic as a low resource language. ACM Transactions on Asian and Low-Resource Language
Information Processing (TALLIP). 2017;16(4):25.
[39] Baly R, Badaro G, Hamdi A, Moukalled R, Aoun R, El-Khoury G, et al. Omam at semeval-2017 task 4: Evaluation
of english state-of-the-art sentiment analysis models for arabic and a new topic-based model. In: the Proceedings of
the 11th International Workshop on Semantic Evaluation (SemEval-2017); 2017. p. 603–610.
[40] Baly R, Hajj H, Habash N, Shaban KB, El-Hajj W. A sentiment treebank and morphologically enriched recursive
deep models for effective sentiment analysis in arabic. ACM Transactions on Asian and Low-Resource Language
Information Processing (TALLIP). 2017;16(4):23.
[41] Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng A, et al. Recursive deep models for semantic
compositionality over a sentiment treebank. In: the Proceedings of the 2013 conference on empirical methods in
natural language processing; 2013. p. 1631–1642.
[42] Baly R, Badaro G, El-Khoury G, Moukalled R, Aoun R, Hajj H, et al. A characterization study of arabic twitter data
with a benchmarking for state-of-the-art opinion mining models. In: the Proceedings of the Third Arabic Natural
Language Processing Workshop; 2017. p. 110–118.
[43] Nabil M, Aly M, Atiya A. Astd: Arabic sentiment tweets dataset. In: the Proceedings of the 2015 Conference on
Empirical Methods in Natural Language Processing; 2015. p. 2515–2519.
[44] Alayba AM, Palade V, England M, Iqbal R. Arabic language sentiment analysis on health services. In: the
Proceedings of the 1st International Workshop on Arabic Script Analysis and Recognition (ASAR). IEEE; 2017.
p. 114–118.
[45] Alayba AM, Palade V, England M, Iqbal R. Improving Sentiment Analysis in Arabic Using Word Representation.
arXiv preprint arXiv:1803.00124. 2018.
[46] Abdelhade N, Soliman THA, Ibrahim HM. Detecting Twitter Users Opinions of Arabic Comments During Various
Time Episodes via Deep Neural Network. In: the Proceedings of International Conference on Advanced Intelligent
Systems and Informatics. Springer; 2017. p. 232–246.
[47] Dahou A, Xiong S, Zhou J, Haddoud MH, Duan P. Word embeddings and convolutional neural network for arabic
sentiment classification. In: the Proceedings of COLING 2016, the 26th International Conference on Computational
Linguistics: Technical Papers; 2016. p. 2418–2427.
[48] Liu J, Shang J, Wang C, Ren X, Han J. Mining quality phrases from massive text corpora. In: the Proceedings of the
2015 ACM SIGMOD International Conference on Management of Data. ACM; 2015. p. 1729–1744.
[49] Al-Azani S, El-Alfy ESM. Hybrid Deep Learning for Sentiment Polarity Determination of Arabic Microblogs. In:
the Proceeding of the International Conference on Neural Information Processing. Springer; 2017. p. 491–500.
[50] Maamouri M, Bies A, Buckwalter T, Jin H, Mekki W. Arabic treebank: Part 3 (full corpus) v 2.0 (MPG+ syntactic
analysis). Linguistic Data Consortium, Philadelphia, LDC Catalogue number: LDC2005T20, ISBN. 2005;p. 1–58563.
[51] Collobert R, Weston J. A unified architecture for natural language processing: Deep neural networks with multitask
learning. In: the Proceedings of the 25th international conference on Machine learning. ACM; 2008. p. 160–167.
[52] Maamouri M, Bies A, Buckwalter T, Mekki W. The penn arabic treebank: Building a large-scale annotated arabic
corpus. In: the Proceeding of NEMLAR conference on Arabic language resources and tools. vol. 27; 2004. p. 466–
467.
[53] Refaee E, Rieser V. An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis. In: the Proceedings of
Language Resources and Evaluation Conference, LREC; 2014. p. 2268–2273.
[54] Mohit B, Rozovskaya A, Habash N, Zaghouani W, Obeid O. The first QALB shared task on automatic text correction
for Arabic. In: the Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP);
2014. p. 39–47.
[55] Rosenthal S, Farra N, Nakov P. SemEval-2017 task 4: Sentiment analysis in Twitter. In: the Proceedings of the 11th
International Workshop on Semantic Evaluation (SemEval-2017); 2017. p. 502–518.
[56] Abdulla NA, Ahmed NA, Shehab MA, Al-Ayyoub M. Arabic sentiment analysis: Lexicon-based and corpus-based.
In: the Proceeding of 2013 IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies
(AEECT). IEEE; 2013. p. 1–6.
[57] Ruder S, Ghaffari P, Breslin JG. A hierarchical model of reviews for aspect-based sentiment analysis. arXiv preprint
arXiv:1609.02745. 2016.
[58] Al-Rfou R, Perozzi B, Skiena S. Polyglot: Distributed word representations for multilingual nlp. arXiv preprint
arXiv:1307.1662. 2013.
[59] Pontiki M, Galanis D, Papageorgiou H, Androutsopoulos I, Manandhar S, Mohammad AS, et al. SemEval-2016 task
5: Aspect based sentiment analysis. In: the Proceedings of the 10th international workshop on semantic evaluation
(SemEval-2016); 2016. p. 19–30.
[60] Ruder S, Ghaffari P, Breslin JG. Insight-1 at semeval-2016 task 5: Deep learning for multilingual aspect-based
sentiment analysis. arXiv preprint arXiv:1609.02748. 2016.
[61] Al-Smadi M, Qawasmeh O, Al-Ayyoub M, Jararweh Y, Gupta B. Deep Recurrent neural network vs. support
vector machine for aspect-based sentiment analysis of Arabic hotels reviews. Journal of computational science.
2018;27:386–393.
[62] El-Beltagy SR, Kalamawy ME, Soliman AB. NileTMRG at SemEval-2017 Task 4: Arabic Sentiment Analysis.
arXiv preprint arXiv:1710.08458. 2017.
[63] González JA, Pla F, Hurtado LF. ELiRF-UPV at SemEval-2017 Task 4: Sentiment Analysis using Deep Learning.
In: the Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017); 2017. p. 723–727.
[64] Mohammad AS, Qwasmeh O, Talafha B, Al-Ayyoub M, Jararweh Y, Benkhelifa E. An enhanced framework for
aspect-based sentiment analysis of Hotels’ reviews: Arabic reviews case study. In: the Proceeding of the 11th
International Conference for Internet Technology and Secured Transactions (ICITST). IEEE; 2016. p. 98–103.
[65] Althobaiti M, Kruschwitz U, Poesio M. Aranlp: A java-based library for the processing of arabic text. 2014.
[66] Pasha A, Al-Badrashiny M, Diab MT, El Kholy A, Eskander R, Habash N, et al. MADAMIRA: A Fast,
Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic. In: the Proceedings of Language
Resources and Evaluation Conference, LREC 2014. vol. 14; 2014. p. 1094–1101.
[67] Team D, et al. Deeplearning4j: Open-source distributed deep learning for the jvm. Apache Software Foundation
License. 2016;2.