International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 11, Issue 11, November 2020, pp. 101-112, Article ID: IJARET_11_11_010
Available online at http://iaeme.com/Home/issue/IJARET?Volume=11&Issue=11
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.11.2020.010
Kabada Sori
Lecturer, Department of Information Technology, Ambo University,
Hachalu Hundesa Campus, Oromia, Ethiopia
ABSTRACT
Sentiment analysis has become a popular research topic due to its many applications
in business, politics, and entertainment. However, analyzing people's opinions in
short texts such as Twitter messages and single sentences is a challenging task
because of their informality, misspellings, and semantic errors. In this study, we propose
character-level multi-scale sentiment analysis for Afaan Oromoo using a combined
Convolutional Neural Network and Bidirectional Long Short-Term Memory (CNN-Bi-LSTM)
approach. Since no standardized and sufficient corpus has so far been prepared for Afaan
Oromoo Natural Language Processing (NLP) tasks, including sentiment analysis,
we collected data from two domains, Facebook and Twitter, for the experiment.
After collecting the data, we removed user names, links, non-Afaan Oromoo texts, and any
unnecessary characters. The cleaned data were annotated manually by four different
annotators into five classes, namely 2, 1, -2, -1, and 0, which represent very positive,
positive, very negative, negative, and neutral, respectively. This multi-scale sentiment
analysis provides a more refined analysis, which is vital for prioritizing and comparing
different opinions. Afterward, we performed experiments on the prepared corpus.
1. INTRODUCTION
The growth of the Internet and the wide explosion of social media, such as Facebook, Twitter,
and YouTube, have created many new opportunities for people to express their attitudes towards
any individual, service, organization, political party, government policy, etc. in their own
native languages.
The tracking of these opinions in social media has attracted an increasing level of
attention in the research community. Since large numbers of users share their views,
judgments, and ideas, it is hard to handle such a huge amount of online content.
There is a critical need for automated methods of opinion analysis, which make it possible to
process large amounts of data in a short time and to understand the polarity of users' messages.
Sentiment analysis, also called opinion mining, is the natural language processing task concerned
with determining a writer's, speaker's, or other subject's attitude with respect to a certain topic
or event. The rapid development and popularity of social media networks has resulted in a huge
amount of user-generated content becoming available online. Determining the polarity of this user-
generated content has substantial benefit in different areas. In business, it allows companies to
automatically gather the opinions of their customers about their products or services and
identify areas for improvement [1]. In politics, it can help to infer the public's orientation and
reaction towards political events, which supports decision making [2].
A multi-scale sentiment analysis gives better insight than binary sentiment analysis, as it shows
the degree to which a given text is positive or negative. Previous studies focus on binary
classification. However, interpreting opinions can be challenging even for humans, as a binary
distinction of opinions as only positive or negative may not be good enough [3], [4], [5]. The scale
values signify the strength of positivity or negativity of a text as a rank. This provides a more
refined analysis and a quick indication of tone, which is vital for numerous real-life applications
such as prioritizing and comparing different opinions. Although sentiment analysis can be applied to
any natural language, sentiment analysis studies have been conducted most widely for the English
language. However, the methods used for English cannot be directly applied to other
languages because of differences in structure. Afaan Oromoo, a morphologically
rich language, is no exception.
Afaan Oromoo belongs to the Cushitic branch of the Afro-Asiatic language family. In
Ethiopia, more than 35 million Oromo people speak it as a mother tongue, and it is used by an
additional half-million speakers in parts of northern and eastern Kenya [6]. Afaan Oromoo is also
spoken in other countries such as Tanzania and Somalia, yet it is one of the most resource-
scarce languages [7]. There are few computational linguistics works on it, and it remains
one of the least researched and under-resourced languages. Moreover, NLP tasks that support
sentiment analysis, such as parsing, part-of-speech tagging, and named entity recognition,
remain challenging [8], including for Afaan Oromoo. This issue can be addressed using
machine learning and state-of-the-art deep learning techniques, because deep learning models
are effective thanks to their generalizability and automatic feature learning capability.
In this study, we propose a combination of a Convolutional Neural Network (CNN) and a
Bidirectional Long Short-Term Memory (Bi-LSTM) network with character-level embedding. In recent
years, many text classification methods based on CNNs or RNNs have been proposed [9], [10],
[11], [12]. CNNs can extract local features in parallel but are limited in learning long-range
sequential correlations. In contrast, RNNs are specialized for sequential modeling but unable to
extract features in parallel. We therefore propose CNN-Bi-LSTM to get the advantages of the two
algorithms. We performed experiments on short texts collected from Facebook and Twitter.
Moreover, we compared the performance of CNN and Bi-LSTM separately with the combined
CNN-Bi-LSTM, as well as the performance of CNN, Bi-LSTM, and CNN-Bi-LSTM on small and
large datasets.
2. LITERATURE REVIEW
This section mainly discusses sentiment analysis attempts on Afaan Oromoo and deep learning
approaches to sentiment analysis.
3.2. Preprocessing
Preprocessing is very important, as it reduces computational time and increases classifier
performance; noisy data can slow the learning process and decrease the efficiency of
the system. For this reason, we excluded irrelevant data from the dataset and performed several
preprocessing steps. Accordingly, our preprocessing includes the following:
Cleaning: removal of user names, removal of links, removal of non-Afaan Oromoo
texts, removal of unnecessary characters, etc. After preprocessing is completed, the training and
testing sets are chosen.
This follows the 80%-20% rule, in which 80% of the dataset is randomly chosen for training
and 20% for testing purposes.
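The cleaning and splitting steps above can be sketched as follows; the regular expressions and the fixed seed are illustrative assumptions, not the paper's exact implementation:

```python
import re
import random

def clean_text(text):
    """Minimal sketch of the cleaning steps: remove user names, links,
    and unnecessary characters, then collapse extra whitespace."""
    text = re.sub(r"@\w+", "", text)                    # user names
    text = re.sub(r"https?://\S+|www\.\S+", "", text)   # links
    text = re.sub(r"[^A-Za-z'\s]", "", text)            # unnecessary characters
    return re.sub(r"\s+", " ", text).strip()

def train_test_split(samples, train_ratio=0.8, seed=42):
    """Randomly assign 80% of the dataset for training and 20% for testing."""
    samples = samples[:]
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```

Note that the apostrophe is kept in the character class, since Afaan Oromoo orthography uses it (the hudhaa) inside words.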
3.3. The Proposed Model
The sentiment analysis model we propose is illustrated in Figure 1. The model is developed
by utilizing both Bi-LSTM and CNN [23].
All the sentences are padded with zeros to make them equal in size. The model then
receives a sequence of characters (a sentence) as input and looks up the corresponding one-
hot vector for each character in a dictionary of known characters. Characters or spaces that
do not appear in the dictionary are assigned the zero vector.
First of all, each character in a word of the sentence is represented as a vector of dimension d.
If l is the number of characters in the word and d is the dimension, this forms the matrix
C of size d × l, where C is the matrix, d is the height or dimension of the matrix, and l is
simply the number of characters in word w.
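A minimal sketch of this character-to-matrix encoding, assuming a small illustrative character dictionary (the paper's actual dictionary is not listed in this section):

```python
import numpy as np

# Hypothetical character dictionary: Latin letters plus the apostrophe
# (hudhaa) used in Afaan Oromoo; indices start at 1 so that index 0 is
# reserved for padding / unknown characters.
CHARS = "abcdefghijklmnopqrstuvwxyz'"
CHAR_INDEX = {c: i + 1 for i, c in enumerate(CHARS)}

def char_matrix(word, d=len(CHARS) + 1, max_len=None):
    """Build the matrix C (d x l): one one-hot column per character;
    characters absent from the dictionary keep the zero vector."""
    l = max_len or len(word)
    C = np.zeros((d, l))
    for j, ch in enumerate(word[:l]):
        idx = CHAR_INDEX.get(ch, 0)
        if idx:                        # unknown characters stay all-zero
            C[idx, j] = 1.0
    return C
```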
The next step is to apply a convolutional filter H. The convolutional filter (kernel) convolves
over the matrix and captures information. At each step, the filter slides over a
window of the matrix whose height equals the height of the original matrix C and whose width
w is the number of characters the filter takes at a time. The width is user defined and can
be smaller than l. The weights of the filter H are randomly initialized and adjusted
during training; H is a convolutional filter matrix of size d × w. Now we overlay H on the
leftmost corner of C and take the element-wise product of H and its projection on C. All entries
of the resulting matrix are summed up to get a feature value, which is set as the first element of a
new vector f. We then slide H one character to the right, perform the same operations, and sum up
all the values in the resulting matrix to get another scalar. This scalar is taken as the second
element of f. We repeat the same operations character by character until we reach the end of
the word. In each step we add one more element to f and lengthen the vector until it reaches its
maximum length, which is l − w + 1. The vector f is a numeric representation of the word,
obtained by looking at w characters of the word at a time. One thing to note is that the values
within the convolutional filter H do not change as H slides through the word; in fancier terms,
H is "position invariant". The position invariance of the convolutional filter enables us to
capture the meaning of a certain letter combination no matter where in the word that combination
appears. A max-pooling function is applied to keep the maximum value of f. These maxima are
collected together to form a fixed-dimensional representation of parts of the word. This process
is referred to as "max-pooling".
We repeat the same operation with different convolutional filters H', which may have different
widths (for example, w = 2). As with the previous filter, we slide H' across the word to get a
feature vector, then perform max-pooling on it to get a summary scalar. We repeat this scanning
process several times with different convolutional filters, with each scanning process resulting
in one summary scalar.
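The sliding element-wise product and max-pooling described above can be sketched in plain NumPy; the filter widths and the random initialization are illustrative:

```python
import numpy as np

def conv_feature(C, H):
    """Slide filter H (d x w) over matrix C (d x l): at each position take
    the element-wise product of H with its projection on C and sum it to a
    scalar, producing the feature vector f of length l - w + 1."""
    d, l = C.shape
    _, w = H.shape
    return np.array([np.sum(C[:, j:j + w] * H) for j in range(l - w + 1)])

def max_pool(f):
    """Max-pooling: keep only the maximum value of the feature vector."""
    return f.max()

# Several filters of different widths, each yielding one summary scalar.
rng = np.random.default_rng(0)
C = rng.random((28, 10))                          # a 10-character word
summaries = [max_pool(conv_feature(C, rng.random((28, w))))
             for w in (2, 3, 4)]
```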
The outputs of these multiple filters (the CNN stage) after the max-pooling operation are then fed
into the Bi-LSTM network.
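The overall CNN-Bi-LSTM pipeline can be sketched with the Keras API (the paper uses TensorFlow); the sequence length, filter counts, widths, hidden size, and dropout rate below are illustrative assumptions, not the paper's reported settings:

```python
import tensorflow as tf
from tensorflow.keras import layers

seq_len, char_dim, n_classes = 140, 28, 5   # assumed padded length / one-hot size

inputs = tf.keras.Input(shape=(seq_len, char_dim))
pooled = []
for width in (2, 3, 4):                      # filters of different widths
    f = layers.Conv1D(64, width, padding="same", activation="relu")(inputs)
    pooled.append(layers.MaxPooling1D(pool_size=2)(f))

x = layers.Concatenate()(pooled)             # combined CNN feature maps
# merge_mode="sum" mirrors the element-wise summation of forward and
# backward passes described in the Bi-LSTM section.
x = layers.Bidirectional(layers.LSTM(64), merge_mode="sum")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(n_classes, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

The softmax output gives one probability per sentiment class (2, 1, 0, -1, -2), matching the five-class annotation scheme.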
3.4. Bi-LSTM
LSTM was firstly proposed by [24] to overcome the gradient vanishing problem of the
traditional RNN. The key objective is to introduce an adaptive gating mechanism, which
decides the degree to keep the previous state and memorize the extracted features of the current
data input. Given a sequence where l indicates the length of input text,
LSTM processes the texts word by word. At time-step , the memory cell and the hidden
state are calculated and updated with the following equations:
= (1)
(6)
Here, indicates element wise summation to combine the forward and backward pass
outputs.
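One time step of the LSTM update described above can be sketched in plain NumPy; stacking the four gate pre-activations into a single weight matrix W is an implementation convenience, not the paper's notation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W maps the concatenated [h_{t-1}, x_t] to the
    four stacked gate pre-activations (input, forget, output, candidate)."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    m = h_prev.size
    i, f, o = sigmoid(z[:m]), sigmoid(z[m:2*m]), sigmoid(z[2*m:3*m])
    c_tilde = np.tanh(z[3*m:])
    c_t = f * c_prev + i * c_tilde      # cell state: keep old, add new
    h_t = o * np.tanh(c_t)              # hidden state gated by output gate
    return h_t, c_t
```

A bidirectional layer runs this recurrence once left-to-right and once right-to-left, then combines the two hidden states element-wise.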
For the input sentence, we set the number of hidden units to m; the result of the Bi-LSTM
network can be expressed as follows:

H = [h_1, h_2, ..., h_l]                                     (7)

where l indicates the length of the input text, and each row of H represents the feature of one
word generated by the Bi-LSTM. A softmax layer then produces the predicted class distribution:

ŷ = softmax(W_s · h + b_s)                                   (8)
We then use cross entropy as the loss function for training. We first derive the loss of each
labeled comment, and the final loss is averaged over all the labeled tweets or comments by the
following equation:

Loss = −(1/n) Σ_i Σ_j y_{i,j} log(ŷ_{i,j})                    (9)

where n is the number of labeled comments, i indexes the i-th input comment, and j the sentiment
class. The Adam optimizer [25] is used to adaptively adjust the learning rate and optimize the
parameters of the model. At each hidden layer, we also apply dropout [26].
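The averaged cross-entropy of Eq. (9) can be written directly in NumPy; the small `eps` guard against log(0) is an implementation detail, not from the paper:

```python
import numpy as np

def average_cross_entropy(y_true, y_pred, eps=1e-12):
    """Eq. (9): the loss of each labeled comment is -sum_j y_j log(yhat_j);
    the final loss averages over all n labeled tweets/comments.
    y_true: one-hot labels (n x 5); y_pred: predicted probabilities (n x 5)."""
    per_comment = -np.sum(y_true * np.log(y_pred + eps), axis=1)
    return per_comment.mean()
```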
4. EXPERIMENTAL ANALYSIS
In this section, we evaluate the performance of the proposed system on different datasets. The
experiments were carried out using Windows 10, Python 3.7.3, and TensorFlow 1.13.1.
Table 4 Performance of CNN, Bi-LSTM and CNN-Bi-LSTM on the small dataset for both Facebook and
Twitter

Models      |          Facebook           |           Twitter
            | P (%)  R (%)  F (%)  A (%)  | P (%)  R (%)  F (%)  A (%)
CNN         | 89     88.7   88     89.7   | 89     87     89     88.6
Bi-LSTM     | 88.5   88     88.6   87.4   | 88     88.4   86.5   86.3
CNN-Bi-LSTM | 92.2   92.5   91     90.1   | 93     94     90     89.8
Table 5 Performance of CNN, Bi-LSTM and CNN-Bi-LSTM on the huge dataset for both Facebook and
Twitter

Models      |          Facebook           |           Twitter
            | P (%)  R (%)  F (%)  A (%)  | P (%)  R (%)  F (%)  A (%)
CNN         | 94     93.5   93.5   93.3   | 93     92.5   92.7   92.6
Bi-LSTM     | 90.4   91     91.6   91.4   | 91     91.7   90.5   90.3
CNN-Bi-LSTM | 95     94.8   94.7   94.1   | 94     94.3   94     93.8
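The P/R/F/A columns in the tables above can be computed as follows; macro-averaging over the five classes is an assumption, since the paper does not state the averaging mode:

```python
import numpy as np

def metrics(y_true, y_pred, n_classes=5):
    """Precision, recall, F1 and accuracy for multi-class predictions;
    precision and recall are macro-averaged over the classes."""
    P, R = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        P.append(tp / (tp + fp) if tp + fp else 0.0)
        R.append(tp / (tp + fn) if tp + fn else 0.0)
    p, r = float(np.mean(P)), float(np.mean(R))
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    acc = float(np.mean(y_true == y_pred))
    return p, r, f1, acc
```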
REFERENCES
[1] B. Liu, “Sentiment Analysis and Opinion Mining,” Synth. Lect. Hum. Lang. Technol., vol. 5,
no. 1, pp. 1–167, May 2012, doi: 10.2200/S00416ED1V01Y201204HLT016.
[2] A. Bakliwal, J. Foster, J. van der Puil, R. O'Brien, L. Tounsi, and M. Hughes, "Sentiment
Analysis of Political Tweets: Towards an Accurate Classifier," Proc. of the Workshop on Language
in Social Media (LASM 2013), pp. 49–58, 2013.
[3] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment Classification using Machine
Learning Techniques," in Proceedings of the Conference on Empirical Methods in Natural Language
Processing (EMNLP), Philadelphia, 2002, pp. 79–86.
[4] B. Pang and L. Lee, “Opinion Mining and Sentiment Analysis,” Found. Trends Inf. Retr., vol.
2, pp. 1–135, 2008, doi: 10.1561/1500000001.
[5] B. Pang and L. Lee, “Seeing stars,” in Proceedings of the 43rd Annual Meeting on Association
for Computational Linguistics - ACL ’05, 2005, no. 1, pp. 115–124, doi:
10.3115/1219840.1219855.
[6] H. Kebede, Towards the Genetic Classification of the Afaan Oromoo Dialects. The University
of Oslo., 2009.
[7] W. Tesema and D. Tamirat, “Investigating Afan Oromo Language Structure and Developing
Effective File Editing Tool as Plug-in into Ms Word to Support Text Entry and Input Methods.”
[8] Q. T. Ain et al., “Sentiment Analysis Using Deep Learning Techniques : A Review,” Int. J. Adv.
Comput. Sci. Appl., vol. 8, no. 6, pp. 424–433, 2017.
[9] Y. Kim, "Convolutional Neural Networks for Sentence Classification," in Proceedings of the
2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
[10] S. Liao, J. Wang, R. Yu, K. Sato, and Z. Cheng, “CNN for situations understanding based on
sentiment analysis of twitter data,” Procedia Comput. Sci., vol. 111, no. 2015, pp. 376–381,
2017, doi: 10.1016/j.procs.2017.06.037.
[11] W. Cao, A. Song, and J. Hu, “Stacked Residual Recurrent Neural Network with Word Weight
for Text Classification,” IAENG Int. J. Comput. Sci., vol. 3, pp. 277–284, 2017.
[12] Y. Zhang, M. J. Er, R. Venkatesan, N. Wang, and M. Pratama, “Sentiment Classification Using
Comprehensive Attention Recurrent Models,” pp. 1562–1569, 2016.
[13] W. Tariku, "Sentiment Mining and Aspect Based Summarization of Opinionated Afaan
Oromoo News Text," 2017.
[14] J. Abate, M. Meshesha, and A. Ababa, “Unsupervised Opinion Mining Approach for Afaan
Oromoo Sentiments.”
[15] M. O. Rase, “Sentiment Analysis of Afaan Oromoo Facebook Media Using Deep Learning
Approach,” New Media Mass Commun., vol. 90, pp. 7–22, May 2020, doi: 10.7176/nmmc/90-
02.
[16] C. N. dos Santos and M. Gatti, "Deep Convolutional Neural Networks for Sentiment Analysis of
Short Texts," Proc. of COLING 2014, the 25th International Conference on Computational
Linguistics: Technical Papers, pp. 69–78, 2014. [Online]. Available:
https://www.aclweb.org/anthology/C14-1008.
[17] X. Wang, W. Jiang, and Z. Luo, "Combination of Convolutional and Recurrent Neural Network
for Sentiment Analysis of Short Texts," Proc. of COLING 2016, the 26th International Conference
on Computational Linguistics: Technical Papers, pp. 2428–2437, 2016.
[18] A. Yenter and A. Verma, "Deep CNN-LSTM with Combined Kernels from Multiple Branches for
IMDb Review Sentiment Analysis," pp. 540–546, 2017.
[19] Q. Shen, Z. Wang, and Y. Sun, “Sentiment Analysis of Movie Reviews Based on CNN-
BLSTM,” IFIP Int. Fed. Inf. Process. 2017, pp. 164–171, 2017, doi: 10.1007/978-3-319-68121-
4.
[20] J. Yoon and H. Kim, “Multi-Channel Lexicon Integrated CNN-BiLSTM Models for Sentiment
Analysis,” 2017 Conf. Comput. Linguist. Speech Process. ROCLING 2017, pp. 244–253, 2017.