You are on page 1of 9

International Journal of Research Studies in Science, Engineering and Technology

Volume 7, Issue 7, 2020, PP 10-18


ISSN 2349-476X

Multi-Class Sentiment Analysis from Afaan Oromo Text Based


On Supervised Machine Learning Approaches
Negessa Wayessa*, Sadik Abas
Department of Information Technology, College of Informatics, Bule Hora University, Ethiopia
*Corresponding Author: Negessa Wayessa, Department of Information Technology, College of
Informatics, Bule Hora University, Ethiopia

ABSTRACT
Sentiment analysis is the field of data mining and natural language processing that study about analyzes the
opinion data from social media like Facebook, Twitter, organization sites, online news report, user reviews
and etc. It is process of classifies opinions under defined polarities classes. Sentiment analysis is a hot
research area in the field of data mining and natural language processing. In this work, sentence level
sentiment analysis is done into five multiple classes- very negative, negative, neutral, positive and very
positive. Here, we proposed two methods of supervised machine learning approaches-Support Vector
Machine and Random Forest algorithms to classify sentiment polarity from Oromia broad casting network
(OBN) Twitter by Ethiopian language Afaan Oromo. We used tokenization, stop word removal,
normalization and stemming as preprocessing and tf-idf used as feature extraction. The Performance of
proposed approaches Support Vector Machine and Random Forest achieved accuracy 90% and 89% on
OBN Twitter dataset with 1810 corpus size respectively.
Keywords: Sentiment Analysis, Ethiopian language Afaan Oromo, Supervised machine learning, Support
vector machine, Random forest, OBN Twitter.

INTRODUCTION opinionated text remains need a technique


which was called Sentiment analysis.
One of the most important things to transferring
and interchanging idea between peoples is Sentiment analysis is the field of data mining
through sharing their idea which says opinion. and natural language processing that study about
Peoples are able to express their ideas or analyzes people‟s opinions, sentiments,
opinions in the form of posting on social medial evaluations, appraisals, attitudes, and emotions
like Facebook, Twitter, Forum discussions and towards entities such as products, services,
etc. within regard to their day to day life. These organizations, individuals, issues, events, topics,
opinions can be explores about private and and their attributes [1]. Almost all peoples are
public like individuals, organization, share opinions text using their own language
government, product, service, politics and etc. and also an organizations or companies have
[1]. Since opinions analysis can influences the social media page using their official language.
interest of different private and public So that a huge volumes opinions text on private
organizations or companies are highly interested and public social media are increasing time to
in analyzing and exploring online opinions [4]. time by using Ethiopian language Afaan Oromo.
Several private and public are interested to Afaan Oromo (AO) is one of the major African
know the opinion of the public with respects to languages that are widely spoken in Ethiopia.
their products and services. Also, many Oromo people are the native speaker of this
companies are interested to know the public language; they are largest ethnic group in
feedback with respect to the new policy, rules Ethiopia. According to the Ethiopia Population
and regulations set out as well as public services Census Commission 2007 the total population
delivered [6]. With the explosive growth of of Oromo People which use AO language is
social media (e.g., reviews, forum discussions, 34.5% of the total population of the country [2].
blogs, micro-blogs, Twitter, comments, and According to paper [3] it is also the third most
postings on social network sites), the opinions widely spoken language in Africa next to Arabic
posted on this social medias are increased huge and Hausa languages. Specially, it is first widely
volumes of opinions time to time in this real spoken and used in most parts of Ethiopia
world. Since analyzing huge volumes of (Oromia state) and also some parts of other

International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020 10


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

neighbor countries like Kenya, Tanzania, First we analyzed this paragraph as sentence
Djibouti, Sudan and Somalia. Oromia broad level according to the following: The first
Casting Network (OBN) is prominent official sentence expresses no opinion, it doesn‟t offer
broad castings which use Afaan Oromo any sentiment and it is neutral. Sentence two
language on different social media like Twitter and three are expresses a positive opinion about
page, Facebook page, forum discussion and etc. the program and hangout of program as whole.
Currently, OBN have several branches like The last sentence is expressing a negative
OBN of Adama, OBN of Finfinne, Horn of opinion about a pricing. Generally, we extract
Africa OBN and international OBN. So, the four polarities of opinion classes. But, according
amounts of Afaan Oromo text on OBN Twitter to document level, this paragraph is analyzed
pages are increasing in day to day. Therefore, into majority of polarity in a paragraph which is
we have been proposed sentiment analysis positive.
system to analyzed OBN Tweets of Afaan
This research work extracts emotion by using
Oromo text. However, there are a few works
supervised machine learning techniques. These
studied on sentiment analysis system for Afaan
two techniques are Support Vector Machine and
Oromo text. Sentiment analysis is not a new
Random Forest. We used hand created dataset
field and it has been researched for different
with 1810 Afaan Oromo text for training and
natural language like English and other
testing purpose from OBN Twitter page. The
language like Arabic, Chinese, Bangla and
result was measured using performance metrics
Amharic [9] [10] [13] while for Afaan Oromo
such as precision, recall, f-score and accuracy
and other under-resourced language, this field is
for evaluating the effectiveness of the proposed
new and it is at the initial state.
techniques.
Sentiment analysis can be conducted at different
The rest of this thesis is organized as follows. In
levels such as document level, aspect level and section 2, some previous works related to our
sentence level [1] [10]. The document level work has been described. Section 3 gives a
deals with classify whether a whole opinion simple view of the overall proposed model.
document expresses a positive or negative Section 4 presented the experimental results of
sentiment [1]. This level is analysis opinions in the proposed system. Finally, Section 5
document on single entity (e.g., single product, concluded the thesis and provides some
single reviews). Aspect based sentiment analysis directions for future works.
level is an aspect/feature level in which
Related Works
identifying and extracting the feature of the
objects that have been commented on by the Recently, some sentiment analysis researches
opinion holder and determining whether the were studied regarding Afaan Oromo text. As
opinion is positive or negative. Shortly, the idea far as researchers know four researches have
of aspect based sentiment analysis is not only been proposed within various feature and
classify opinions into positive and negative, it techniques. The first paper [4] Proposed Aspect
also specifies the entity comment was based summarization of Afaan Oromo news text
commented on. The other is sentence level on the news domain. The researchers used
sentiment analysis which is determining whether lexicon-based approach and rule to specify the
each sentence expressed under defined polarity. opinions from ORTO news service dataset.
Sentence level sentiment analysis is that They have trained the system with corpus of
sentence usually contains single opinion size 400 reviews and achieved good results. Due
(although not true in any case) in a document to un-availability of lexical database and
typically contains multiple opinions [1]. linguistic resources such as POS, feeling on
Document level analysis is analyzed on general social media indirectly cannot handle in their
that means it is not identify specify area of system. [5] Developed unsupervised Opinion
document part. For this problem sentence level Mining Approach for Afaan Oromo Sentiments.
is analyzed each sentence of document into The researchers used lexicons, POS and N-gram
expressed emotion classes. As an example: as feature to identify the opinions from social
medial OPDO official Facebook page. They
“I have registered for leaning data mining
have trained the system with corpus of size 600
program. This program have amazing course. It
reviews. [6] Designed Sentiment Analysis for
has very good handout. But, it is very expensive
Afaan Oromo Using Deep Learning Approach.
to handout”.
They focused on investigating Convolutional

11 International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

Neural Network and Long Short Term Memory Techniques- Naïve Bayes Classification
deep learning approaches for the development Algorithm and Topical approach to extract the
of sentiment analysis of Afaan Oromo social emotion from any Bangla text. They studied
media content such as Facebook posts sentiment analysis at sentence level and
comments. document level. They classified sentiment with
six individual emotion classes- happy, sad,
They system has been trained and tested on
tender, excited, angry and scared. They created
dataset size 1452 comments that were collected
manually data corpus with 75000 sentences and
manually from the official site of the Facebook
7400 sentences used for training dataset as well
page of Oromo Democratic Party/ODP. They
as 100 sentences used for test data. Their model
system has been analyzed document level of
achieved highest in topical approach 90%
Afaan Oromo review. They used preprocessing
accuracy. [7] Designed KNN classifier based
such as normalization, tokenization and stop
approach for multi-class sentiment analysis of
word removal on manually collected dataset.
twitter data. They were done sentiment
They used word embedding as feature to specify
classification into multiple classes. They
opinions and the result showed that
proposed methodology KNN were compared
Convolutional Neural Network achieved the
with other approach SVM. They extracted
accuracy of 89% as well as The Long Short
Twitter data by using python Tweepy. They
Memory achieved accuracy of 87.6%. [11]
used n-gram as feature and achieved 86%
Studied on Constructing Sentiment Mining
accuracy.
Model for Opinionated Afaan Oromo Texts on
Ethiopian Politics. They used two machine In the regarding to Afaan Oromo text sentiment
learning techniques such as Decision Tree (DT) analysis, researches [4], [6] and [11] have been
and Naïve Bayes (NB). Their corpus size 1800 proposed on Afaan Oromo review sentiment
reviews of Afaan Oromo text Ethiopian politics analysis on binary classification (positive and
collected from different blogs and websites. negative) and [5] add neutral on this class. They
They used tokenization, normalization, stop also studied on document and aspect level
words removal and stemming in text sentiment analysis that means they didn‟t
preprocessing and bag of word in unigram as studied on sentence level. Today‟s world
feature. Lastly, their experimental result showed focused on multi-label classes of sentiment
that Naïve Bayes algorithm (88%) performed analysis rather than binary classes such as:
better than Decision Tree algorithm (83%) in negative and positive. However, to the best
polarity classification. knowledge of the researcher, there is no
investigated on multi-class sentiment analysis
In addition to previous work of AO sentiment
and also, there is no researched sentiment
analysis, there were developed for others
analysis at sentence level for Afaan Oromo text.
language. From different worked on other
Therefore, in this study, we proposed on
language, we reviewed three papers from other
investigating sentence level sentiment analysis
language. We select these three papers based on
for Afaan Oromo tweets based on supervised
their working multiple classes. [9] Investigated a
machine approaches. The aim of this study was
machine learning approach to multi-scale
classifying the polarity a given text whether the
sentiment analysis on Amharic online posts.
expressed emotion of Afaan Oromo tweets into
They usedNaïve Bayes machine learning
five major classes such as: very negative,
algorithm and used unigram, bigram and hybrid
negative, neutral, positive and very positive
variants as features. They multi-scale were
based on trained dataset.
consisted -2,-1, 0, +1 and +2. They have corpus
size 608 posts that collected from Facebook, PROPOSED MODEL
Twitter, Dire Tube and Ethiopian Reporter
In this section, to investigate an automate
websites online sources. Lastly, their model
system for classifying opinions from Afaan
achieved a performance accuracy of 43.6%,
Oromo sentence; we applied two supervised
44.3% and 39.5% for unigram, bigram and
machine leaning approaches: Support vector
hybrid language models respectively. [10]
machine and Random forest. These approaches
Developed an Automated System of Sentiment
have been proposed for classifying opinions
Analysis from Bangla Text using Supervised
from Afaan Oromo text at sentence level into
Learning Techniques. These Supervised Learning
five multiple sentiment classes.

International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020 12


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

Figure1. Proposed Model Architecture


To conduct the procedure with a better contained different un-opinion text as well as
performance, the process working of this study other mixed language text. We take more time
has been consisted the following sub sections. with experts of language to clarify Afaan Oromo
text and labeled 1810 dataset into multiple
Data Collection And Preparation
classes-positive, negative, neutral, very negative
For this study, primary data sources have been and very positive for training phase. These
taken from Ethiopian governmental broad classes categorized depend on the power of
casting cooperation, Oromia broad casting emotion word in sentences. For example the
network (OBN) twitter page. This page have a tweets which have high power to sense the
huge user that give different opinion for reader, we categorized under very positive or
improving their life through politics, culture, very negative rather than positive or negative.
business, sport and some others topics which There is also the repetition of adjective, adverb
released every day by OBN the people of sound. and some others predicate of sentence give high
In the activity of dataset collection step, we got sense emotion to say very positive and very
above 10,000 un-labeled tweets text from OBN. negative. Otherwise, when the tweets are no
Therefore, we acknowledged OBN for getting opinion sentiment and the sensitive of negative
these huge amounts of tweet dataset without or very negative and positive or very positive
payment. For this study, we used supervised are equal, it is neutral. For this assumption, there
machine learning approaches. Since supervised is an example according to the following table1.
machine learning approaches needed labeled The bold sentence is opinion in Afaan Oromo
dataset for training purposes. Therefore, we language and normal sentence is meaning in
labeled dataset within discussion of experts. English.
But, these 10,000 un-labeled tweets were
Table1. An Example of dataset style
No Tweets Polarity
1 QophiisirnaGadaakeessanbaayyeebayyeegaariidha. Very Positive
(Your Gadaa System program is very very good.)
2 OBN gidduugidduusagantaakeessaanittiosoosirbootaHaacaaluuHundeessaa nu Positive
affeertaniigaariidha.
(If OBN can invite Hacalu Hundessa voice in between programs, it is good for us.)
3 Qophiinqe'eeqorannoodansaamiti. Negative
(The program studying about environment is not good.)
4 OBN Seenaa fi DudhaaOromooirrattisirrittihojjechaahinjiru. Very Negative
(OBN is not working perfectly on history and culture of Oromo.)
5 Magaalaanmuummeeoromiyaafinfinneedha. Neutral
(The Capital city of Oromia is Finfinne.)

From 1810 dataset size, 286 very negative, 476 Text Preprocessing
negative, 351 neutral, 401 positive and 296 is
Another step is text preprocessing in order to
very positive opinion text.
remove un-necessary part for opinion mining

13 International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

from dataset. Text pre-processing is the process two in similar word. They also used elongated
of applying any type of computation on texts and numbers into explain their emotion.
shapeless raw data and transforms it into an Therefore, normalization need for improving
arrangement that more easily and effectively performance of the system. In this work
processed in another procedure. The architecture normalized according to the following rules:
of proposed model in this work has been The word “baay‟ee” and “baayyee” to mean
consisted two phases, training and testing. In the many has the same meaning with different
training phase, the machine learning needed to writing. The only difference is that the
learn from a set of annotated AO tweets and it apostrophe “‟” is replaced by “y”.
was used to classify un-labeled tweets in testing
The word “Bilisummaa” to mean Freedom is
phase. These two phases have the following
written as “Bilisummaa” or “BILISUMMAA” at
preprocessing steps: tokenization, stop word
the beginning of the sentence and it is written as
removal, normalization and stemming.
“bilisummaa” at the middle of a sentence while
Tokenization is the first step in preprocessing it has the same meaning at both locations in the
of both phases used to tokenize a sequence of sentence. We normalized lower letter which
text or document into sentences and changes most common in a text.
sentences into word. In this study, we proposed The other is emotion in numbers. For example:
sentence level sentiment analysis for Afaan “100% sin jaallanna” to mean we love 100%
Oromo tweets text. Since sentence level needed normalized into “persentiidhibbatokko sin
tokenized document text into sentence. jaallanna” to mean: we love one hundred
Document tokenization into sentence in Afaan percent.
Oromo is like English language based on period
There are also elongated texts, for example,
(.), question mark (?) and exclamatory mark (!).
OBN sin jaallannaaaaaaaaa to mean: we
After this, sentence is tokenized into word on
loveeeeeeeee OBN is normalized to OBN sin
space between sequences of each word in
jaallanna to mean we love OBN.
sentences for easy to apply the next step stop
word removal. Stemming is a component that reduces
morphological variants of words into root form.
Stop word removal is also another step of
In morphologically complex languages like
preprocessing used to remove most frequently
Afaan Oromo, a stemmer will lead to important
word in text that are not relevant or have no
improvements in opinion mining systems [4].
impact to classify sentiments. In this research,
For example in Afaanoromo, “afeeramuu”,
some Afaan Oromo stop words are significant
”afeerraa”, “afeeramtani”, “afeeraman”,
for the sentiment classification and need to
“afeeramaniiru”, all are rooted to “afeer” . For
remain in the text [6]. As an example “hin” stop
our case, we used Afaan Oromo stemmer
word of Afaan Oromo that influence the
algorithms developed by [12].
sentiment of classification. For example,
Feature Extraction
1. OBN Seenaa fi DudhaaOromooirrattisirritti
hojjechaahinjiru. “OBN is not working Feature extraction is the process of plotting from
perfectly on history and culture of Oromo”. textual data to real-valued vectors. In our case,
after tweets cleaned from stop words, normalized
2. OBN Seenaa fi Dudhaa Oromooirrattisirritti
and stemmed, we used Tf-idf as feature
hojjechaajira. “OBN is working perfectly on
extraction. Tf-idf (term frequency- inverse
history and culture of Oromo”.
document frequency) is a common feature
In the first sentences, “hin” stop word is extraction method in text mining. We used Tf-
influencing opinion into negative emotion and idfvectorizer from scikit-learn feature extraction
“sirritti” (perfectly) is adverb that comes for tools.
more explaining the negativity of OBN. This
The Methodology of Proposed Approaches
perfectly word is also having another significant
which classify sentiment form normal into very Support Vector Machine
negative or very positive. So for this study, we Support vector machine (SVM) is a type of
filtered removed stop words through a manual supervised machine learning classification
process that is not relevant for the classification algorithm which can be used for both binary
process. label classification and multi-label
Normalization: Many people use different classification. SVM is originally formulated for
character form to write the similar word and binary classification [14]. For example, in the
they can use uppercase, lower case or mix of the paper [13], they used SVM for sentiment

International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020 14


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

analysis into two classes such as positive and machine based on achieved highest
negative class. However, in our case, we have performance.
been proposed multi-class sentiment analysis. Random Forest Classifier
For multi class problem, support vector machine
Random forest is a popular supervised machine
can solve multi classification problem using
learning used for classification and regression.
one-against-one or one-against-rest method [14].
A random forest is an ensemble of tree
In the one-against-one approach, instead of
structured classifiers and every tree of the forest
trying to distinguish one class from all the
gives a unit vote, assigning each input to the
others, they seek to distinguish one class from
most probable class label [8]. Random forest is a
another one. In the one-against-rest binary of
tree based machine learning algorithm that
problem, SVM is trained for each class in order
leverages the power of multiple decision trees
to distinguish that class and the rest.
for making decisions. Random forest is a forest
Support vector machine (SVM) has many of trees. The main advantage of random forest is
classes library that is capable of performing that it does not suffer from over fitting, even if
binary classification and multi classification more trees are appended to the forest [8].
such as: SVC, NuSVC and linearSVC library. Random forest classifier uses many of the
SVC and NuSVC are similar methods and use parameters as Decision tree classifier. However,
kernel parameters. On the other hand, linear being a forest rather than an individual decision
SVC is another implementation of support tree, Random forest classifier has certain
vector classification for the case of a linear parameters that are either unique to random
kernel, but linear SVC does not accept keyword forests or particularly important [15].
kernel. In the multi-class classification process,
SVC and NuSVC implement the “one-against-
EXPERIMENTAL RESULTS
one” approach. If K is the number of classes, In this paper, our experiment was conducted by
then K(K-1)/2 classifiers are constructed and used supervised machine learning methods on
each one trains data from two classes. To OBN twitter annotated dataset. These methods
provide a consistent interface with other are Support vector machine and Random forest
classifiers, the decision function shape option classifier. For Support vector machine, we used
allows transforming the results of the “one- SVC library from scikit-learn and for Random
against-one” classifiers to a decision function of forest, we used Random Forest Classifier
shape (n_sample, K). Linear SVC automatically library from scikit-learn of python libraries by
uses the one-against-all strategy by default. It using Anaconda programming. This SVC
can also specify it explicitly by setting the library is with kernel parameter. The term kernel
multiclass parameter to ovr (one-vs-the-rest). is a support vector machine parameter which is
Theoretically, the term „One-against-one „is an to take data as input and transform it into the
approach used to classify one class from others required form. For user interface build, we used
classes. In this approach, instead of trying to tkinter library from python libraries.For
distinguish one class from all the others, it seeks example, when an input Tweets “Qophiisirna
to distinguish one class from another one. As a Gadaakeessanbaayyeebayyeegaariidha.”/‟Your
result, the train one classifier per pair of classes, Gadaa System program is very very good.‟
which leads to K(K-1)/2 classifiers for K entered and outputted the system was captured
classes. Each classifier is trained on a subset of the following contexts as in figure 2 below.
the data and produces its own decision After an inserted tweet, system was applied
boundary. In this work, we used SVC library. preprocessing steps and the proposed models
We select this library form other support vector were predicted the tweet categories based on
trained dataset.

Figure1. Sample Input Tweet opinion and the Polarity Value

15 International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

The performance of the proposed system is prepared testing dataset that 20% of dataset and
analyzed on the basis of analysis metrics scikit- the left is training dataset. We reported the
learn namely recall, precision and accuracy. evaluated proposed system results according to
These methods were evaluated using the the following captured of python code.
Support Vector Machine Model

Random Forest Model

In our final experiment, we have compared our research were not more than three classes and
study with other papers regarding sentiment they used small dataset. Beside this, our dataset
analysis from Ethiopian language, Afaan Oromo is some more than the other research papers and
text. All of the previous researchers were we analyzed at sentence as well as we used
analyzed sentiment at document level. The multiple classes.
numbers of sentiment classes analyzed in their

International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020 16


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

Table2. Comparative Analysis with other papers for Afaan Oromo text sentiment analysis
Paper Title Features Methods Classes Dataset Accuracy
Aspect based lexicons and rule Rule based 2 (Positiveand ORTO news 88.5% for Pos
summarization of based Negative) service dataset 88.3% for Neg
Afaan Oromo news (400 corpus
text[4] size)
Opinion Mining for lexicons, POS Unsupervised 3 (Positive, Negative OPDO official Good in bigram
Afaan Oromo [5] and N-gram and Neutral ) Facebook page and low in tri-
(600 corpus gram
size)
Sentiment Analysis Word Convolutional 2 (Positive and Facebook page LSTM: 89%
for Afaan Oromo embedding Neural Network Negative) of ODP (1452 CNN: 87.6%
[6] and Long Short corpus size)
Term Memory
Constructing Bag of word Decision Tree 2 (Positive and Afaan Oromo NB: 88%
Sentiment Mining (unigram) and Naïve Negative) text Ethiopian DT: 83%
Model for Bayes. politics (1800
Opinionated Afaan corpus size)
Oromo Texts[11]
Our paper TF-IDF Support vector 5(Very Negative, OBN Twitter SVM: 90%
machine and Negative, Neutral, (1810 corpus RF: 89%
Random forest Positive and Very size)
Positive)
CONCLUSION AND FUTURE WORK The previous researchers were considered on
document level, feature level and our paper
In this paper, we analyzed the performance of focused on sentence level. Therefore, we
supervised machine learning approaches to forward research direction of combination
classify multiple class sentence level sentiment
document and sentence level sentence analysis.
analysis from Ethiopian language, Afaan Oromo
text. These approaches are Support vector REFERENCE
machine (SVM) and Random forest (RF). The [1] Bing Liu, “Sentiment Analysis and Opinion
dataset were used to train and test a sentiment Mining”, Morgan & Claypool Publishers, May
classifier based on SVM and RF uses 2012.
tokenization, stop word removal, normalization [2] Summary and Statistical Report of the 2007
and stemming as preprocessing. Then, we used Housing Census: Population Size by Age and Sex,
Tf-idf (term frequency- inverse document Addis Ababa, 2008.
frequency) as feature extraction as input to [3] Tabor Wami's, new book titled, Yewugena
machine learning algorithms. The experiment Dirsetochena Yetarik Ewunetawoch, 2004.
results that proposed approaches SVM [4] Wegderes Tariku. “Sentiment Mining and Aspect
performed accuracy 90% and RF achieved an Based Summarization of Opinionated Afaan
accuracy of 89% on OBN Twitter dataset with Oromo News Text”, Debre Berhan University,
1810 corpus size. But, due to un-availability of unpublished, 2017.
standard sentiment dataset of Afaan Oromo text [5] Jemal Abate. “Unsupervised Opinion Mining
Approach for Afaan Oromoo Sentiments”,
on social media, still there exist some
Haramaya University, 2019.
limitations. Therefore, in the future work, we [6] Megersa, Oljira. “Sentiment Analysis of Afaan
forward the following points on sentiment Oromoo Facebook Media Using Deep Learning
analysis regarding to Afaan Oromo text. One of Approach”, Institute of Technology, Ambo
the great challenges in sentiment and opinion University, 2020.
mining is having Afaan Oromo standard dataset. [7] Soudamini Hota and Sudhir Pathak. “KNN
So, future research should consider creating classifier based approach for multi-class sentiment
standard dataset within large size that can be analysis of twitter data”, International journal of
used for future research in this area. It is also Engineering and Technology, Chandigarh
improving performance by developing a hybrid University, Mohali, India.
approach mechanism using these two methods [8] Breiman Leo, Random forests- Random
and will also implement others approaches of features,Statistics Department University of
California Berkeley, CA 94720
sentiment analysis. The other research direction
[9] Wondwossen Philemon and Wondwossen
to be considered is going for Aspect based
Mulugeta, "A Machine Learning Approach to
sentiment analysis regarding Afaan Oromo text. Multi-Scale Sentiment Analysis online posts".

17 International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020


Multi-Class Sentiment Analysis from Afaan Oromo Text Based On Supervised Machine Learning Approaches

[10] Rashedul. Amin. et. “An Automated System of [13] Nurulhuda Zainuddin and Ali Selamat. “Sentiment
Sentiment Analysis from Bangla Text using Analysis Using Support Vector Machine”, IEEE
Supervised Learning Techniques”IEEE 4th 2014 International Conference on Computer,
International Conference on Computer and Communication, and Control Technology, 2014
Communication Systems, 2019. [14] Chih-Wei Hsu and Chih-Jen Lin,” A Comparison
[11] Dursa Abduselam. “Constructing sentiment mining of Methods for Multi-class support vector
model for opinionated Afaan Oromo texts on Machines”.
Ethiopian politics” Gondar University, unpublished [15] Chris Albon, Machine learning with python
2017. Cookbook, Published by O‟Reilly Media, Inc.,
[12] Debela Tesfaye and Ermias Abebe Designing a 1005 Gravenstein Highway North, Sebastopol,
rule based Stemmer for Afaan Oromo Text. 2018.

Citation: Negessa Wayessa*, Sadik Abas, “Multi-Class Sentiment Analysis from Afaan Oromo Text Based
On Supervised Machine Learning Approaches” International Journal of Research Studies in Science,
Engineering and Technology, 7(7), 2020, pp. 10-18.
Copyright: © 2020 Negessa Wayessa*, Sadik Abas. This is an open-access article distributed under the
terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and
reproduction in any medium, provided the original author and source are credited.

International Journal of Research Studies in Science, Engineering and Technology V7 ● I7 ● 2020 18

You might also like