
AMBO UNIVERSITY

FACULTY OF ENGINEERING AND TECHNOLOGY

FAKE NEWS DETECTION ON SOCIAL MEDIA BY USING DEEP
LEARNING FOR AFAAN OROMO LANGUAGE

SUBMITTED BY
ABEBE WALDESANBET GINA

A THESIS SUBMITTED TO THE SCHOOL OF GRADUATE STUDIES OF
AMBO UNIVERSITY IN FULFILMENT OF THE REQUIREMENTS FOR
THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE

ADVISOR
SUKHADAVE PRAMOD (PhD)

AMBO
April 2021
Declaration
I, the undersigned, declare that this thesis comprises my own work. In compliance with
internationally accepted practices, I have duly acknowledged and referenced all materials
used in this work. I understand that non-adherence to the principles of academic honesty and
integrity, or misrepresentation/fabrication of any idea/data/fact/source, will constitute sufficient
ground for disciplinary action by the university and can also evoke penal action from the
sources which have not been properly cited or acknowledged.

________________________________

Name of the student.

________________________

Signature.

___________________

Date.

AMBO UNIVERSITY
SCHOOL OF GRADUATE STUDIES
CERTIFICATION SHEET

As thesis research advisor, I hereby certify that I have read and evaluated this thesis, prepared under
my guidance by _______________________, entitled “FAKE NEWS DETECTION ON
SOCIAL MEDIA BY USING DEEP LEARNING FOR AFAAN OROMO LANGUAGE”. I
recommend that it be submitted as fulfilling the thesis requirement.

______________________________________________ ________________ ________________

Name of Major Advisor Signature Date


______________________________________________ ________________ ________________

Name of Co-Advisor Signature Date


As members of the Board of Examiners of the M.Sc./M.A. thesis open defense examination,
we certify that we have read and evaluated the thesis prepared by ABEBE
WALDESANBET GINA and examined the candidate. We recommend that the thesis be
accepted as fulfilling the thesis requirements for the degree of Master of Science/Art in
COMPUTER SCIENCE.

_______________________ _________________ _____________


Chair Person Signature Date
_______________________ _________________ _____________
Internal Examiner Signature Date
_______________________ _________________ _____________
External Examiner Signature Date

Acknowledgment
First and foremost, extraordinary thanks go to my Almighty God (Waaqa Gurraacha).
I would like to express my gratitude and heartfelt thanks to my advisor, Dr. Pramod, for his keen
insight, guidance, and unreserved advice. I am really grateful for his constructive comments and
critical readings of the study.
I am very grateful to the management and staff of Ambo University, especially all Computer
Science staff, for their constant support and appropriate professional comments.
I am immensely indebted to my beloved family, especially Ato Ashenafi Tadessa; I would like to
thank you for your enduring support and guidance throughout my life in every way. I am really
proud to have you.
Last but not least, my sister Tajitu W/sanbet: you took on the role of a mother and have a
special place in my heart for giving me unconditional care, love, time, patience, and support
throughout my life. Thank you so much (Galatoomi!)
My special thanks also go to all my friends and family whom I did not mention by name, and to all
my classmates for the encouragement, support, and good friendship we shared during class.

Abbreviations
RNN: Recurrent Neural Networks

LSTM: Long Short-Term Memory

Bi-LSTM: Bidirectional Long Short-Term Memory

CNN: Convolutional Neural Networks

DNN: Deep Neural Networks

CBOW: Continuous Bag of Words

TF: Term Frequency

NN: Neural Network

IDF: Inverse Document Frequency

SWOT: Strength, Weakness, Opportunity, Threat

ORORFN: Oromoo Real and Fake News

NLTK: Natural Language Toolkit

TF-IDF: Term Frequency–Inverse Document Frequency

Contents
Declaration .................................................................................................................................................... ii
Acknowledgment ......................................................................................................................................... iv
Abbreviations ................................................................................................................................................ v
Abstract ........................................................................................................................................................ xi
CHAPTER 1 ................................................................................................................................................ 1
1. INTRODUCTION............................................................................................................................... 1
1.1 Background of the Study ............................................................................................ 1

1.2 Statement of the problem and justification of the study ............................................................... 2

1.3 Research question ........................................................................................................ 3

1.4 Objective of the study ................................................................................................................... 3

1.5 Motivation ..................................................................................................................................... 4

1.6 Significance of the study ............................................................................................................... 4

1.7 Scope and limitation of the study. ................................................................................................. 5

1.8 Research methodology. ................................................................................................................. 5

1.8.1 Literature review ........................................................................................................................... 5

1.8.2 Data collection .............................................................................................................................. 5

1.8.3 Preprocessing ................................................................................................................................ 6

1.8.4 Word embedding vector representation ...................................................................... 6

1.8.5 SWOT analysis ............................................................................................................. 7

1.9 Organization of the thesis ............................................................................................................. 8

CHAPTER 2 ................................................................................................................................................ 9
2. REVIEW ON FAKE NEWS DETECTION ..................................................................................... 9
2.1 Introduction ................................................................................................................................... 9

2.2 Contributors of fake news ............................................................................................................. 9

2.3 Basic concept of fake news detection in deep learning models .................................................. 10

2.3.1 Neural network............................................................................................................................ 10

2.3.3 Neural network for multi-label classification ............................................................................. 14

2.3.4 Squared error function. ............................................................................................................... 15

2.4 Recurrent neural networks (RNN) .................................................................................................. 15
2.5 Long short-term memory networks (LSTMS) ................................................................................ 17
2.6 Bi-directional long short-term memory. ......................................................................................... 18
2.7 Related works on fake news detection ........................................................................................ 19

2.8 Approaches of fake news detection ............................................................................................ 20

2.8.1 Content based approach ...................................................................................................... 20


2.8.2 Propagation based approach................................................................................................ 21
2.8.3 Linguistic approach ............................................................................................................. 21
2.9 Related work on local language .............................................................................................. 22
2.10 Afaan Oromoo language ......................................................................................................... 22
2.10.1 Afaan Oromo Qubee and writing system .................................................................................... 22

2.11 News writing structure on social media .................................................................................. 26


CHAPTER 3 .............................................................................................................................................. 27
3 MATERIALS AND METHODS ..................................................................................................... 27
3.1 Data acquisition, fake news detection in Afaan Oromo .............................................................. 27

3.2 Data preprocessing ...................................................................................................................... 30

3.2.1 Tokenization and padding ................................................................................................... 31


3.2.2 Stemming ............................................................................................................................ 31
3.2.3 Token Embedding ............................................................................................................... 32
3.2.4 Sequence Creation............................................................................................................... 32
3.3 Data visualization........................................................................................................................ 34

3.4 SYSTEM ARCHITECTURE ..................................................................................................... 36

CHAPTER 4 .............................................................................................................................................. 40
4 EXPERIMENT AND RESULTS ..................................................................................................... 40
4.1 Tools used ................................................................................................................................... 40

4.2 Data set creation .......................................................................................................................... 40

4.2.1 Dataset 1 (Real news dataset) ............................................................................................. 40


4.2.2 Dataset 2 (Fake news dataset) ............................................................................................. 41
4.3 Data pre‑processing using nltk and tokenizer ............................................................................ 42

4.4 Word Embedding ........................................................................................................................ 42

4.5 Sequential Model ........................................................................................................................ 43

4.6 Experimental evaluation and results ........................................................................................... 46

4.7 Discussion ................................................................................................................................... 49

CHAPTER 5 .............................................................................................................................................. 51
5. CONCLUSION AND RECOMMENDATION .............................................................................. 51
5.1 Conclusion .................................................................................................................................. 51

5.2 Recommendation and future work .............................................................................................. 52

References .................................................................................................................................................. 53
Appendix ..................................................................................................................................................... 59

List of Tables
Table 3-1 Examples of statements and side information in the dataset. ....................................... 30
Table 4-1 Dataset specification. .................................................................................................... 40
Table 4-2 Specification of News Dataset 1................................................................................... 41
Table 4-3 Specification of News Dataset 2................................................................................... 41
Table 4-4 Concatenated dataset specification. .............................................................................. 42
Table 4-5 Bi-LSTM Confusion matrix model. ............................................................................. 47
Table 4-6 Experimental results ..................................................................................................... 48
Table 4-7 LSTM model details. .................................................................................................... 48

List of figures
Figure 1-1 SWOT Analysis ............................................................................................................ 8
Figure 2-1 Biological Neural Network. ......................................................................... 11
Figure 2-2 Simple Artificial Neuron. ............................................................................................ 11
Figure 2-3 Neuron model with logistic activation function. .......................................... 13
Figure 2-4 Neural networks with 2 hidden layers for 3-label classification. ................................ 15
Figure 3-1 Octoparse implementation. .......................................................................... 28
Figure 3-2 Partial sequence diagram for implemented scraper. ................................................... 29
Figure 3-3 The basic seq2seq model. ............................................................................. 33
Figure 3-5 Word cloud for text that is real. .................................................................. 35
Figure 3-6 Word cloud for text that is fake. ................................................................... 35
Figure 3-7 Architecture of Fake news detection based on Bi-directional LSTM-recurrent neural
network. ........................................................................................................................................ 36
Figure 3-8 Architecture of Fake news Detector using Flask. ....................................................... 39
Figure 3-9 Serialization and de-serialization. ............................................................................... 38
Figure 4-1 Keras summary of RNN model. .................................................................................. 43
Figure 4-2 General-architecture-of-Bi-directional-LSTM-RNN. ................................................. 44
Figure 4-3 Number of epochs applied for training. ...................................................................... 45
Figure 4-4 Bi-LSTM Classification report. .................................................................................. 48
Figure 4-5 System Interface for Afaan Oromo Fake News Detection. ......................................... 50

Abstract
In recent years, owing to the booming development of the internet, social media has facilitated the
creation and sharing of information through computer-mediated technologies. This development has
changed the way groups of people communicate and interact. Nowadays, the majority of people
search for and consume news from social media rather than from traditional news organizations.
While social media has become a powerful source of information and brings people together,
identifying inaccurate news on it is a difficult problem. Without concern for the credibility of the
information, unverified or fake news spreads through social networks, reaches thousands of users,
and gets republished, which can lead followers to negative effects or even the manipulation of
public or private events. One of the distinctive challenges of detecting fake news on social media is
how to identify fake news about recent events. Most previous works proposed machine learning
models for fake news detection and classification on English text. However, the Afaan Oromo
language, whose semantic analysis is difficult due to morpheme ambiguity and the lack of adequate
datasets, has not received much attention. As a result, Afaan Oromo fake news detection is
essential to maintain robust online media and social networks. We worked to address social media
fake news for the Afaan Oromoo language by implementing deep learning models and classifying
news into pre-defined fine-grained categories. The thesis presents a fake news detection model
based on a Bi-directional LSTM recurrent neural network; the representations obtained from this
model are fed into a Multi-Layer Perceptron (MLP) for the final classification. Evaluated on a
benchmark dataset, the model achieves an F1 score of 90%, which outperforms the current state
of the art.

The models were trained and evaluated on the Afaan Oromo fake news dataset scraped
from Twitter and Facebook. Finally, the Python Spyder IDE was used for the web-based prototype
development of the trained Bi-LSTM models.

Key Words: Fake News detection; Deep learning; Bi-directional LSTM; Afaan Oromoo

CHAPTER 1

1. INTRODUCTION
1.1 Background of the Study
In the earliest times, long before the advent of computers and the web, fake news (also known as
deceptive news) was communicated through the oral tradition, concerning events or issues of
public or private interest, in the form of rumors (face to face), either to innocently talk about other
people's lives or to intentionally harm the reputation of other people or rival companies. Nowadays,
people want low-cost, easy access to information and rapid dissemination, which pushes them
to search for news and know what is happening at the beginning of events. In recent years, social
media has come to play a crucial role over traditional means of news transmission, and online
content has played a vital role in shaping user decisions and opinions (Ahmed et al., 2017)(Gereme
& Zhu, 2019). Due to the boom of social networks, fake news for commercial, political, and personal
interest has become widespread in the online world; users can be infected by this social media fake
news easily, which has already brought considerable effects on offline society (J. Zhang et al.,
2020).
Nowadays, due to various political and economic events, huge amounts of information are
generated on social networks in various social media formats (Conroy et al., 2015).
When an event occurs, many people discuss it on the web through different
social media. Consequently, it is possible that fake news or misinformation is generated and
propagates in a chain throughout social media, intentionally or not, to mislead other users
for the purpose of deception, grabbing attention, or even financial and political gain (LIAO
& LIN, 2018)(Cardoso Durier da Silva et al., 2019).
Fake news is a sensitive message whose content claims people should believe falsifications;
once received, it is rapidly dispersed in chains through today's digital world to other people.
The dissemination of fake news through the internet has confused truth with falsehood by taking
advantage of social media content to mislead readers and get republished, which can lead to negative
effects or even manipulation of public events. Reports indicate that the human ability to detect
deception without special assistance is only 54% (Girgis et al., 2019)(Conroy et al., 2015).

Fake news detection is a challenging task, as it requires models to summarize the
news text and compare it to the actual news in order to classify it as fake or real. However,
technologies such as Artificial Intelligence (AI) and Natural Language Processing (NLP) tools
offer great potential for building systems that can automatically detect and classify fake
news. Moreover, the task of comparing proposed news with the original news itself is daunting,
as it is highly subjective and opinionated (Thota et al., 2018a).
There are different approaches to detecting fake news; stance detection is one of them and
is the focus of our study. Stance detection is the process of detecting the relationship
between two pieces of text, one fake and one real. In this study, we explore ways to predict the
stance by calculating the semantic similarity between two pieces of text using deep learning
models.
Through experimental procedures, we used a pre-trained model which can detect fake news by
accurately predicting the stance between news articles. We also studied how different
hyperparameters affect model performance and summarized the details for future work.

1.2 Statement of the problem and justification of the study


According to the literature, fake news means false news: lies, deception, cheating, illusion,
misleading content, dummies, simulation, fabrication, manipulation, and propaganda fabricated
intentionally (or not) to mislead readers and get published, which can lead to negative effects or even
manipulation of public events (Kiros et al., 2018)(Girgis et al., 2019). Due to its low cost, easy
access, and rapid dissemination, the traditional way of human communication has changed into a new
digital form, and textual information propagates through social media like Twitter and Facebook,
which are the main news sources for millions of people around the globe. A recent
publication (Kiros et al., 2018) reviewed and analyzed the unrest in
Ethiopia in autumn 2016; according to the empirical studies and observations of the author, social
media fake news contributed impacts on the economy, peace, and development of the country.
In Africa, the Afaan Oromo language is one of the major languages; it is widely spoken and used in
the most dominant parts of Ethiopia and in neighboring countries like Kenya and Somalia (Jimalo,
Babu P, et al., 2017)(Tesfaye, 2010b). Its use has been increasing from time to time since Afaan
Oromoo became an official language of the Oromia regional state, and the way of human
communication has changed into a new digital form, with textual information propagating through
social media. Currently, enormous numbers of news releases in the language reach readers from
many social media sources. There are a number of media agencies that produce Afaan Oromo news
articles in social media format. Some of these are: VOA Afaan Oromo, BBC Afaan Oromo, the
Oromia Communication Bureau, the Oromia Broadcast Service, etc.
Millions of news articles are circulated every day on social media in Afaan Oromo text. As
a result, how can one trust which is real or fake? The news disseminated
on social media platforms may be of low quality, carrying misleading information, and comes at
the cost of dubious trustworthiness and a significant risk of exposure to fake news (Bahad et al.,
2020); Afaan Oromo text readers are not exempt from this problem. It is difficult to accurately
distinguish true from false information by just looking at these short pieces of information.
As a consequence of the above, fake news has become one of the major concerns because of its
potential danger to modern society, and there is no suitable hand-engineered feature model created
in the area that achieves state-of-the-art results in identifying the fakeness of such statements.

1.3 Research question


Social media fake news detection is currently most concerned with the English language and is
missing African languages such as Afaan Oromoo. In accordance with the research problem
identified in Section 1.2, this thesis investigates how a deep learning model can be used
to detect and classify fake news. Foremost, the thesis presents a fake news detection model
based on a Bi-directional LSTM recurrent neural network and attempts to answer the following
question: How can a deep learning model be used for fake news detection and classification
for Afaan Oromo news text?

The results of the research will contribute to existing research on fake news detection using
machine learning and attempt to answer the research question.

1.4 Objective of the study


The general objective of the study is to build fake news detection on social media by using deep
learning techniques for Afaan Oromo news text.

The specific objectives of the study are the following:

To review related research works in the area of fake news detection and its different approaches.

To introduce the topic of fake news and deep learning algorithms that would be effective
in classifying fake news.

To scrape news documents from social media for training and testing the model.

To develop a methodology to achieve the objective of the research and evaluate the results
of the pre-trained model by using evaluation metrics.

To develop a system prototype for Afaan Oromo language fake news detection.

To present a possible solution and draw conclusions based on the experimental results, as well
as lay some groundwork for further study in this area.

1.5 Motivation
In the case of our country, Ethiopia, fake news has the potential for extremely negative impacts on
individuals and society and has led to many problems. The aforementioned facts and figures show
that Afaan Oromo is a widely spoken language in the Horn of Africa. Given that
this language is used in schools, offices, and social media, there is a huge amount of data
available that encourages studies related to deep learning tasks associated with the language. The
motivation for research on this topic was that this is a relatively new area of research with many
opinions but not many concrete solutions.

The development of deep learning applications for this language is required to keep up with
current technology in order to bring awareness, propose a solution, and work toward minimizing the
effects of fake news.

1.6 Significance of the study


This thesis can serve as an input to the development of fake news detection in Afaan Oromo and
has the potential to initiate further research in the area of fake news for the Afaan Oromo language.
Moreover, this study can also help to initiate research in other Ethiopian languages for
combating social media fake news.

1.7 Scope and limitation of the study.
This thesis focuses on fake news detection for Afaan Oromo news articles using supervised
learning. Therefore, the experimentation has dealt with Afaan Oromo news texts only. On
the other hand, the absence of a standard text corpus for the Afaan Oromo language was a
limitation, and the amount of data prepared for this study is relatively small, requiring
further enhancement of its size for further experimentation and evaluation.

1.8 Research methodology.


1.8.1 Literature review
In order to establish the scientific facts, identify the research problem, and achieve the objectives
stated in Section 1.4, literature was primarily reviewed on:

Deep learning models and their applications.

Current and past practices of fake news detection and classification for different
languages.

Approaches to fake news detection.

1.8.2 Data collection


In order to create an artifact capable of detecting fake news, a vast amount of Afaan Oromo text
articles for fake and real news is required. Up to the time of identifying the research problem, there
was no comprehensive dataset gathered at the magnitude needed to train a deep learning model.
Thus, Afaan Oromo news datasets were collected using the Octoparse and Facepager web-page
scraping tools, from social media sources that have relatively large numbers of followers and are
authorized by the government, in addition to company verification. However, because different
social media accounts and pages can post news in different languages, we focused on the
BBC and VOA Afaan Oromoo Twitter pages.
Precisely, we used the Octoparse tool to scrape the real news from the official Twitter pages of
VOA and BBC Afaan Oromo, while Facepager was used to scrape fake news from fake
accounts that Facebook has been working to stamp out; these remained manually edited,
because such accounts and pages can post news in different languages and can easily
mislead or misinform the broader society.

In this study we construct Afaan Oromo real and fake news datasets from Twitter and Facebook,
respectively, consisting of 4,500 real news items and 2,500 fake news items. We name this
dataset ORORF2020 in this study. The amount of data prepared for the study is relatively larger than
the news dataset prepared for the Indonesian language (Zaman et al., 2020). However, it is smaller
than the datasets prepared for the English language and requires more enhancement for further
experiments.

1.8.3 Preprocessing
Once the dataset is prepared: real-world data tend to be incomplete, noisy, and inconsistent. This can
lead to reduced quality of the collected data and, further, to low quality of models built on such
data. In order to address these issues, the data require special preprocessing before implementing deep
learning algorithms on them. We mainly focused on two issues: firstly, the data must be organized
in a proper form for deep learning algorithms, and secondly, the datasets used must lead to the best
performance and quality for the models. To reduce the size of the actual data, generic refinements
such as stop-word removal, tokenization, lower casing, and stemming were applied to remove the
irrelevant information that exists in the ORORF2020 data. We also provide insights into the different
word vector representations we used as part of our analysis.
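
To make these steps concrete, the following is a minimal preprocessing sketch in Python; the stop-word list and the suffix-stripping rules are illustrative placeholders, not the actual resources used for the ORORF2020 data.

```python
import re

# Hypothetical Afaan Oromo stop words, for illustration only.
AO_STOPWORDS = {"fi", "kan", "akka"}

def preprocess(text):
    text = text.lower()                                    # lower casing
    tokens = re.findall(r"[a-z']+", text)                  # simple tokenization
    tokens = [t for t in tokens if t not in AO_STOPWORDS]  # stop-word removal
    # Naive suffix-stripping "stemmer" (illustrative, not a real AO stemmer).
    return [re.sub(r"(oota|icha|aa)$", "", t) for t in tokens]

print(preprocess("Oduu sobaa fi oduu dhugaa"))  # ['oduu', 'sob', 'oduu', 'dhug']
```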

1.8.4 Word embedding vector representation


After the dataset is preprocessed, the remaining problem is that neural networks for these tasks do
not operate directly on texts, sentences, or words, which makes text analytics challenging; they
operate on representations in numerical form. So far we have seen deterministic methods to determine
word-to-vector representations of a text in n-dimensional space. Word-to-vector is not a single
algorithm but a combination of techniques: we use a one-hot encoder to map categorical
integers and apply the context of word embedding by training it with the Bi-directional LSTM, and
cosine similarity measures are passed as input features to the neural network. Once the classifier was
trained, a threshold was applied to the output score to determine whether an item is considered
true or fake, and for statistical analysis a confusion matrix was used to compare across varied
thresholds.
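
As a hedged illustration of this step, the sketch below maps words to integer ids and then to dense vectors with Keras; the vocabulary size, sequence length, and embedding dimension are illustrative values, not the settings used in our experiments.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Embedding

texts = ["oduu dhugaa", "oduu sobaa"]                  # toy corpus
tokenizer = Tokenizer(num_words=5000)                  # word -> integer id
tokenizer.fit_on_texts(texts)
padded = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=20)
embedding = Embedding(input_dim=5000, output_dim=100)  # trainable word vectors
print(embedding(padded).shape)                         # (2, 20, 100)
```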

1.8.5 SWOT analysis
In the research on fake news detection for the Afaan Oromoo language with a deep learning
approach, SWOT analysis is a useful technique for understanding the strengths and weaknesses of
the work, in addition to identifying both the opportunities open to our work and the threats for the
future.

We used Bi-LSTM, a variant of the recurrent neural network designed for long-term
dependencies. The idea that makes it different and unique from other neural networks is
that it is able to remember information over a long span of time without learning it again and again,
making the whole process simpler and faster; it includes an inbuilt memory for storing
information. There is no special training step or unit added; the idea is just to read a sentence
forward and backward to capture more information, and it has been shown that Bi-LSTM
representations are more robust than representations learned by other models. Despite recent
concerns and controversies, YouTube is the leading social network of the world next to Facebook.
However, we are limited to Afaan Oromoo news texts, and the news texts we collected are not
sufficient when compared to the research for the English language. Figure 1.1 shows a summary of
the SWOT analysis of our work.

Strengths:
 We prepared around 7,000 Afaan Oromoo news texts and trained the model; the
implementation of the model is simple and effective.
 Rather than spending hours of human power checking the validity of news, a deep
learning model is configured to manage the problem of fake news in Afaan Oromoo
news text.

Weaknesses:
 Our dataset is not sufficiently large, because the model requires a large amount of
data for training.
 We did not consider unlabeled data.

Opportunities:
 Fake news detection is among the hottest research areas in the deep learning
approach, so combining it with newer forms of technology like IoT (Internet of
Things) would allow us to automate and create chances for greater benefits in the
environment.
 It can be applied to unlabeled data.

Threats:
 When combined with new technology and automation, the complexity of the
algorithm may rise when it comes to implementation.

Figure 1-1 SWOT Analysis

1.9 Organization of the thesis

This thesis report is organized into five chapters. The first chapter covers the motivation
behind conducting the research and discusses the background of the study, the statement of the
problem, the objectives, the methodology, and the scope of the study.
The remaining chapters of the thesis are structured as follows:
Chapter 2 gives an overview of the literature underlying this research on fake news detection
and summarizes the main approaches proposed to address this problem. Chapter 3
describes in detail the materials and methods used to detect fake news.
Chapter 4 presents and discusses in detail the findings and results of the various experiments
carried out to evaluate the proposed detection methods and the corresponding datasets used. Chapter
5 makes concluding remarks, discussing the overall results of the research in the context of
the related work. In addition, it suggests possible improvements and recommendations for future
work.

CHAPTER 2

2. REVIEW ON FAKE NEWS DETECTION


2.1 Introduction
Fake news is defined as falsehoods formatted and circulated in such a way as to make them appear
authentic and legitimate to readers (Talwar et al., 2019). With the advent of Facebook, Twitter,
and other social media, fake news can take advantage of multimedia content to mislead readers
and get published, which can lead to negative effects or even manipulation of public
events (Chaudhry, A. K., Baker, D. & Thun-Hohenstein, 2017). While many social media accounts
are authentic, those that are malicious and out to spread lies may or may not be real people (Zhao et
al., 2020).
Fake news detection has been a hot topic in the past few years; several studies focus uniquely on the
text of the news for insight into the procedure of detecting fake news and its implementation (Kwon
et al., 2013)(Fang et al., 2019)(Brien et al., 2018). There is today a great deal of
controversy over digital and social media that facilitate the sharing of information or ideas via virtual
communities and networks. According to (Kunapareddy et al., 2019), phony news spread through
social media is classified as satire or parody, misleading news, sloppy reporting, and intentionally
deceptive news.

2.2 Contributors of fake news


While the presence of fake news is not new, the internet and social media have changed the ways
it is created and spread. In order to study fake news on social media, it is crucial to first
consider previous and current classifications of fake news contributors.

According to (Stahl, 2019), fake news contributors are classified into social bots, trolls, and cyborg
users. If a social media account is controlled by a computer algorithm, then it is referred to
as a social bot. However, fake humans are not the only contributors to the dissemination of false
information; real humans are very much active in the domain of fake news. As implied, trolls are
real humans who "aim to disrupt online communities" in hopes of provoking social media users
into an emotional response. While contributors of fake news can be either real or fake, what
happens when it is a blend of both? Cyborg users are a combination of automated activities and
human input. The accounts are typically registered by real humans as a cover, but use programs to
perform activities on social media. Acknowledging the impact of fake news, researchers have tried
different methodologies to find a quick and automatic solution for detecting fake news in recent
years (Brien et al., 2018).

2.3 Basic concept of fake news detection in deep learning models


2.3.1 Neural network
2.3.2 Inspiration

Neural networks are inspired by the way the human brain works. A human brain can process
huge amounts of information using data sent by the human senses (especially vision). The
processing is done by neurons, which work on electrical signals passing through them,
applying flip-flop logic like the opening and closing of gates for a signal to transmit through (Kong
et al., 2020).

In a biological setting (Figure 2.1) (Trung Tin, 2018), a neuron receives a signal from its tree of
dendrites, or dendritic tree, and if the signal is strong enough, it passes through an axon and links
to a dendrite of another neuron. Two neurons are actually separated from each other by synaptic
gaps and only become connected when the link of an axon from one neuron and a dendrite from the
other is stimulated.

Figure 2-1 Biological Neural Network.

Figure 2-2 Simple Artificial Neuron.


Figure 2.2 shows the general model: given the binary input x1 ∈ {0, 1}, it is multiplied by W1. This
part models the synaptic connection between two neurons, where W1 corresponds to the degree of
connection: it is larger if the connection is strong, and smaller otherwise (Kong et al., 2020)(Leea
& Song, 2020). In other words, it reflects the influence of the synaptic connection on the decision
of whether or not the axon is stimulated. Similarly, x2, x3, ..., xn are multiplied
by W2, W3, ..., Wn, respectively. All of the products are then summed into one unit to depict the
collective influence of those inputs. But is the input strong enough to make the neuron
fire? To model this, we take the summation over all input neurons and pass the result through an
activation function. If the output of the activation function is greater than 0, the axon is stimulated.
Figure 2.3 describes a neuron model with a logistic activation function. In this case, the activation
h_W(x) is computed as:

h_W(x) = g(W^T x)                                                (2.1)

where g(z) is the activation function; the logistic function is used in this example:

g(z) = 1 / (1 + e^(-z))                                          (2.2)
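
A tiny numeric sketch of equations (2.1) and (2.2), assuming NumPy; the weights and inputs are made-up illustrative values.

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))  # logistic activation, eq. (2.2)

W = np.array([0.5, -1.0, 2.0])       # synaptic weights (illustrative)
x = np.array([1.0, 0.0, 1.0])        # binary inputs
print(g(W @ x))                      # eq. (2.1): h_W(x) = g(W^T x) ≈ 0.924
```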

In the same fashion, multiple connections are modeled by multiple layers with different sets of
weights. Suppose that we have a neural network with three layers, as described in (Kresnakova et
al., 2019) and shown in Figure 2.3; the activations of the hidden layer (layer 2) are computed as:

a_0^(2) = g(W_00^(1) x_0 + W_01^(1) x_1 + W_02^(1) x_2 + W_03^(1) x_3)    (2.3)

a_1^(2) = g(W_10^(1) x_0 + W_11^(1) x_1 + W_12^(1) x_2 + W_13^(1) x_3)    (2.4)

a_2^(2) = g(W_20^(1) x_0 + W_21^(1) x_1 + W_22^(1) x_2 + W_23^(1) x_3)    (2.5)

a_3^(2) = g(W_30^(1) x_0 + W_31^(1) x_1 + W_32^(1) x_2 + W_33^(1) x_3)    (2.6)

Figure 2-3 Neuron model with logistic activation function. Note that the biases x_0 and a_0^(2) are
omitted in this figure.

In the deep learning literature, equations (2.3)–(2.6) are written in matrix form.

Firstly, the weight matrix representing the connections between layer 1 and layer 2 is written as:

W^(1) = | W_00^(1)  W_01^(1)  W_02^(1)  W_03^(1) |
        | W_10^(1)  W_11^(1)  W_12^(1)  W_13^(1) |                (2.7)
        | W_20^(1)  W_21^(1)  W_22^(1)  W_23^(1) |
        | W_30^(1)  W_31^(1)  W_32^(1)  W_33^(1) |

Then,

z^(2) = W^(1) x                                                   (2.8)

a^(2) = [a_0^(2), a_1^(2), a_2^(2), a_3^(2)]^T = g(z^(2))         (2.9)

Finally,

z^(3) = W^(2) a^(2)                                               (2.10)

The output can be calculated by applying the activation function over the net input:

h_W(x) = a^(3) = g(z^(3))                                         (2.11)

where h_W(x) is the output, a^(3) is the output-layer activation, and z^(3) is the calculated net input.
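
The matrix form lends itself to a direct implementation; the following NumPy sketch of the forward pass in equations (2.7)–(2.11) uses illustrative layer sizes and random weights.

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))   # logistic activation

rng = np.random.default_rng(0)
x = np.array([1.0, 0.2, 0.7, 0.1])    # input vector (x_0 = 1 acts as a bias)
W1 = rng.normal(size=(4, 4))          # layer-1 weight matrix, eq. (2.7)
W2 = rng.normal(size=(3, 4))          # layer-2 weight matrix

z2 = W1 @ x                           # eq. (2.8)
a2 = g(z2)                            # eq. (2.9)
z3 = W2 @ a2                          # eq. (2.10)
print(g(z3))                          # eq. (2.11): network output h_W(x)
```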

2.3.3 Neural network for multi-label classification


Assume that we have to perform a 3-label classification task; the neural network in Figure 2.4 is a
possible solution to the problem. The output vector h_W is a 3-dimensional one-hot vector.

Figure 2-4 Neural networks with 2 hidden layers for 3-label classification.
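
A hedged Keras sketch of a network like the one in Figure 2.4; the hidden-layer sizes and input dimension are illustrative, not taken from the thesis.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(8, activation="sigmoid", input_shape=(4,)),  # hidden layer 1
    Dense(8, activation="sigmoid"),                    # hidden layer 2
    Dense(3, activation="softmax"),                    # 3-dimensional output
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```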

2.3.4 Squared error function.


The loss function denotes the difference between the predicted output ŷ of the model and the ground
truth y. A naive approach takes the absolute difference between them (the L1 norm):

L = |y − ŷ|                                                       (2.12)
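
As a small numeric check of equation (2.12), assuming NumPy and made-up prediction values:

```python
import numpy as np

y = np.array([1.0, 0.0, 0.0])      # ground truth (one-hot)
y_hat = np.array([0.8, 0.1, 0.1])  # model prediction
print(np.abs(y - y_hat).sum())     # L = |y - y_hat|, eq. (2.12): 0.4
```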

2.4 Recurrent neural networks (RNN)


The RNN is a subclass of neural networks able to handle variable-length sequence input by
comprising a recurrent hidden layer whose activation at each time step depends on that of the
previous time step (Kresnakova et al., 2019)(Vo & Lee, 2019)(Bahad et al., 2020). The same
subnetwork (also called a cell) is repeated multiple times to read different inputs; this repetitive
structure is illustrated in Figure 2.5.

Figure 2-5 An unrolled recurrent neural network.
Given the input x_t and the hidden state of the previous step h_{t−1}, the new hidden state and the
output at time step t are computed as:

h_t = σ_h(W_h x_t + U_h h_{t−1} + b_h)                            (2.13)

y_t = σ_y(W_y h_t + b_y)                                          (2.14)

Where:
 x_t is the input vector at time step t, h_t is the hidden layer vector, and y_t is the output
vector at time step t.
 W, U, b are parameter matrices and vectors.
 σ_h, σ_y are activation functions.

Recurrent neural networks are particularly designed to deal with sequential data, where
inputs are not fed into the network all at once but are broken down into small pieces
that are passed into the network cell one after another. Despite being designed
to mimic and work on the sequential nature of some kinds of data, it has been shown that
RNNs have limitations in capturing long dependencies (Kresnakova et al., 2019). As a
result, the Long Short-Term Memory network, a modified version of the RNN with gating
mechanisms, was devised to overcome the vanishing gradient problem in the
layers of deep neural networks.
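
A minimal NumPy sketch of one recurrent step, equations (2.13)–(2.14); the dimensions, weights, and activation choices are illustrative only.

```python
import numpy as np

def step(x_t, h_prev, Wh, Uh, bh, Wy, by):
    h_t = np.tanh(Wh @ x_t + Uh @ h_prev + bh)  # eq. (2.13)
    y_t = 1 / (1 + np.exp(-(Wy @ h_t + by)))    # eq. (2.14)
    return h_t, y_t

rng = np.random.default_rng(1)
Wh, Uh, bh = rng.normal(size=(3, 2)), rng.normal(size=(3, 3)), np.zeros(3)
Wy, by = rng.normal(size=(1, 3)), np.zeros(1)

h = np.zeros(3)                                 # initial hidden state
for x_t in [np.array([1.0, 0.0]), np.array([0.5, 1.0])]:  # toy sequence
    h, y = step(x_t, h, Wh, Uh, bh, Wy, by)
print(y)                                        # output after the last step
```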

Figure 2-6 Internal structure of Long Short-Term Memory networks.

2.5 Long short-term memory networks (LSTMS)


Long Short-Term Memory (LSTM) networks are a type of recurrent neural network capable of
learning order dependence in sequence prediction problems, and they are a very effective solution for
addressing the vanishing gradient problem (Hochreiter & Schmidhuber, 1997)(Bahad et al., 2020).
In an LSTM-RNN, the hidden layer of the basic RNN is replaced by an LSTM cell, as in Figure 2.7.

Figure 2-7 Structure of LSTM cell.

2.6 Bi-directional long short-term memory.
Long short-term memory (LSTM) is a structure that learns how much of the previous network
state to apply when input data is received; it preserves the error that can be back-propagated through
time and into the lower layers of a deep network. A Bi-directional LSTM is a sequence processing
model that consists of two LSTMs: one taking the input in a forward direction and the other in a
backward direction (Yulita et al., 2017)(Bahad et al., 2020). As Figure 2.8 shows, a Bi-directional
LSTM network steps through the input sequence in both directions at the same time. It resolves the
long-term dependency problem of the conventional recurrent neural network (RNN) using both the
hidden state and the cell state, which is a memory for storing past input information, together with
the gates that regulate the ability to remove or add information to the cell state. The multiplicative
gates and memory are defined for time t (Kong et al., 2020):

Figure 2-8 Architecture of Bi-directional LSTM.


f_t = σ(W_f · [h_{t−1}, x_t] + b_f)                               (2.15)

i_t = σ(W_i · [h_{t−1}, x_t] + b_i)                               (2.16)

o_t = σ(W_o · [h_{t−1}, x_t] + b_o)                               (2.17)

C_t = f_t * C_{t−1} + i_t * tanh(W_c · [h_{t−1}, x_t] + b_c)      (2.18)

h_t = o_t * tanh(C_t)                                             (2.19)

where σ(·) is the sigmoid function, and f_t, i_t, o_t, C_t, and h_t are the vectors of
the forget gate, input gate, output gate, memory cell, and hidden state, respectively. All of the
vectors are the same size.
Moreover, W_f, W_i, W_o, and W_c denote the weight matrices of the gates, and b_f, b_i, b_o,
and b_c denote their bias vectors. Another shortcoming
of the conventional RNN is that it is only able to make use of previous context (Kong et al., 2020).
To resolve this, the bidirectional RNN (Bi-RNN) stacks two RNN layers: if the existing RNN is the
forward RNN that only passes previous information forward, the Bi-RNN stacks a backward RNN
that can receive subsequent information, as shown in Figure 2.8. Combining the Bi-RNN with the
LSTM gives the Bidirectional LSTM (Bi-LSTM), which can handle long-range context in both input
directions (Kong et al., 2020).
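
To make the architecture concrete, below is a hedged Keras sketch of a Bi-LSTM text classifier of the kind described in this section; the vocabulary size, sequence length, and layer sizes are illustrative, not the hyperparameters used in our experiments.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential([
    Embedding(input_dim=5000, output_dim=100),  # token ids -> dense vectors
    Bidirectional(LSTM(64)),                    # forward + backward LSTM
    Dense(16, activation="relu"),               # small MLP head
    Dense(1, activation="sigmoid"),             # real vs. fake probability
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.build(input_shape=(None, 200))            # 200-token padded sequences
model.summary()
```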

2.7 Related works on fake news detection


Due to the rapid development of the internet, social media and online news are gaining popularity.
Meanwhile, fake news is typically generated and becomes widespread to mislead readers.
Hence, to overcome the problem of online social media fake news, research in the field
has been intense in recent years. Many authors propose the use of text mining and
machine learning techniques to analyze news textual data and predict news credibility.
(Ahmed et al., 2017) propose a fake news detection model that uses n-gram analysis and machine
learning models. Other works like (Bajaj, 2017) studied the problem of fake news to build a
classifier that can predict whether a piece of news is fake or real based on the content of the
news, comparing the results from multiple different models and using pre-trained 300-dimensional
GloVe embeddings. Further along this line, to fill the gap of binary classification, (Thota et al.,
2018) present a neural network architecture to predict the stance between a given pair of headline and
article body; it has more computational capability to handle massive datasets, outperforms
existing model architectures, and achieves a better F1 score. Accordingly, deep learning models
present finer performance than machine learning techniques.
(Castillo et al., 2011) and (Ahmed et al., 2017) took advantage of feature-based methods to assess
the credibility of tweets on Twitter and achieved a certain success; on the other hand, those studies
relied heavily on feature engineering, which is expensive and time-consuming. Consequently,
more recent endeavors using deep neural networks were performed to get rid of the need for feature
engineering. (Ma et al., 2016) modeled streams of tweets as sequential data, then used a Recurrent
Neural Network (RNN) to predict whether the streams were fake or not. This approach was
proven to yield better results than previous feature-based learning and to be effective at early rumor
detection. The researchers outperformed existing models and achieved better accuracy.

2.8 Approaches of fake news detection

Figure 2-9 Approaches of fake news detection

2.8.1 Content based approach


(Fang et al., 2019)(Zhao et al., 2020) employed machine learning methods to detect the stance of
newspaper headlines toward their bodies, which can serve as an important indication of content
authenticity; multiple methods are used to extract features relevant to stance detection from a
collection of headlines and news article bodies with different stances. The multilayer
perceptron (MLP) model yields the best score among all classification models when compared
with single models. (Bajaj, 2017) compares and reports the results from multiple different model
implementations to build a classifier that can predict whether a piece of news is fake based only on
its content, thereby approaching the problem from a purely NLP perspective. Those approaches
achieved a certain success but relied heavily on feature engineering, and the paper did not consider
domain-related features such as entity relationships.

2.8.2 Propagation based approach


(Zhao et al., 2020) demonstrate collective structural signals that help to understand the different
propagation evolution of news; tracking large databases of fake news and real news in online
social networks shows that fake news spreads distinctively from real news even at early stages of
propagation, e.g., five hours after the first re-postings. In this study, there is a propagation dynamic
between real and fake news: false claims reached far more people than the truth, while the truth
rarely propagated as widely. Another work (Kwon et al., 2013) examined the validity of
news-spreading patterns on Twitter and tried to classify rumors from non-rumors; three features
were explored: temporal, structural, and linguistic. In that work, the temporal and structural features
were extracted using a time-series fitting model and the network structure.

2.8.3 Linguistic approach


Most liars use their language strategically to avoid being caught, and the linguistic approach
considers all the words in a sentence and the letters in a word, how they are structured, how they fit
together in a paragraph, grammar, and syntax. Among the methods that contribute to the linguistic
approach, (Thota et al., 2018b) present a solution to the task of fake news detection using deep
learning architectures and the Bag of Words (BoW) technique, which processes each news article as
a document and calculates the frequency count of each word in that document; these counts are
further used to create a numerical representation of the data, also called fixed-length feature vectors.
However, this methodology has drawbacks in terms of information loss, and it is not as practical
because context is not considered when text is converted into numerical representations; the position
of a word is not always taken into consideration.
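
A brief sketch of the Bag of Words representation just described, assuming scikit-learn; the toy documents are illustrative only.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["oduu dhugaa guyyaa har'aa", "oduu sobaa guyyaa"]  # toy news texts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)           # per-document word counts
print(vectorizer.get_feature_names_out())    # the fixed vocabulary
print(X.toarray())                           # fixed-length count vectors
```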
(Stahl, 2018) considers fake news detection in textual formats while detailing how and why fake
news exists in the first place. The method of semantic analysis examines indicators of truthfulness
by explaining that truthfulness can be determined by comparing personal experience with a
profile and content on the topic derived from similar articles.

2.9 Related work on local language
Research on fake news detection in Ethiopian local languages is still at an early stage. (Gurmessa,
2020b) describe the application of natural language processing techniques with multinomial naïve
Bayes for the detection of fake news on 752 Afaan Oromoo news texts. They used Facebook
as the source of news articles and applied term frequency–inverse document frequency (TF-IDF)
over unigrams and bigrams. With the best F1 score, they achieved good results given their data
size. However, although the accuracy obtained is good, the information from the source dataset
sites is not 100% credible, and higher accuracy could have been attained by taking non-credible
news into account. As a building block for fake news detection, (Gurmessa, 2020a)
developed a content-based Afaan Oromo fake news detection system using a machine learning
approach with a passive-aggressive classification algorithm. They used a dataset collected
manually from Facebook pages and labeled as fake and real. Since datasets are a critical issue for
the Afaan Oromo language, the classification was tested on a small number of news items;
increasing the size of the news dataset would test the consistency of the performance, thereby
increasing the trust of users in the system.
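
For concreteness, below is a sketch of the TF-IDF plus multinomial naive Bayes pipeline described in the related work above, assuming scikit-learn; the toy texts and labels are illustrative, not the cited study's corpus.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["oduu dhugaa", "oduu sobaa", "oduu haaraa", "oduu kijibaa"]
labels = [0, 1, 0, 1]                       # 0 = real, 1 = fake

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),    # unigrams and bigrams
    MultinomialNB(),
)
clf.fit(texts, labels)
print(clf.predict(["oduu sobaa haaraa"]))
```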

2.10 Afaan Oromoo language


Afaan Oromo is an Afro-Asiatic language belonging to the Cushitic branch; the Oromo people are
native to the Ethiopian state of Oromia, and the language is spoken predominantly by the Oromo
people and neighboring ethnic groups in the Horn of Africa (Abera, 2015), with 33.8% of Ethiopians
speaking Afaan Oromo, followed by 29.3% speaking Amharic.
In general, Afaan Oromo is widely used as a written and spoken language in Ethiopia. With regard to
the writing system, "Qubee" (a Latin-based alphabet) has been adopted and has been the official
script of Afaan Oromo since 1991 (Tesfaye, 2010b).

2.10.1 Afaan Oromo Qubee and writing system


The alphabet of the Afaan Oromo language is called "Qubee Afaan Oromo" and is characterized by
capital and small letters, as in the case of the English alphabet (Fikadu, 2019). The Afaan Oromo
language has vowels and consonants just as English does. Afaan Oromo vowels
are represented by the five basic letters "a", "e", "i", "o", "u". Besides, it has the typical
Eastern Cushitic set of five short and five long vowels, formed by doubling the five vowel letters:
"aa", "ee", "ii", "oo", "uu" (Tesfaye, 2010a). Afaan Oromo consonants, on the other
hand, do not differ greatly from English letters; however, there are a few special combinations,
called "Qubee Dachaa", such as "sh" and "ch" (same sound as in English); "dh" in Afaan Oromo
is like an English "d" produced with the tongue curled back slightly and with the air drawn in, so
that a glottal stop is heard before the following vowel begins. Another combination is "ph", made
with a smack of the lips toward the outside, and "ny", which closely resembles the English sound of
"gn". These few special combination letters are commonly used to form words. For example, dh is
used in dhadhaa 'butter', ch in barbaachisaa 'important', sh in shamarree 'girl', ph
in buuphaa 'egg', and ny in nyaata 'food'.
The Afaan Oromo language has 36 letters (26 consonants and 10 vowels), called "Qubee". Words in a
sentence are separated by white space the same way as in English. Afaan
Oromo punctuation marks follow the same punctuation pattern used in English and other languages
that follow the Latin writing system. For example, the comma (,) is used to separate listed ideas,
concepts, names, items, etc.; the full stop (.) marks a statement, the question mark (?) an
interrogative, and the exclamation mark (!) a command or exclamatory sentence (Tesfaye, 2010a).
In general, all letters in the English language also exist in the Afaan Oromo
language, except in the way they are written.
In the Afaan Oromoo language, vowels can appear in initial, medial, and final positions in a word.
A long vowel is interpreted as a single unit and occurs everywhere a short
vowel can occur.
The following examples show some long vowels at word-initial, medial, and final positions.
Initial position: eelee 'pan', uumaa 'nature'
Medial position: leexaa 'single', keennaa 'gift'
Final position: garaa 'belly', daaraa 'ash'
The difference in length is contrastive; for example, consider lafa in Afaan Oromoo, which
means 'land', and laafaa, which means 'weak'. The difference between the
words lafa and laafaa is the length of the vowels they have. Two vowels in succession indicate that
the vowel is long (called "Dheeraa" in Afaan Oromoo), while a single vowel in a word is short
(called "Gababaa" in Afaan Oromoo). Table 2.1 shows the Afaan Oromoo vowels.
        Front    Central    Back
High    i, ii               u, uu
Mid     e, ee               o, oo
Low              a, aa
Table 2-1 Afaan Oromoo vowels
Afaan Oromo vowels are pronounced in a sharp and clear fashion, which means each is articulated distinctly in every word.
For example:
A: Fardda, haadha
E: Gannale, Waabee, Roobale, Colle
I: Arsii, Laali.
O: Oromo, Cilaalo, Haro, Caancco, Danbidoollo
U: Ulfaadhu, Arbba.

2.10.2 Afaan Oromoo punctuation marks


Punctuation marks are placed in text to make the meaning clear and reading easier. Analysis of Afaan Oromo texts reveals that its punctuation marks follow the same pattern used in English and other languages that follow the Latin writing system (Tesfaye, 2011). As in English, the following are some of the most commonly used punctuation marks in Afaan Oromo:
I. Tuqa, full stop (.): used at the end of a sentence and in abbreviations.
II. Mallattoo Gaffii, question mark (?): used in interrogatives, at the end of a direct question.
III. Rajeffannoo, exclamation mark (!): used at the end of commands and exclamatory sentences.
IV. Qooddu, comma (,): used to separate items in a list or the elements of a series within a sentence.
V. Tuq-lamee, colon (:): used to separate and introduce lists, clauses and quotations, along with several conventional uses.

2.10.3 Afaan Oromo morphology
Every language has its own morphological structure that defines the rules used for combining the different components the language may have. The English language, for instance, is basically different in its morphological structure from French, Arabic or Afaan Oromo (Jimalo et al., 2017). There are a number of word-formation processes in Afaan Oromo; affixation and compounding are among them.
Affixation is generally described as the addition of affixes at the beginning, in the middle and/or at the end of a root/stem, depending on whether the affix is a prefix, infix or suffix. Attaching one or more prefixes and/or suffixes to a stem may form a word. The word durbumma ‘girlhood’, for instance, is formed from the stem durb- ‘girl’ and the suffix -umma.
Compounding is the joining together of two linguistic forms which function independently. Examples of compound nouns include abbaa-buddenaa ‘stepfather’, from abba- ‘father’ and buddena ‘food’. Like a number of other African and Ethiopian languages, Afaan Oromo has a
very complex and rich morphology. It has the basic features of agglutinative languages involving
very extensive inflectional and derivational morphological processes. In agglutinative languages
like Afaan Oromo, most of the grammatical information is conveyed through affixes (i.e. prefixes
and suffixes) attached to the root or stem of words. Obviously, these high inflectional forms and
extensive derivational features of the language are presenting various challenges for text
processing and information retrieval experiments in Afaan Oromo(Tune et al., 2008)(Tesfaye,
2011). Although Afaan Oromo words have some prefixes and infixes, suffixes are the predominant morphological feature of the language. Almost all Oromo nouns in a given text carry person, number, gender and possession markers, which are concatenated and affixed to a stem or singular noun form. In addition, Afaan Oromo noun plural markers can take several alternative forms. For instance, in comparison to the English noun plural marker -s (-es), there are more than ten major and very common plural markers in Afaan Oromo, including -oota, -wwan, -lee, -an, -een, -eeyyii and -oo.
As an example, the Afaan Oromo singular noun “mana” (house) can take the following different
plural forms: Manoota (mana + oota), manneen (mana + een). The construction and usages of
such alternative affixes and attachments are governed by the morphological and syntactic rules of
the language.

2.11 News writing structure on social media
Social media news, often referred to simply as social news, reflects the modern tendency to get news about what is happening around us from social media rather than from more traditional news sources. It may involve current events, new initiatives, or other issues.
News writing structure or style is the way in which elements of the news are presented based on
relative importance, tone and intended audience. In addition, it is also concerned with the structure
of vocabulary and sentences(Tantiponganant & Laksitamas, 2014).
News writing attempts to answer all the basic questions about any particular event - who, what,
when, where and why (the Five W’s) and also often how - at the opening of the article. This form
of structure is sometimes called the "inverted pyramid", to refer to the decreasing importance of
information in subsequent paragraphs. The most important structural element of a story is the lead
which is contained in the story’s first sentence. The lead is usually the first sentence, or in some
cases the first two sentences, and is ideally 20-25 words in length (Tantiponganant & Laksitamas,
2014).

CHAPTER 3

3 MATERIALS AND METHODS


3.1 Data acquisition for fake news detection in Afaan Oromo
In computer science, fake news detection is an emerging research area, and significant literature is already available on the topic for English-based texts and datasets. Despite the recent surge of interest in the field, fake news detection for the Afaan Oromo language is in its early stages, and until recently there were no Afaan Oromo text datasets available for training classifiers.
The key challenge in Afaan Oromo fake news detection in particular is collecting a sufficiently large, rich, and reliably labelled dataset on which the algorithms can be trained and tested. Furthermore, the notion of ‘fake news’ itself is rather vague and nuanced (Monti et al., 2019). As discussed under section 1.8, the sources of data for this study were mainly Twitter and Facebook. To extract data from Twitter, we used Octoparse, an automated web-scraping tool that pulls the information shown on a targeted page, such as the VOA Afaan Oromo and BBC Afaan Oromo accounts.
The following steps show how we built an Octoparse Twitter crawler to scrape the VOA Afaan Oromo Twitter account:

Figure 3-1 Octoparse implementation.

Step 1: Input the URL and build pagination.

In this research work, we scraped the official Twitter accounts of VOA Afaan Oromo and BBC Afaan Oromo. As can be seen in Figure 3-1, the website is loaded in the built-in browser; Twitter pages have a next-page mechanism that allows Octoparse to move through each page and grab more information. Octoparse scrapes the selected information by scrolling down the page and extracting the tweets.

Figure 3-2 shows the sequence of steps of the implemented scraper; since Twitter applies an infinite-scrolling technique, the scraper must scroll to extract the data shown on the screen.

Figure 3-2 Partial sequence diagram for implemented scraper.

Step 2: Build a loop item to extract the data.

First, we build an extraction loop to extract the title and body of each tweet into separate columns instead of just one, which requires modifying the extraction settings to select the target data manually. We built a pagination loop earlier, but we modified it to set up the AJAX timeout, the number of scroll repeats and the waiting time in the workflow settings.

Finally, when the extraction is completed, we export the VOA and BBC Afaan Oromo tweets to a CSV file.

In this study, the extracted dataset has size 7000×3: 7000 rows and 3 columns, named “Headline”, “Body” and “Label”. The dataset contains approximately 3000 fake news articles and 4000 real news articles.

3.2 Data preprocessing


Text data requires special preprocessing before deep learning algorithms can be applied to fake news classification (Thota et al., 2018a). An Afaan Oromo text is a single sentence or a series of sentences consisting of several words; a large collection of such texts is called a corpus. Our dataset comprises approximately 7000 samples, divided into separate training and test sets, each record consisting of a news headline (“Mata-duree”), body (“Qaama”) and label. As discussed above, the corpus was prepared from scratch, as there is no previous work in the area of fake news detection for the Afaan Oromo language.
The news items selected for experimentation come from different social media sources written on different topics: the BBC Afaan Oromo Twitter page, the VOA Afaan Oromo Twitter page, the Oromia Broadcast Network (OBN) Facebook page and the official Fana Broadcasting Facebook page.

Table 3-1 Examples of statements and side information in the dataset.

We deal with a two-label setting (i.e., true and fake). The statements in the dataset are news articles on different community, social, economic, technological and political topics, so they are a potential source for collecting a balanced corpus for the task of fake news detection in the Afaan Oromo language.
We summarize below the core techniques for representing texts, the most fundamental part of deep learning.

3.2.1 Tokenization and padding


The headline (‘Mataduree’) and body (‘Qaama’) of each article are concatenated, followed by tokenization of the text. Tokenization basically refers to splitting a larger body of text into smaller pieces, so that each piece (called a “token”) becomes a meaningful unit. Padding is then applied to sentences that are longer or shorter than a certain length, which in the case of our inputs is the length of the longest input sentence. We used word-level tokens; Keras provides the ‘Tokenizer’ class among its text-preprocessing functions, which quickly tokenizes a sentence based on spaces after removing punctuation.

Word-level tokenizing functions for English texts from the Natural Language Toolkit (NLTK) may not be a suitable method for tokenizing Afaan Oromo texts, because Afaan Oromo words carry various prefixes and suffixes. Morphemes, the smallest units with meaning, would therefore be the preferable choice of token for Afaan Oromo.
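As a rough illustration of the word-level tokenization and padding described above, the following sketch uses the Keras Tokenizer and pad_sequences utilities; the sample sentences and the derived max_len are illustrative assumptions, not the thesis corpus.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Illustrative Afaan Oromo sentences (not from the actual dataset).
texts = ["Oduun kun dhugaa dha", "Oduun kun soba"]

tokenizer = Tokenizer()                # splits on spaces, strips punctuation
tokenizer.fit_on_texts(texts)          # build the word index from the corpus
sequences = tokenizer.texts_to_sequences(texts)   # words -> integer indices

max_len = max(len(s) for s in sequences)          # longest input sentence
padded = pad_sequences(sequences, maxlen=max_len, padding='post')
print(tokenizer.word_index)
print(padded)
```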

3.2.2 Stemming

Stemming is a technique for the reduction of words to their stems or root variants (Korenius et al., 2004). The documents contained several occurrences of words like “barattoota”, “barattootni”, “baratichi”, “baratichatu”, “barattu” and “barataa”. These different words share the same stem (i.e., “barat”), and a module designed for this particular purpose was used to convert the different surface forms to their stems.
Doing this reduces computing time and space, as different forms of a word are stemmed to a common form. In this thesis a module was developed to stem Afaan Oromo words by removing affixes from each word; some of the suffixes and prefixes handled are listed below.
Set of suffixes: “oota”, “icha”, “tu”, “ichan”, “ootni”, “aa”, “ichatu”, “een”, etc.
Set of prefixes: “al”, “hin”, “ni”, “wal”, etc.
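The sketch below shows the affix-stripping idea in its simplest form; the suffix and prefix sets are just the samples listed above (not the module's full lists), and the minimum-stem-length guard is an assumption of this illustration.

```python
SUFFIXES = ["ichatu", "ichan", "ootni", "oota", "icha", "een", "tu", "aa"]
PREFIXES = ["hin", "wal", "al", "ni"]

def stem(word: str) -> str:
    """Strip at most one known prefix and one known suffix, longest first."""
    for p in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word

print(stem("barattoota"))  # -> "baratt", approximating the stem "barat"
```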

3.2.3 Token Embedding


After tokenizing the sentences at word level, we give each token a unique integer index. With one-hot encoding, if there are X unique tokens in the entire dataset, each token is allocated a vector of length X containing X − 1 zeros. This representation becomes sparser and higher-dimensional as the dataset grows, exhausting spatial resources. Instead, each token is fed through an embedding layer, which yields a token embedding: a dense vector of floating-point values. In contrast to one-hot encoding, we set a desired length P for the embedding vector, and its P elements are trained by the neural network using several input tokens at each training step. This method represents tokens in a P-dimensional space in which similar words are located as close together as possible.
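As a small illustration of the mapping from integer indices to dense vectors, the following sketch uses the Keras Embedding layer; the vocabulary size X, embedding length P and token indices are illustrative assumptions.

```python
import numpy as np
from tensorflow.keras.layers import Embedding

X = 5000   # number of unique tokens in the dataset (assumed)
P = 100    # desired embedding vector length (assumed)

embedding = Embedding(input_dim=X, output_dim=P)
token_indices = np.array([[4, 17, 256]])   # one sequence of token indices
vectors = embedding(token_indices)         # dense output of shape (1, 3, P)
print(vectors.shape)
```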

3.2.4 Sequence Creation


Seq2seq models for abstractive summarization are composed of an encoder and a decoder (Shi et al., 2018; Yulita et al., 2017). The encoder reads a source article, denoted by x = (x_1, x_2, …, x_J), and transforms it into hidden states h^e = (h^e_1, h^e_2, …, h^e_J), while the decoder takes these hidden states as the context input and outputs a summary y = (y_1, y_2, …, y_T). Here, x_j and y_t are one-hot representations of the tokens in the source article and summary, respectively.

Figure 3-3 The basic seq2seq model.
Figure 3-3 shows a basic RNN seq2seq model with a bidirectional LSTM encoder and an LSTM decoder. The bidirectional LSTM is chosen since it usually gives better document representations than a forward LSTM. The encoder reads a sequence of input tokens x and turns them into a sequence of hidden states h = (h_1, h_2, …, h_J) with the following update equations:
i_t = \sigma(W_{ii} E x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi})  (input gate)
f_t = \sigma(W_{if} E x_t + b_{if} + W_{hf} h_{t-1} + b_{hf})  (forget gate)
o_t = \sigma(W_{io} E x_t + b_{io} + W_{ho} h_{t-1} + b_{ho})  (output gate)
g_t = \tanh(W_{ig} E x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg})
c_t = f_t \odot c_{t-1} + i_t \odot g_t
h_t = o_t \odot \tanh(c_t)
where \sigma is the logistic sigmoid function; i, f, o and g are the input-gate, forget-gate, output-gate and cell-input activation vectors; the matrices W and vectors b are learnable parameters; E x_t denotes the word embedding of token x_t; and c_t represents the cell state.

For the bidirectional LSTM, the input sequence is encoded as \overrightarrow{h}^e and \overleftarrow{h}^e, where the right and left arrows denote the forward and backward temporal dependencies, respectively. The superscript e is shorthand indicating that the state belongs to the encoder. During decoding, the decoder takes the encoded representations of the source article as input and generates the summary y. In a simple encoder-decoder model, the encoded vectors are used to initialize the hidden and cell states of the LSTM decoder.
In an attention-based encoder-decoder architecture, the decoder not only takes the encoded representations (i.e., the final hidden and cell states) of the source article as input, but also selectively focuses on parts of the article at each decoding step; this attention can be achieved by an alignment mechanism (Bahdanau et al., 2015).

3.3 Data visualization


In fake news detection we used data visualization to present unstructured data graphically, to find connections within mountains of information and to transform invisible information into visible graphs that help the reader discover key points quickly and clearly. To visualize how frequently words appear in a given text, we plot a word cloud, making the size of each word proportional to its frequency and arranging the words in a cluster or cloud. Figures 3-4 and 3-5 show the word clouds for fake and real texts, respectively. Long words are emphasized over short words, and words whose letters contain many ascenders and descenders may receive more attention.
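A plot of this kind can be produced with the common wordcloud and matplotlib Python libraries; the sketch below is a generic illustration (the joined token string is a placeholder, not the thesis corpus).

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Placeholder tokens standing in for the cleaned news text.
text = " ".join(["oduu", "oduu", "soba", "dhugaa", "jedhame"])

wc = WordCloud(width=800, height=400, background_color="white").generate(text)
plt.imshow(wc, interpolation="bilinear")   # word size reflects frequency
plt.axis("off")
plt.show()
```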

Figure 3-4 Word cloud for text that is fake.

Figure 3-5 Word cloud for text that is real.

3.4 System architecture

Figure 3-6 is a diagrammatic representation of the classification process. It illustrates the steps involved in this research, from text preprocessing to the final prediction for entire texts.

Figure 3-6 Architecture of Fake news detection based on Bi-directional LSTM-recurrent neural
network.
To create an artifact capable of detecting fake news, in the first step news text arrives in different formats from social media and undergoes a set of preprocessing operations to eliminate unwanted characteristics left by the data acquisition phase. The cleaning phase involves more than removing non-textual characters: it also fixes spelling and syntax errors, standardizes the dataset, and corrects mistakes such as empty fields and missing values, in addition to stop-word removal, tokenization and stemming.

Then, after preprocessing and padding, the sequences of words are transformed into numbers, because computers cannot read words; word embeddings work well here because they capture the semantics of the words. After the words are converted into word embeddings, they are fed into a neural network consisting of various layers. The first layer is a convolutional layer: a convolution is a filter that can extract features from the data, and the max-pooling layer iterates over the tensors and takes the highest value. After this representation, the data is split into training and test sets. Training is carried out on the news article corpus, and the test data is used to obtain the predicted label of a news article from the trained model. In the bidirectional LSTM network, each embedding corresponding to the training data is inspected in both directions at the same time; once the classifier was trained to separate targets from clutter, a threshold was applied to the output score to determine the output.
After training, the model is serialized in a particular format and de-serialized in the production environment. Python, the common language for deep learning modeling, has several serialization options: scikit-learn recommends the Joblib package, and Pickle is used for serializing and de-serializing Python object structures. Serialization refers to the process of converting an object in memory to a byte stream that can be stored on disk. The trained model is saved to disk by pickling, so that we need not train the model every time we need it; we simply de-serialize it to predict on new input (Srivastava, 2020). Figure 3-8 shows a way to write a Python object to disk so that it can be transferred anywhere and later de-serialized (read back) by a Python script.
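A minimal sketch of the pickling round trip described above; the LogisticRegression instance stands in for the trained fake-news classifier, and 'model.pkl' is an illustrative file name.

```python
import pickle
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()   # stand-in for the trained classifier

# Serialize: convert the in-memory model to a byte stream on disk.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# De-serialize: read the byte stream back into a Python object.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
```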

Figure 3-8 Serialization and de-serialization.
The first step is to load the saved pipeline and the requested query, and then compute the prediction for the submitted news text using our model. The de-serialized (unpickled) model receives the inputs, uses the trained weights to make the prediction, and returns the prediction result as labels that can be accessed through the API endpoint.
Finally, in system prototype deployment, the trained deep learning models are made available to end users or systems; however, deploying deep learning models involves some complexity. Figure 3-9 shows the deployment of our trained deep learning model into production using a Flask API.
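A minimal sketch of such a Flask endpoint, under stated assumptions: the /predict route, the 'news' request field and the preprocess() helper are illustrative, not the exact implementation behind Figure 3-9.

```python
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

with open("model.pkl", "rb") as f:    # de-serialize the trained model
    model = pickle.load(f)

def preprocess(text):
    # Placeholder for the tokenization/padding pipeline described earlier.
    return [text]

@app.route("/predict", methods=["POST"])
def predict():
    news = request.json["news"]                # submitted news text
    pred = model.predict(preprocess(news))[0]  # trained model's prediction
    label = "REAL" if pred == 1 else "FAKE"    # 1 = real, 0 = fake
    return jsonify({"label": label})

if __name__ == "__main__":
    app.run()
```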

Figure 3-9 Architecture of Fake news Detector using Flask.

CHAPTER 4

4 EXPERIMENT AND RESULTS

4.1 Tools used

We describe in this chapter the experimental evaluation of our approach and discuss the obtained results. All code for the evaluation experiments is written in Python 3.7, using TensorFlow 2.3.1, Spyder 1.4.1 and NumPy. All experiments were performed on an Intel® Core™ i5-3340M CPU @ 2.70 GHz with 4.00 GB RAM.

4.2 Data set creation


At the time this work was started, there was no available Afaan Oromo news dataset or corpus. Hence, the two datasets utilized in this study (fake and real news) were obtained by scraping Twitter and Facebook. The dataset specification is shown in Table 4-1.

Table 4-1 Dataset specification.

4.2.1 Dataset 1 (Real news dataset)


One of the tasks in this work was to scrape news automatically generated by trusted official government social media accounts. For the real news we used the Octoparse web-scraping tool to collect VOA and BBC Afaan Oromo news text from Twitter.
Each news article in this dataset consists of its headline, body and text label (i.e., 1 for REAL). The vocabulary size for this dataset is approximately 2 MB. The Dataset 1 specification is shown in Table 4-2.

Table 4-2 Specification of News Dataset 1

4.2.2 Dataset 2 (Fake news dataset)


The datasets utilized for fake news were obtained from unknown sources and manually edited. Each news article in this dataset consists of a headline (“MataDuree”), body (“Qaama”) and text label (i.e., 0 for FAKE), in the same way as the real news discussed under section 4.2.1. The vocabulary size for this dataset is 1.3 MB. The Dataset 2 specification is shown in Table 4-3.

Table 4-3 Specification of News Dataset 2.


Each real and fake news article in the combined dataset consists of the article headline, body and a binary label (1/0 allotted to the real/fake news articles, respectively), concatenated together for preprocessing. The vocabulary size of the concatenated dataset is 3.5 MB.

Table 4-4 Concatenated dataset specification.

4.3 Data pre-processing using NLTK and tokenizer

Data pre-processing is an important step here, as in most NLP applications (Z. Zhang & Luo, 2019). The title and body of each article are concatenated, followed by removal of stop words, tokenization and lemmatization of the text. While tokenizing, a maximum of 150,559 words is considered. All textual sequences are then converted to numerical sequences and padded or trimmed to a maximum sequence length that we set.
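A minimal sketch of the stop-word removal step; the stop-word set below is a small sample from the Afaan Oromo list compiled for this work (see the Appendix), not the full list, and simple whitespace splitting stands in for the tokenizer.

```python
# Sample entries from the Afaan Oromo stop-word list in the Appendix.
STOP_WORDS = {"kun", "sun", "fi", "akka", "garuu", "booda", "irra", "kan"}

def remove_stop_words(text: str) -> list:
    tokens = text.lower().split()   # whitespace tokenization as a stand-in
    return [t for t in tokens if t not in STOP_WORDS]

print(remove_stop_words("Oduun kun akka sobaatti gabaafame"))
# -> ['oduun', 'sobaatti', 'gabaafame']
```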

4.4 Word Embedding


Several recent studies (Basaldella et al., 2018; Biswas et al., 2019) have shown that, for representing the semantics of words, word embeddings are better than a one-hot encoding representation. We therefore used word embeddings to map semantic meaning and relationships into a geometric space using dense vectors. These vectors place each word in a continuous, high-dimensional vector space. This is an improvement over the earlier bag-of-words model, wherein large sparse vectors of vocabulary size were used as word vectors; such vectors give no information about how two words are interrelated, or any other useful information. With embeddings, the words surrounding a word in the text determine its position within the vector space.
The GloVe embedding is used here together with the Keras embedding layer, which plays a crucial role in training neural networks on text data. This flexible layer is used to load pre-trained 100-dimensional GloVe embeddings for transfer learning, and it is initialized with the weights from this GloVe embedding. Since the learned word weights in this model are not to be updated, the trainable attribute of the layer is set to false.
The embedding layer passes this deeper understanding of the input data to the next layers as it learns the dense vector representation. In later tests it is likewise initialized with the pre-trained embedding.
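A minimal sketch of initializing a Keras Embedding layer from pre-trained GloVe vectors, as described above; the 'glove.txt' file name and the tiny word_index are illustrative assumptions.

```python
import numpy as np
from tensorflow.keras.layers import Embedding

EMB_DIM = 100
word_index = {"oduu": 1, "soba": 2}   # normally taken from the fitted tokenizer

# Read GloVe vectors: each line is "word v1 v2 ... v100".
embeddings = {}
with open("glove.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        embeddings[parts[0]] = np.asarray(parts[1:], dtype="float32")

# Build the weight matrix; words missing from GloVe stay zero vectors.
matrix = np.zeros((len(word_index) + 1, EMB_DIM))
for word, i in word_index.items():
    if word in embeddings:
        matrix[i] = embeddings[word]

embedding_layer = Embedding(len(word_index) + 1, EMB_DIM,
                            weights=[matrix], trainable=False)
```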

4.5 Sequential Model


Due to their capability of capturing sequential information in an efficient manner, Long Short-Term Memory (LSTM) networks are among the most widely used models in text classification and generation problems (Hochreiter & Schmidhuber, 1997).
The built model begins with the embedding layer, followed by a bidirectional LSTM layer, which has shown impressive performance by capturing sequential information from both the forward and backward directions in texts. The input text data is reduced by preprocessing such as stop-word removal and transformed to a bag-of-words representation by CountVectorizer from the sklearn library. The network has only one fully connected hidden layer, consisting of 256 neurons with an activation function, followed by an output layer with one neuron and a sigmoid activation function giving the true/false classification. The same output layer is used in every other model tested in this chapter.

Figure 4-1 Keras summary of RNN model.
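A minimal sketch of this architecture and its training configuration in Keras; the dense and output layer sizes follow the text, while the vocabulary size, sequence length and LSTM unit count are illustrative assumptions (the reported settings also vary between 50-200 epochs and 128-256 batch size across the experiments).

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

vocab_size, emb_dim, max_len = 5000, 100, 150   # illustrative values

model = Sequential([
    Embedding(vocab_size, emb_dim, input_length=max_len),
    Bidirectional(LSTM(64)),         # reads the sequence in both directions
    Dense(256, activation='relu'),   # the single fully connected hidden layer
    Dense(1, activation='sigmoid'),  # true/false output
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()

# Training (X_train / y_train assumed to be the padded sequences and labels):
# model.fit(X_train, y_train, epochs=200, batch_size=256, validation_split=0.1)
```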

The optimization function used was Adam, and the loss was computed using binary cross-entropy. The model was fit over 200 epochs with a batch size of 256 samples. Within the first few thousand samples run through the network the loss was already approaching 1, and by the end of the first epoch it had reached 1. Here, the input sequence is a series of integers, each representing an index in the word-index dictionary; each integer corresponds to the word that occupied that position before encoding. During training we follow this procedure: the encoder accepts an input source sequence and computes the state values; the encoder then passes the final state vector as the initial state of the decoder; and from the initial symbol, the decoder sequentially predicts the most probable next target token. Figure 4-2 gives an overall description of the architecture when training a seq2seq network.

Figure 4-2 General architecture of Bi-directional LSTM-RNN.


First, the encoder accepts the i-th token X_i's embedding vector and computes the state values h_i up to the end of the source sequence. If we define the number of units in the recurrent cell as q, the states have length q, and the encoder passes the last hidden state h_{T1} as the initial state of the decoder RNN. The decoder accepts the embedding vector of the first input token (start-of-sequence) from the target sequence. Given the initial state and the input, the decoder computes a state vector s_1 of length q, which enters a fully connected layer with N_2 units. Finally, we compute the probabilities with a sigmoid activation, i.e., which token is most likely to appear next among all possible target tokens. At each training step, the decoder accepts each j-th true token Y_j's embedding vector in the target sequence and maximizes the probability of the next true target token Y_{j+1} for the following time steps j = 1, …, T2.
Note that the final output token is end-of-sequence. In summary, the model is trained to maximize the probability of each target sequence [Y_1, Y_2, …, Y_{T2}, end-of-sequence], using [X_1, X_2, …, X_{T1}] as the encoder input and [start-of-sequence, Y_1, Y_2, …, Y_{T2}] as the decoder input.
To alleviate overfitting and to increase the generalization capacity of the neural network, the model should be trained for an optimal number of epochs. Figure 4-3 shows the rounds of optimization applied during training.

Figure 4-3 Number of epochs applied for training.


In model training, the number of epochs corresponds to the number of rounds of optimization: with more rounds of optimization, the error on the training data reduces further and further, but there may come a point where the model becomes over-fit to the training data and starts to lose generalization performance on unseen (non-training) data. Neither more training epochs nor hyper-parameter tuning brought any significant improvement. To start making predictions, we used the test dataset with the model that we created; since the classifier had already been trained on the training set, this code uses the learning from the training process to make predictions on the test set.
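One common way to stop training near the optimal epoch is Keras's EarlyStopping callback; the sketch below is offered as a generic illustration of this idea rather than the exact procedure used in these experiments.

```python
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss',          # watch validation loss
                           patience=3,                  # tolerate 3 flat epochs
                           restore_best_weights=True)   # roll back to the best

# Passed to fit() alongside the training data (model assumed from above):
# model.fit(X_train, y_train, epochs=200, batch_size=256,
#           validation_split=0.1, callbacks=[early_stop])
```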
We created predictions using the predict method and set a threshold to automatically decide whether a circulating news item is fake or not. However, when the model predicts an unobserved target sequence, the decoder works differently from training: starting from the start-of-sequence (SOS) symbol with the initial state from the encoder, it calculates the probabilities of all tokens and finds the token with the highest probability, which it then uses as the input of the next time step. It stops predicting when the predicted token is end-of-sequence (EOS). To evaluate how well the model performed on the predictions, we used a confusion matrix to check the number of correct and incorrect predictions.
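Thresholding the sigmoid output is a one-liner; in the sketch below the example scores stand in for the output of model.predict on the test set, and the 0.5 cut-off follows the text.

```python
import numpy as np

probs = np.array([0.91, 0.12, 0.55])   # example sigmoid outputs of the model
y_pred = (probs > 0.5).astype(int)     # 1 = real, 0 = fake (dataset labels)
print(y_pred)                          # -> [1 0 1]
```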

4.6 Experimental evaluation and results


In this research work, to evaluate how well the described deep learning models performed on the Afaan Oromo fake news dataset, we use the confusion matrix, which represents the four possible outcomes of the classification. In our case:
True Positives (TP): the piece of news is fake and has been classified as fake.
False Positives (FP): the piece of news is true but has been classified as fake.
False Negatives (FN): the piece of news is fake but has been classified as true.
True Negatives (TN): the piece of news is true and has been classified as true.
From those measurements, we derive the following metrics, which will be used for evaluating our models:
 Accuracy: the ratio of correct predictions over the total size of the dataset.
 Precision: the proportion of positive identifications that were actually correct, expressed as:
Precision = TP / (TP + FP)
 Recall: the percentage of true positives that the classifier catches, defined as:
Recall = TP / (TP + FN)
 F-score: a measure which takes into account both precision and recall, computed as:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
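These metrics can be derived directly with scikit-learn; the y_true / y_pred vectors below are tiny illustrative examples, not the thesis test set.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = real, 0 = fake (dataset convention)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
```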

In this experiment the Afaan Oromo news text dataset was pre-processed and embedded, creating a dense vector space for each word instead of a traditional bag of words. We divided the dataset into two sets: 20% for testing and 80% for training. While running both cases for the number of epochs applied, with a batch size of 128, the changes in accuracy and loss are quite visible, as depicted in Figure 4-3. An initial set of experiments was run using Bi-LSTM; the performance of the classifiers was also measured using F-measure values, and an accuracy of 90% was obtained with Bi-LSTM. When we feed the preprocessed data to the trained model we obtain output probabilities; the effectiveness and performance of the model are good compared to related work (Chapter 2), and here the confusion matrix comes into the limelight. Precision and recall for each label are also high (Table 4-5). More specifically, the following is observed:

Almost all examples labelled as “real” are classified correctly, with a class recall of 97% and a class precision of 88%, while examples labelled as “fake” achieve a class recall of 75% and a class precision of 93%. Table 4-5 shows the confusion matrix performance measurement for fake news detection with the deep learning model on Afaan Oromo news text, with the four combinations of predicted and actual values.
                Predicted Fake    Predicted Real    Class precision
Actual Fake     352               72                93%
Actual Real     113               1364              88%
Class recall    75%               97%
Table 4-5 Bi-LSTM confusion matrix.


Out of the total samples in the test set, shown in Table 4-5, the model correctly predicted 1716 samples and incorrectly predicted 185. These metrics revolve around the counts in each quadrant of the confusion matrix; the classification report is shown in Figure 4-4.

Figure 4-4 Bi-LSTM Classification report.
In the first experiment we used a vanilla RNN, in the second experiment we used an LSTM, and finally, in the last experiment, we used the Bi-LSTM technique. The following table shows the accuracy of each model.
Model                  Accuracy
LSTM                   0.87
Bidirectional-LSTM     0.90
Vanilla-RNN            0.89
Table 4-6 Experimental results.
As we can see from the results (Table 4-6), all models achieved very good performance in training. The downside of the RNN model is that it suffers from the vanishing gradient problem due to the deep network hierarchy. Bi-LSTM alleviates the vanishing gradient problem, and this model has higher prediction accuracy than the others. The Bi-LSTM model managed to detect most fake news articles in the test set while keeping the FP ratio at a reasonable level. Table 4-5 depicts the performance of the model in more detail using a confusion matrix: it shows that 352 fake news articles were correctly classified by the model, while 113 true news items were predicted as fake ones.

Parameter name        Value
Activation function   Sigmoid
Dropout rate          0.1
Epochs                50
Optimizer             Adam
Embedding size        100
Table 4-7 LSTM model details.

4.7 Discussion
We examined the results of the experiments and how the recurrent neural network handles sequential data in detail, from processing raw input sentences to generating target sentences in an RNN-based model. Recurrent neural network (RNN) based architectures were thus proposed for Afaan Oromo fake news detection. RNNs process the word embeddings in the text sequentially, one word token at a time, at each step using the information from the current word to update a hidden state that aggregates information about the previous sequence of words; the final hidden state is generally taken as the feature representation extracted by the RNN for the given input sequence. However, the amount of training required is comparatively much greater than for other kinds of networks, and RNNs suffer from the problem of vanishing gradients, which hampers the learning of long data sequences: the gradients carry the information used to update the RNN parameters, and when the gradient becomes small the parameter updates become insignificant, meaning no real learning is done. To overcome this problem and alleviate some of the training difficulties of RNNs, we used LSTM, owing to its ability to effectively capture long-range dependencies in the sequence of text, and applied it to fake news detection. The size of the data can affect the accuracy of the model: as the size of the data increases, accuracy also increases, as does the amount of training. The shortcoming of the current model is that it requires more training data and training time than the existing baselines. Further along this line, as future work, the inclusion of features like the source or the author of the article and user responses, along with the proposed model and an increased volume of data, can lead the way towards a state-of-the-art solution to this potentially hazardous “digital wildfire” of our day. Finally, based on the classification results, Bi-LSTM was chosen as the best model for determining the truth of Afaan Oromo news on social media; the system prototype was developed, and a sample user-interface screenshot is shown in Figure 4-5.

Figure 4-5 System Interface for Afaan Oromo Fake News Detection.

CHAPTER 5

5 CONCLUSION AND RECOMMENDATION


5.1 Conclusion
Deep learning has gained much attention recently and has improved the state of the art for many problems that AI and machine learning have faced for years. The aim of this study is to propose fake news detection on social media using deep learning techniques for Afaan Oromo news text. However, a model to predict and classify Afaan Oromo news text does not come out of thin air; it must be preprocessed and trained on the collected sample dataset. The neural networks for these tasks do not operate directly on news texts, and performing text analytics on them is challenging, so we used deterministic methods, a combination of techniques, to map words to vectors. We used one-hot encoding for mapping categorical integers and applied contextual word embedding, training it with a bidirectional LSTM; cosine similarity measures are passed as input features to the neural network. Once the classifier was trained, a threshold of 0.5 was applied to the output score to determine whether a text is considered true or fake, and for statistical analysis a confusion matrix was used to compare across varied thresholds. Deep learning models require a huge amount of data; however, datasets are a big issue for the Afaan Oromo language, and the model is trained on relatively small data compared with the datasets prepared for English. Adding more data to the news dataset would add to the consistency of the performance, thereby increasing users' trust in the system.
Finally, this study provides the results of Bi-LSTM and a system prototype which can be used as a basis for future work using this Afaan Oromo news text dataset, together with other Ethiopian local languages.

5.2 Recommendation and future work
In this research, an attempt was made to design a methodology for detecting fake Afaan Oromo news texts. Future work should take into account other traits of Afaan Oromo fake news detection. We think the most promising approach would be an automatic fact-checking model for Afaan Oromo and other local-language texts, that is, supplying the model with some kind of knowledge base; the purpose of the model would then be to extract information from the text and verify it against the database. The problem with this approach is that the knowledge base would need to be constantly and manually updated to stay current, in addition to supporting multi-class prediction. The following are some of the recommendations and future work for further research and enhancement:

 The stop-word list used in this research was compiled during data preparation and is mostly news-specific. The availability of a standard stop-word list would greatly facilitate research on fake news prediction and other classification techniques; therefore, a standard Afaan Oromo stop-word list should be developed.
 This research considers only binary classification; it does not consider multi-class classification, and research in that direction would improve classification quality. Furthermore, this study focuses on text-based fake news; however, nowadays social media posts with video receive more views, and fake information appears in a variety of forms, including videos, audio, images, and text. Future work should address this gap.
 This study has been based on a learning-based approach. Future work could add a knowledge-based component to give more confidence in the system.
 As the precision, recall and F1 scores of all the models were almost the same, they were not considered as the main evaluation metric. Future work could make more use of these metrics for a more thorough comparison.

References
Ahmed, H., Traore, I., & Saad, S. (2017). Detection of Online Fake News Using N-Gram
Analysis and Machine Learning Techniques. Lecture Notes in Computer Science (Including
Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),
10618 LNCS(October), 127–138. https://doi.org/10.1007/978-3-319-69155-8_9

Bahad, P., Saxena, P., & Kamal, R. (2020). Fake News Detection using Bi-directional LSTM-Recurrent Neural Network. Procedia Computer Science, 165(2019), 74–82. https://doi.org/10.1016/j.procs.2020.01.072

Bahdanau, D., Cho, K. H., & Bengio, Y. (2015). Neural machine translation by jointly learning
to align and translate. 3rd International Conference on Learning Representations, ICLR
2015 - Conference Track Proceedings, September.

Bajaj, S. (2017). “ The Pope Has a New Baby !” Fake News Detection Using Deep Learning. 1–
8.

Basaldella, M., Antolli, E., Serra, G., & Tasso, C. (2018). Bidirectional LSTM recurrent neural
network for keyphrase extraction. Communications in Computer and Information Science,
806(December), 180–187. https://doi.org/10.1007/978-3-319-73165-0_18

Biswas, E., Vijay-Shanker, K., & Pollock, L. (2019). Exploring word embedding techniques to
improve sentiment analysis of software engineering texts. IEEE International Working
Conference on Mining Software Repositories, 2019-May(May), 68–78.
https://doi.org/10.1109/MSR.2019.00020

Brien, N., Latessa, S., Evangelopoulos, G., & Boix, X. (2018). The Language of Fake News:
Opening the Black-Box of Deep Learning Based Detectors. 32nd Conference on Neural
Information Processing Systems, Nips, 1–5. https://dspace.mit.edu/handle/1721.1/120056

Cardoso Durier da Silva, F., Vieira, R., & Garcia, A. C. (2019). Can Machines Learn to Detect
Fake News? A Survey Focused on Social Media. Proceedings of the 52nd Hawaii
International Conference on System Sciences, 6, 2763–2770.
https://doi.org/10.24251/hicss.2019.332

Castillo, C., Mendoza, M., & Poblete, B. (2011). Information credibility on Twitter. Proceedings
of the 20th International Conference Companion on World Wide Web, WWW 2011, 675–
684. https://doi.org/10.1145/1963405.1963500

Chaudhry, A. K., Baker, D. & Thun-Hohenstein, P. (2017). Stance Detection for the Fake News
Challenge: Identifying Textual Relationships with Deep Neural Nets. Stanford, 1–10.

Conroy, N. J., Rubin, V. L., & Chen, Y. (2015). Automatic deception detection: Methods for
finding fake news. Proceedings of the Association for Information Science and Technology,
52(1), 1–4. https://doi.org/10.1002/pra2.2015.145052010082

Abeselom, K. (2018). The Impacts of Fake News on Peace and Development in the World: The Case Study of Ethiopia.

Fang, Y., Gao, J., Huang, C., Peng, H., & Wu, R. (2019). Self Multi-Head Attention-based Convolutional Neural Networks for fake news detection. PLoS ONE, 1–13. https://doi.org/10.1371/journal.pone.0222713

Fikadu Dinsa, E., & Babu P, R. (2019). Application of Data Mining Classification Algorithms
for Afaan Oromo Media Text News Categorization. International Journal of Computer
Trends & Technology, 67(7), 73–79. https://doi.org/10.14445/22312803/ijctt-v67i7p112

Gereme, F. B., & Zhu, W. (2019). Early detection of fake news “before it flies high. ACM
International Conference Proceeding Series, 142–148.
https://doi.org/10.1145/3358528.3358567

Girgis, S., Amer, E., & Gadallah, M. (2019). Deep Learning Algorithms for Detecting Fake
News in Online Text. Proceedings - 2018 13th International Conference on Computer
Engineering and Systems, ICCES 2018, November, 93–97.
https://doi.org/10.1109/ICCES.2018.8639198

Gurmessa, D. K. (2020a). Afaan Oromo Fake News Detection Using Natural Language
Processing and Passive-Aggressive. December.

Gurmessa, D. K. (2020b). Afaan Oromo Text Content-Based Fake News Detection using
Multinomial Naive Bayes. 01, 26–37.

Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8),
1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735

Jimalo, K. M., Babu P, R., & Assabie, Y. (2017). Afaan Oromo News Text Categorization using
Decision Tree Classifier and Support Vector Machine: A Machine Learning Approach.
International Journal of Computer Trends and Technology, 47(1), 29–41.
https://doi.org/10.14445/22312803/ijctt-v47p104

Jimalo, K. M., P, R. B., & Assabie, Y. (2017). Afaan Oromo News Text Categorization using
Decision Tree Classifier and Support Vector Machine : A Machine Learning Approach.
47(1), 29–41.

Kong, S. H., Tan, L. M., Gan, K. H., & Samsudin, N. H. (2020). Fake News Detection using
Deep Learning. ISCAIE 2020 - IEEE 10th Symposium on Computer Applications and
Industrial Electronics, 102–107. https://doi.org/10.1109/ISCAIE47305.2020.9108841

Korenius, T., Laurikkala, J., Järvelin, K., & Juhola, M. (2004). Stemming and lemmatization in
the clustering of finnish text documents. International Conference on Information and
Knowledge Management, Proceedings, January, 625–633.
https://doi.org/10.1145/1031171.1031285

Kresnakova, V. M., Sarnovsky, M., & Butka, P. (2019). Deep learning methods for Fake News
detection. IEEE Joint 19th International Symposium on Computational Intelligence and
Informatics and 7th International Conference on Recent Achievements in Mechatronics,
Automation, Computer Sciences and Robotics, CINTI-MACRo 2019 - Proceedings,
November, 143–148. https://doi.org/10.1109/CINTI-MACRo49179.2019.9105317

Kunapareddy, R., Madala, S., & Sodagudi, S. (2019). False content detection with deep learning
techniques. International Journal of Engineering and Advanced Technology, 8(5), 1579–
1584.

Kwon, S., Cha, M., Jung, K., Chen, W., & Wang, Y. (2013). Prominent features of rumor
propagation in online social media. Proceedings - IEEE International Conference on Data
Mining, ICDM, 1103–1108. https://doi.org/10.1109/ICDM.2013.61

Lee, H., & Song, J. (2020). Understanding recurrent neural network for texts using English-Korean corpora. Communications for Statistical Applications and Methods, 27(3), 313–326. https://doi.org/10.29220/CSAM.2020.27.3.313

Liao, W., & Lin, C. (2018). Stance Detection in Fake News: An Approach based on Deep Ensemble Learning. https://www.researchgate.net/profile/Wenjun_Liao3/publication/327634447_Stance_Detection_in_Fake_News_An_Approach_based_on_Deep_Ensemble_Learning/links/5b9add4545851574f7c62c19/Stance-Detection-in-Fake-News-An-Approach-based-on-Deep-Ensemble-Learning.pdf

Ma, J., Gao, W., Mitra, P., Kwon, S., Jansen, B. J., Wong, K. F., & Cha, M. (2016). Detecting rumors from microblogs with recurrent neural networks. IJCAI International Joint Conference on Artificial Intelligence, 3818–3824.

Monti, F., Frasca, F., Eynard, D., Mannion, D., & Bronstein, M. M. (2019). Fake News
Detection on Social Media using Geometric Deep Learning. 1–15.
http://arxiv.org/abs/1902.06673

Development of a Stemmer for Afaraf Text (2015). School of Graduate Studies, School of Information Science.

Shi, T., Keneshloo, Y., Ramakrishnan, N., & Reddy, C. K. (2018). Neural Abstractive Text
Summarization with Sequence-to-Sequence Models: A Survey. ArXiv, February 2019.

Srivastava, A. (2020). Real Time Fake News Detection Using Machine Learning and NLP. June,
3679–3683.

Stahl, K. (2018). Fake news detection in online social media Problem Statement. May, 6.
https://leadingindia.ai/downloads/projects/SMA/sma_9.pdf

Stahl, K. (2019). Fake News Detector in Online Social Media. International Journal of
Engineering and Advanced Technology, 9(1S4), 58–60.
https://doi.org/10.35940/ijeat.a1089.1291s419

Talwar, S., Dhir, A., Kaur, P., & Zafar, N. (2019). Why do people share fake news? Associations between the dark side of social media use and fake news sharing behavior. Journal of Retailing and Consumer Services. https://doi.org/10.1016/j.jretconser.2019.05.026

Tantiponganant, P., & Laksitamas, P. (2014). An analysis of the technology acceptance model in
understanding students’ behavioral intention to use university’s social media. In
Proceedings - 2014 IIAI 3rd International Conference on Advanced Applied Informatics,
IIAI-AAI 2014 (Issue September). https://doi.org/10.1109/IIAI-AAI.2014.14

Tesfaye, D. (2010a). Designing a Stemmer for Afaan Oromo Text: A Hybrid Approach. Addis Ababa University, Faculty of Informatics, Department of Information Science, School of Graduate Studies.

Tesfaye, D. (2010b). Afaan Oromo Search Engine. A Thesis Submitted To the School of
Graduate Studies of the Addis Ababa University in Partial Fulfillment for the Degree of
Masters of Science in Computer Science, November.

Tesfaye, D. (2011). A rule-based Afan Oromo Grammar Checker. 2(8), 126–130.

Thota, A., Tilak, P., Ahluwalia, S., & Lohia, N. (2018a). Fake News Detection: A Deep Learning Approach. SMU Data Science Review, 1(3), 10. https://scholar.smu.edu/datasciencereview/vol1/iss3/10

Thota, A., Tilak, P., Ahluwalia, S., & Lohia, N. (2018b). Fake News Detection: A Deep Learning Approach. In SMU Data Science Review (Vol. 1, Issue 3). https://scholar.smu.edu/datasciencereview/vol1/iss3/10

Trung Tin, P. (2018). A Study on Deep Learning for Fake News Detection.

Tune, K. K., Varma, V., & Pingali, P. (2008). Evaluation of Oromo-English Cross-Language
Information Retrieval Evaluation of Oromo-English Cross-Language Information Retrieval.
June.

Vo, N., & Lee, K. (2019). Learning from fact-checkers: Analysis and generation of fact-checking language. SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 335–344. https://doi.org/10.1145/3331184.3331248

Yulita, I. N., Fanany, M. I., & Arymuthy, A. M. (2017). Bi-directional Long Short-Term
Memory using Quantized data of Deep Belief Networks for Sleep Stage Classification.
Procedia Computer Science, 116, 530–538. https://doi.org/10.1016/j.procs.2017.10.042

Zaman, B., Justitia, A., Sani, K. N., & Purwanti, E. (2020). An Indonesian Hoax News Detection
System Using Reader Feedback and Naïve Bayes Algorithm. Cybernetics and Information
Technologies, 20(1), 82–94. https://doi.org/10.2478/cait-2020-0006

Zhang, J., Dong, B., & Yu, P. S. (2020). FakeDetector: Effective fake news detection with deep
diffusive neural network. Proceedings - International Conference on Data Engineering,
2020-April, 1826–1829. https://doi.org/10.1109/ICDE48307.2020.00180

Zhang, Z., & Luo, L. (2019). Hate speech detection: A solved problem? The challenging case of
long tail on Twitter. Semantic Web, 10(5), 925–945. https://doi.org/10.3233/SW-180338

Zhao, Z., Zhao, J., Sano, Y., Levy, O., Takayasu, H., Takayasu, M., Li, D., Wu, J., & Havlin, S.
(2020). Fake news propagates differently from real news even at early stages of spreading.
EPJ Data Science, 9(1). https://doi.org/10.1140/epjds/s13688-020-00224-z

Appendix
Appendix-1 Sample of Afan Oromo real news
ID | Title | Body | Label
01 | Barak Husen Obaman kitabaa barreessan | Pireeziantiin Ameerikaa duranii Baaraak Obaamaan turtii isaanii Waayit Haawus waggaa sadeetiif kaan irrattis kitaaba barreessaa turan xumuruu beeksisan. | 1
02 | Hidhamtoota siyaasa itiyoophiyaa | Obbo Iskindir Naggaa fi Sintaayyoo Chakool mana amala sirreessaa asiin dura itti hidhamanii turan Qaallittitti jijjiiramuusaanii abukaatoonsaanii Obbo Heenook Akliiluu BBC'tti himaniiru. | 1
03 | Oduu koronoo vayirasii | Itoophiyaatti sa'aatii 24 darbee keessatti namoonni haaraa 1,368 vaayirasichaan yoo qabamuun, 25 ammoo lubbuu dhabu Ministeerri Fayyaa ibseera. | 1
04 | Humni tikaa Motummaa itiyophiya missense waldaa maccaf tuulama duraani irratti reebicha rawwataniiru | Pireezidaantiin Waldaa Maccaa Tuulamaa Obbo Dirribii Damusee humnootii mootummaan reebichi irratti raawwatamuusaa BBC'tti bilbilaan himan. | 1

Appendix-1 Sample of Afaan Oromo fake news

ID | Title | Body | Label
01 | Waaye ummatta oromoo | Oromoon uummata afaan hortee seem dubbatu dha kanaafu oromia keessatti qonnaan bultootaf Xaahoon tola raabsama jedhamane | 0
02 | Hirribni guyyaa sammun nama jeeqa | Akka qorannoonnon dhiyeenya kana baye mirkanessutti namoonni irribaa gahaa guyyaatti yookiin boqotaan rakkoo saammuuf isan saxila jedhame | 0
03 | Dhukaasa qaamota hin beekmne gidduutti rawatame | Gaafa guyyaa Amajjii 22 bara 2012 galgala naanno sahaati 11 tti dhukkasni guddan magaala amboo fi nannawashitti akka deema jiru namootni ijaan arginee jedhan tokko tokko madda oduu keenyaf isa godhaniiru | 0
04 | Nannoo oromiyaa | Manneen barnoota naannoole oromiyaa keessas jiran hunda keessatti afaan amariffa akka fudhatamu ministerri barnoota nannicha ifa godheera | 0

Appendix-2: Afaan Oromo stop-words.


Ibseera jedhameera aane kanneen keessatti keessas ture
beeksiisera eeramera baasen akkasumas himaniiru akkaata akkuma
Dabalatan alatti amma ammoo bahuun bira booda
booddee dabalates inni irra irraa dhaan dudduuba
dura eega eegana eegasi fuulle gararraa gama
garuu garas gidduu hamma haala akka amma
ammoo dabarsu boodarras himu haasa fi irraa
galatti gara garuu ibsan ibsaniiru ibsameera irra
irratti isa isaa isaanii isaatiif isaatin henna

hunda himuu hoggaa hoguu waan ofii akka


Kun sun an kan inni isaan isheen
nu nuyi keenya keenyaa koo kee sun
ani ini Isaan iseen isaa akka kan
akkasumas Booda Eega kanaaf kanaafuu tanaafuu Ammo
kanaafi Immoo akkuma Garuu yookiin yookaan tanaaf

tanaafi akka jechuun jechuu Illee Fi Moo
jechaan Osoo Odoo Akkum booda booddee Dura
Saniif tanaaf tanaafi tanaafuu itumallee waan hoggaa
Yeroo Akka ishee otuu akkasumas Malee Osoo

Appendix-3 List of Afaan Oromo abbreviations

W.K.F Waan kana fakkaatan Mud. Mudde


K.K.F Kan kana fakkaatan Gurr. Gurraandhala
FKN. Fakkeenyaaf Mil. Miliyoona
Add. Adde Hag. Hagayya
Onk. Onkololessa Bit. Bitotessa
W.D. Waare dura Obb. Obboo
W.B Waare Booda Ama. Amajjii
A.L I Akka lakkofsa Itiyophiya Bil. Biliyoona

