Delhi Technological University

REPORT

Prepared By:

Abhishek Soren (2K19/CO/025)
Ankit Tomar (2K19/CO/062)
Anshul Joshi (2K19/CO/066)

Submitted To: Mr. Nikhil Mishra


ACKNOWLEDGEMENT

At the very outset of this report, we would like to extend our sincere and
heartfelt gratitude to everyone who guided us through this project.

A special thanks to Mr. Nikhil Mishra for teaching us the subject
"Artificial Intelligence". He helped us visualize the subject and find its
applications in real life, supervised us through its intricacies, and offered
many relevant and productive recommendations for the project, for which we are
very grateful. We would also like to extend sincere gratitude to our
Vice-Chancellor, Prof. Jai Prakash Saini, for allowing students to build their
practical skills through crucial subjects like Artificial Intelligence.

Finally, a thank you to all our friends who helped us with the project and
contributed worthy ideas.
ABSTRACT

The recent growth of online social media has created a global communication
channel; every person is kept up to date with current events. Unconfirmed and
misleading content on the internet is referred to as fake news, and a
significant portion of the fake news on the internet has the potential to
create trouble in society (An overview of online fake news: Characterization,
detection, and discussion).

The propagation of a high volume of misinformation on social media has become a
global risk, so fake news detection on social media is now attracting much
attention in the research community (Deep learning for misinformation detection
on online social networks: a survey and new perspectives). We use machine
learning and deep learning techniques to detect fake news. Both play an
important role, but deep-learning-based techniques give us more accurate
results than ML techniques. We begin by introducing the implications of fake
news, then elaborate on the datasets and the techniques that we have used, and
explain how difficult it is to acquire valid information for fake news and
rumour detection (A Comprehensive Review on Fake News Detection With Deep
Learning).

Finally, we make suggestions for further research directions to improve fake
news detection techniques.

Keywords: Deep learning, Online social networks, Natural language processing,
Fake news.
INTRODUCTION

These days we are more dependent on the internet: in the past we got
information from newspapers, television, etc., but today we get most of our
information from social media. The rise of social media platforms has been
critical in this revolution in how we get information [2]. Two related terms
are rumour and fake news: fake news is information that someone intentionally
spreads knowing it is false, while a rumour is unverified and ambiguous
information [1]. Fake news began to spread quickly and grew in popularity
around the 2016 US election [3]. During the current COVID-19 pandemic, fake
news is spreading at a steadily increasing rate. Many organisations such as
Facebook, Twitter, Instagram, and Google are searching for ways to resolve this
issue. Facebook, for instance, made a program to make its users aware, and
posts containing fake news are ranked lower in the search feeds; Instagram
links anyone looking for information about the virus to a customised message
with valid information; and Twitter ensures that searches for the virus produce
trustworthy content [4]. Various techniques are used to find fake news, but it
cannot be detected easily because there is often no valid reference data
against which a story can be judged. Machine learning and deep learning
techniques are now used to detect fake news on social media. ML systems use
fake news from major media outlets to train algorithms for detecting fake
news. Some approaches detect fake news by analyzing metadata, such as comparing
the article's release date with the timeliness of its distribution, as well as
where the story originated [4]. ML methods used for detecting false information
include sentiment analysis, knowledge verification, and NLP. DL is a
revolutionary technology in the academic community that has been demonstrated
to be more effective than previous ML methods in detecting fake news. DL has
several advantages over ML, including a) automated feature extraction, b) less
reliance on manual data preparation, c) the capacity to extract
high-dimensional features, and d) improved accuracy [1]. Researchers in
academia and industry have applied DL to a wide range of decision-making
applications [5].

The body of the paper is structured as follows. Section 2 focuses on the
consequences of fake news. Section 3 describes the datasets. Section 4 is
related work, where we examine, analyse, and summarise existing knowledge
regarding fake news. Section 5 covers open issues and challenges. Section 6
concludes.

1. A Comprehensive Review on Fake News Detection With Deep Learning
2. A Survey on Fake News and Rumour Detection Techniques
3. Behind the Cues: A Benchmarking Study for Fake News Detection
4. Approaches to Identify Fake News: A Systematic Literature Review
5. Deep learning for misinformation detection on online social networks: a
   survey and new perspectives
FAKE NEWS CONSEQUENCES
(From A Comprehensive Review on Fake News Detection With Deep Learning)

Fake news has always existed, but with advancements in technologies around the
globe, the scale at which fake news spreads has increased exponentially. News
can mold our views on a topic, and if it is fake it can have a greater impact
on the whole social scenario. We humans make our decisions with whatever
information we have, but if that information itself is wrong, we may go in the
wrong direction.

Impact on Innocent People: Rumors can ruin someone's life. Social media, due to
its popularity, has its bad side as well: where fake news is involved, people
may get accused of things they have not done, celebrities receive death threats
and bullying, and careers can be affected.

Impact on Health: The number of people searching for health-related news on the
internet is constantly increasing, and the occurrence of fake news greatly
affects them [36]. Health misinformation had a great impact last year [37].

Financial Impact: Fake news is a crucial problem for the industry sector.
People spread fake news for their own profit. It changes people's views on
products, and it can even ruin a company when fake news is spread about it.

Democratic Impact: Fake news played a great role in the American elections.
People often use fake news to sway democratic outcomes.
BENCHMARK DATASET

In this section, we discuss the datasets used in various studies for both
training and testing: datasets that can be used for fake news or rumour
detection and for fact checking.

Work                       Source            URL                                                     Detection
Vlachos and Riedel (2014)  PolitiFact,       https://sites.google.com/site/andreasvlachos/resources  Fake news (2 classes)
                           Channel 4
Mitra and Gilbert (2015)   CREDBANK          https://github.com/compsocial/CREDBANK-data             Credibility (5 classes)
Derczynski et al. (2015)   PHEME project     https://www.pheme.eu/software-downloads/                Veracity
Mihaylova et al. (2019)    SemEval-19 T.8    https://competitions.codalab.org/competitions/20022     Fact checking (Q&A classes)
Gorrell et al. (2019)      RumourEval-2019   https://competitions.codalab.org/competitions/20022     Rumour stance labels

(From A Comprehensive Review on Fake News Detection With Deep Learning)

Fake news, Twitter15, and Liar are some popular publicly available datasets for
research and training. Some studies have used their own self-collected data; a
comparative study is difficult in those cases because there is not much
information about the collected data. With the help of a benchmark dataset, a
comparative study can be done using modern state-of-the-art technologies to
detect fake news. Kaliyar et al. [40] reported an accuracy of 93.50% with their
comprehensive model on the Kaggle dataset, which is said to be the highest on
that dataset for fake news detection.
PREREQUISITES

Word embedding
Word embeddings are widely used in both machine learning and deep learning
models. Word2Vec and GloVe are currently among the most widely used word
embedding models for converting words into meaningful vectors. When using
pre-trained embedding models for training, we replace the parameters of the
model's embedding layer with the pre-trained input embedding vectors.
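This initialization step can be sketched as follows. The vocabulary and the
two-dimensional "pretrained" vectors below are illustrative toy values, not
real GloVe or Word2Vec numbers (real vectors are typically 50-300 dimensional);
words missing from the pretrained file keep a small random initialization.

```python
import numpy as np

def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build a (vocab_size x dim) matrix whose rows replace the random
    weights of a model's embedding layer. Words not found in the
    pretrained vectors keep a small random initialization."""
    rng = np.random.default_rng(seed)
    matrix = rng.normal(0.0, 0.05, size=(len(vocab), dim))
    for word, idx in vocab.items():
        if word in pretrained:
            matrix[idx] = pretrained[word]
    return matrix

# Toy vocabulary and toy "pretrained" vectors (illustrative values only).
vocab = {"fake": 0, "news": 1, "spreads": 2}
pretrained = {"fake": np.array([0.1, 0.2]), "news": np.array([0.3, 0.4])}
E = build_embedding_matrix(vocab, pretrained, dim=2)
```

In a Keras or PyTorch model, `E` would then be loaded as the (optionally
frozen) weight of the embedding layer.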

GloVe
GloVe is an unsupervised learning algorithm for obtaining vector representations
for words. Training is performed on aggregated global word-word co-occurrence
statistics from a corpus, and the resulting representations showcase interesting
linear substructures of the word vector space.

BERT
BERT is an advanced pre-trained word embedding model based on a transformer
encoder architecture. We utilize BERT as a sentence encoder, which can
accurately capture the context representation of a sentence. BERT removes the
unidirectional constraint by using a masked language model (MLM): it randomly
masks some of the tokens in the input and predicts the original vocabulary id
of each masked word based only on its context. MLM has increased the capability
of BERT to outperform previous embedding methods. It is a deeply bidirectional
system that can handle unlabelled text by jointly conditioning on both left and
right context in all layers.
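The masking step of MLM pre-training can be sketched as below. This is a
simplified illustration: real BERT masks about 15% of tokens but replaces only
80% of those with [MASK] (10% with a random token, 10% unchanged), a detail
omitted here.

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]", seed=42):
    """Sketch of masked-language-model input preparation: randomly hide
    ~15% of tokens; the model is trained to predict the original token
    at each masked position from the surrounding context."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[i] = tok          # label the model must recover
        else:
            masked.append(tok)
    return masked, targets

tokens = "breaking news the story was completely fabricated".split()
masked, targets = mask_tokens(tokens)
```

The pairs (masked input, hidden targets) are exactly the self-supervised
training examples MLM produces from unlabelled text.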
RELATED WORK

(From An overview of online fake news: Characterization, detection, and
discussion)

This section discusses the state-of-the-art studies on fake news detection, as
well as the overall categorization of current research on online fake news
detection, from which we can learn how detecting different types of false
information differs in terms of features, data mining algorithms, and
platforms.

The widespread propagation of internet fake news has recently increased as a
result of the advent of online social media. Because the information supplied
through social networks is enormous, quick, broad, diversified, and
heterogeneous, online misleading information can have a significant impact on
the entire society. As a result, an increasing number of researchers are
focusing their efforts on detecting misleading information and fake news on
social media sites. The number of studies devoted to detecting false news is
significantly higher than that of other topics (e.g., rumor or satire
detection). Because online fake reviews and rumors are always compact and
information-intensive, their content lengths are often shorter than online fake
news. Traditional language processing and embedding techniques such as
bag-of-words and n-grams are therefore adequate for processing reviews and
rumors, but not powerful enough to extract the underlying relationships of fake
news. Online fake news detection thus requires an advanced embedding approach
to capture the key emotional and semantic order of news content. As already
mentioned, with the recent development of deep learning, algorithms such as
recurrent neural networks and autoencoders have become a powerful tool for
embedding natural language, memorizing important semantic sequences, and
capturing underlying semantic relationships. Therefore, deep learning
algorithms are increasingly being used in online fake news detection.

(From A Comprehensive Review on Fake News Detection With Deep Learning)

A. CONVOLUTIONAL NEURAL NETWORK (CNN)

To deal with ambiguous detection challenges, a few deep learning models have
been presented; the most intriguing are CNNs and RNNs [77]. Researchers are
attempting to improve the efficiency of CNN-based fake news detectors by
utilising the CNN's ability to extract features and improve the categorization
process [132]. CNNs are also gaining prominence in the NLP field, where they
are used to map the characteristics of n-gram patterns. The CNN is a multilayer
feed-forward neural network, analogous to a multilayer perceptron (MLP) [45],
consisting of an input layer, an output layer, and a sequence of hidden layers.

Most of the time, CNNs are utilised for image recognition and classification.
Recent research has produced neural networks with 100 or more hidden layers.
Neural networks are trained with forward-propagation and backward-propagation
algorithms, which update the weights of each layer using the gradient
(derivative) of the cost function.

When the sigmoid activation function is used, the gradient value decreases with
each layer, which lengthens the training time. This problem is known as the
vanishing-gradient problem. A deeper CNN, or a direct connection as in dense
architectures, solves this problem; compared with a regular CNN, a deeper CNN
is also less susceptible to overfitting [67]. Kaliyar et al. [40] proposed a
model, FNDNet (a deep CNN), designed to learn the discriminatory features for
fake news detection using multiple hidden layers. The model is less liable to
overfitting but takes a long time to train. The convolutional layer, pooling
layer, and regularization layer are the most used layers in CNNs for fake news
detection; the input data are manipulated via pooling and convolution
operations. Sections V-A1, V-A2, and V-A3 describe the popular layers used in
CNNs.
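The convolution and pooling operations on text can be sketched as follows: a
kernel of height k slides over the sentence's embedding matrix (each window is
an n-gram), followed by ReLU and max-over-time pooling. The sizes below are
arbitrary toy values.

```python
import numpy as np

def conv1d_maxpool(X, W, b):
    """One CNN feature-extraction step over a sentence: slide a kernel of
    height k (an n-gram window) over the (seq_len x dim) embedding matrix,
    apply ReLU, then max-pool over time to get one feature per filter."""
    k, dim, n_filters = W.shape
    seq_len = X.shape[0]
    feats = np.empty((seq_len - k + 1, n_filters))
    for t in range(seq_len - k + 1):
        window = X[t:t + k]                       # one n-gram window
        feats[t] = np.tensordot(window, W, axes=([0, 1], [0, 1])) + b
    feats = np.maximum(feats, 0.0)                # ReLU
    return feats.max(axis=0)                      # max-over-time pooling

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 8))     # 10 tokens, 8-dim embeddings
W = rng.normal(size=(3, 8, 4))   # trigram kernel, 4 filters
b = np.zeros(4)
features = conv1d_maxpool(X, W, b)
```

The pooled feature vector (one value per filter) is what a classifier head
would consume, regardless of sentence length.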

B. RECURRENT NEURAL NETWORK (RNN)

An RNN is a particular sort of neural network that connects nodes in a directed
graph in a sequential manner: the previous step's output is used as the current
step's input. RNNs are good at making predictions based on time and sequence.
In comparison to CNNs, RNNs are less feature compatible, but they are well
suited to the analysis of consecutive texts and expressions. When tanh or ReLU
is used as the activation function, however, a plain RNN is unable to analyse
very long sequences.

The RNN uses the backward-propagation method for training. To minimize the
error function, training repeatedly takes small steps in the direction of the
negative error derivative with respect to the network weights. For each earlier
layer, the gradients shrink in size; as a result, the RNN has a vanishing
gradient problem in the network's bottom layers. There are three options for
dealing with the vanishing gradient problem:
(1) using the rectified linear unit (ReLU) activation function,
(2) using the RMSProp optimization algorithm, and
(3) using a different network architecture such as long short-term memory
networks (LSTM) or gated recurrent units (GRU).
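The effect behind option (1) can be shown numerically. Backpropagation
multiplies per-layer activation derivatives together; the sigmoid derivative is
at most 0.25, so the product vanishes geometrically with depth, while ReLU's
derivative of 1 (for positive inputs) preserves the gradient. The 20-layer
depth below is an arbitrary illustration.

```python
import numpy as np

def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value (at x = 0) is 0.25."""
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

# Product of per-layer activation derivatives through 20 layers:
depth = 20
sig_chain = np.prod([sigmoid_grad(0.0)] * depth)   # 0.25 ** 20, vanishes
relu_chain = np.prod([1.0] * depth)                # ReLU keeps gradient at 1

print(sig_chain)   # ~9.1e-13
print(relu_chain)  # 1.0
```

Even in the best case for sigmoid, the gradient reaching the bottom layers is
about twelve orders of magnitude smaller than with ReLU.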

Previous research has concentrated on LSTM and GRU rather than plain RNNs. For
propagation tree classification, Bugueño et al. suggested an RNN-based model,
employing the RNN for sequence analysis; relative to the number of training
examples, the number of epochs was set at 200, which is quite high. Different
authors have proposed various RNN models to predict false news items, including
LSTM, GRU, tanh RNN, unidirectional LSTM-RNN, and vanilla RNN. RNNs, and in
particular LSTM, excel at processing sequential data (human language) and at
extracting relevant characteristics from a variety of data sources. LSTM and
GRU are also discussed in Sections V-B1 and V-B2.

Fig. RNN architecture with n sequential layers; x represents the input and y
the output generated by the RNN.
1) LONG SHORT-TERM MEMORY (LSTM)

LSTM models are front runners in NLP problems. The LSTM framework is a deep
learning framework that uses artificial recurrent neural networks; LSTM is a
more advanced version of the RNN. Because back-propagation in recurrent
networks takes a long time, especially for the evolving backflow of errors,
plain RNNs are incapable of learning long-term dependencies. LSTM, on the other
hand, can store 'short-term memories' for 'long durations'. LSTM calculates the
hidden state using a combination of three gates and a cell: an input gate, an
output gate, and a forget gate. The cell has the ability to retain values over
a long period of time. Because a word's connection at the start of the content
can affect the output of a word later in the phrase, LSTM is an extremely
plausible method for dealing with the vanishing gradient problem.
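One LSTM time step with the three gates and the cell can be sketched as below.
The hidden size, input size, and random parameters are toy values for
illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters of the
    input (i), forget (f), and output (o) gates plus the candidate cell
    values (g), in that order."""
    z = W @ x + U @ h_prev + b          # shape (4 * hidden,)
    H = h_prev.size
    i = sigmoid(z[0:H])                 # input gate: what to write
    f = sigmoid(z[H:2*H])               # forget gate: what to keep
    o = sigmoid(z[2*H:3*H])             # output gate: what to reveal
    g = np.tanh(z[3*H:4*H])             # candidate cell values
    c = f * c_prev + i * g              # cell retains long-term memory
    h = o * np.tanh(c)                  # hidden state for this step
    return h, c

rng = np.random.default_rng(1)
H, D = 4, 3                             # toy hidden and input sizes
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)
```

The additive update of `c` (rather than repeated multiplication) is what lets
gradients flow over long sequences.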

Bahad et al. [61] addressed an RNN model suffering from the vanishing gradient
problem by using an LSTM-RNN; LSTM, however, was unable to completely resolve
the problem. When compared to the initial state-of-the-art CNN, the LSTM-RNN
model had higher precision. For rumour detection, Asghar et al. [135] developed
a bidirectional LSTM (Bi-LSTM) combined with a CNN. The model preserves the
sequence information in both directions, and long-term dependence can be
remembered using the Bi-LSTM layer. Although the BiLSTM-CNN outperformed the
other models, the proposed method is computationally expensive.

2) GATED RECURRENT UNIT (GRU)

The GRU is much simpler than the LSTM in terms of structure, owing to the fact
that it has only two gates: reset and update. The GRU controls data flow in the
same way the LSTM does, but without the use of a memory unit: it basically
exposes the entire hidden state, with no separate mechanism controlling what is
revealed. The GRU can learn long-term dependencies well, and compared to LSTM
it is both easier to use and more computationally efficient, which makes it a
good contender for NLP applications. Because the GRU is a newer algorithm, it
has only recently been employed to identify fake news; it offers performance
similar to LSTM at lower computational cost. As a fake news detection model, Li
et al. used a deep bidirectional GRU neural network (a two-layer bidirectional
GRU); the model has a slow convergence rate. S and Chitturi [41] showed that it
is difficult to determine whether one of the gated RNNs (LSTM, GRU) is more
successful, and they are usually chosen on the basis of the available computing
resources. Though GRU is said to have the best outcomes in several studies, it
takes more training time.
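The two-gate structure that makes the GRU lighter than the LSTM can be sketched
in one time step, with toy sizes and random parameters for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Wr, Wh, Uz, Ur, Uh):
    """One GRU time step: only two gates (update z, reset r) and no
    separate memory cell, which is why the GRU is lighter than the LSTM."""
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state
    return (1.0 - z) * z_interp(h_prev, h_tilde, z)  # see below

def z_interp(h_prev, h_tilde, z):
    # interpolate between the old state and the candidate;
    # written as a helper purely for readability of the gate roles
    return h_prev + (z / (1.0 - z + 1e-12)) * 0  # placeholder, not used

# The interpolation is clearer written directly:
def gru_step_direct(x, h_prev, Wz, Wr, Wh, Uz, Ur, Uh):
    z = sigmoid(Wz @ x + Uz @ h_prev)
    r = sigmoid(Wr @ x + Ur @ h_prev)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))
    return (1.0 - z) * h_prev + z * h_tilde        # mix old and new state

rng = np.random.default_rng(2)
H, D = 4, 3
Ws = [rng.normal(size=(H, D)) for _ in range(3)]
Us = [rng.normal(size=(H, H)) for _ in range(3)]
h = gru_step_direct(rng.normal(size=D), np.zeros(H), *Ws, *Us)
```

Counting parameters shows the saving: three weight pairs here versus four for
the LSTM, and no cell state to carry between steps.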

Singhania et al. [87] used a bidirectional GRU for word-by-word annotation; it
captures the meaning of a word within the phrase by using the preceding and
following terms. Shu et al. [100] introduced dEFEND (Explainable fake news
detection), a sentence-comment co-attention subnetwork model that uses news
content and user comments to detect fake news. To improve performance, the
authors combined textual information with a bidirectional GRU (Bi-GRU);
however, the model's learning efficiency is still very low.
C. GRAPH NEURAL NETWORK (GNN)

A graph neural network is a form of neural network that operates on the graph
structure directly. Node classification is a common application of GNNs:
essentially, every node in the network has a label, and the network predicts
the labels of nodes without using the ground truth. The network extends
recursive neural networks by processing a broader class of graphs, including
cyclic, directed, and undirected graphs, and it can handle node-focused
applications without requiring any pre-processing steps [138]. GNNs capture
global structural features from graphs or trees better than the deep learning
models discussed above [139]. However, GNNs are prone to noise in the datasets:
adding even a small amount of noise to the graph via node perturbation or edge
deletion and addition has an adversarial effect on the GNN output. The graph
convolutional network (GCN) is considered one of the basic GNN variants. A
study by Huang et al. [140] claimed to be the first to experiment with a rich
structure of user behavior for rumor detection. Its user encoder uses graph
convolutional networks (GCN) to learn a representation of the user from a graph
created from user behavioral information, and the authors used two recursive
neural networks based on tree structure: a bottom-up RvNN encoder and a
top-down RvNN encoder.

The proposed model performed worse on the non-rumor class because user behavior
information introduces some interference in non-rumor detection. Another study,
by Bian et al. [139], proposed a top-down GCN and a bottom-up GCN using a novel
method, DropEdge [141], for reducing over-fitting of GCNs. In addition, a root
feature enhancement operation is utilized to improve the performance of rumor
detection. Although it performed well on three datasets (Weibo, Twitter15,
Twitter16), outliers in the data affected the models' performance.


On the other hand, GCNs incur a significant memory footprint by storing the
complete adjacency matrix. Furthermore, GCNs are transductive, which implies
that the nodes to be inferred must be present at training time, and they do not
guarantee generalizable representations [142]. Wu et al. [143] proposed a
representation-learning algorithm with a gated graph neural network named PGNN
(propagation graph neural network). The suggested technique can incorporate
structural and textual features into high-level representations by propagating
information among neighbor nodes throughout the propagation network, and the
authors added an attention mechanism to obtain considerable performance
improvements. The propagation graph is built using the who-replies-to-whom
structure, but follower-followee and forwarding relationships are omitted.
Zhang et al. [144] presented a simplified aggregation graph neural network
(SAGNN) based on efficient aggregation layers; experiments on publicly
accessible Twitter datasets show that the proposed network outperforms
state-of-the-art graph convolutional networks while considerably lowering
computational costs.
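A single layer of the standard GCN formulation that these rumor-detection
models build on can be sketched as follows. The 4-node "who-replies-to-whom"
graph and the feature sizes are invented toy values.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W).
    Self-loops (A + I) and symmetric degree normalization follow the
    standard GCN formulation; each node's new features mix its own
    features with those of its neighbors."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)                       # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # normalized adjacency
    return np.maximum(A_norm @ H @ W, 0.0)      # propagate + ReLU

# Toy propagation graph of 4 posts (e.g. who-replies-to-whom edges).
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
rng = np.random.default_rng(3)
H0 = rng.normal(size=(4, 5))    # 5-dim node (post) features
W0 = rng.normal(size=(5, 2))    # project down to 2 hidden features
H1 = gcn_layer(A, H0, W0)
```

Stacking such layers lets information propagate further along the reply graph,
which is how structural cues of a rumor cascade enter the representation.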
D. GENERATIVE ADVERSARIAL NETWORK (GAN)

GANs are DL-based generative models. A GAN is divided into two sub-models: a
generator and a discriminator. The generator model generates new samples (e.g.,
images) that are similar to the originals using the features learned from the
training data, while the discriminator model predicts whether a given sample is
real or generated.

GANs have proven to be beneficial for generative modeling and for training
discriminators. They are effective when categories are imbalanced or data is
insufficient, since GANs can build synthetic data, and novel methods are used
to build GAN-generated datasets. A unique problem in fake news detection is
recognizing false news about newly emerging events on social media. To address
this, Wang et al. [44] proposed an event adversarial neural network that
recognizes false news on upcoming events; a multimodal feature extractor, a
fake news detector, and an event discriminator are its three parts.

E. ATTENTION MECHANISM (AM) BASED

Attention serves as a link between the encoder and the decoder, allowing the
decoder to access information from each of the encoder's states. Using this
framework, an attention-based model selectively concentrates on the valuable
components of the input, so it deals effectively with longer input sentences.
The advantage of AM is that it adds feature weights to the model. One study
showed that attention combined with LSTM methods can help enhance fake news
identification: three-level hierarchical attention networks (3HAN) operate at
the word, sentence, and headline levels, assigning different weights to
different areas of an article. 3HAN yields more understandable results than
other DL models, but it provides limited context for believability analysis.
The semantics of the article may not be completely reflected by the LSTM neural
network, and when all the vector representations of words in the text are
concatenated, the result is a vast vector dimension.
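The core computation behind attention can be sketched as scaled dot-product
attention: each decoder query produces a softmax weighting over the encoder
positions, and the output is the correspondingly weighted mix of values. The
sizes below are arbitrary toy values.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each output row is a weighted mix of
    the value rows V, so the model selectively concentrates on the most
    relevant input positions instead of one fixed-size summary vector."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # relevance scores
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights

rng = np.random.default_rng(4)
Q = rng.normal(size=(2, 6))   # 2 decoder queries
K = rng.normal(size=(5, 6))   # 5 encoder keys
V = rng.normal(size=(5, 6))   # 5 encoder values
out, w = attention(Q, K, V)
```

In a hierarchical network such as 3HAN, the same operation is applied at the
word, sentence, and headline levels, each with its own learned queries.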

F. BIDIRECTIONAL ENCODER REPRESENTATIONS FROM
TRANSFORMERS (BERT)

BERT is a deep learning model that has shown cutting-edge results across a wide
variety of natural language processing applications. BERT incorporates
pretrained language representations developed by Google; it is a complex
pre-trained word-embedding model built on a transformer-encoder architecture
[89]. The BERT method is distinctive in its capacity to identify and capture
contextual meaning in a sentence or text [90]. The main problem with most
language models is that they are unidirectional, which restricts the
architectures that can be utilized during pre-training. The BERT model
eliminates this unidirectional limitation by using a masked language model
(MLM), and it employs the next sentence prediction (NSP) task in addition to
the MLM to jointly pre-train text-pair representations.

The data utilized in the BERT model are generic data picked up from Wikipedia
and the Book Corpus. Such data contain a wide range of information; however,
specific information on individual domains is still lacking. To overcome this
problem, a study by Jwa et al. [75] used news data in the pre-training phase to
boost fake news identification skills. Compared to the state-of-the-art stacked
LSTM model, the proposed model, named exBAKE (BERT with extra unlabeled news
corpora), outperformed it by a 0.137 F1-score.

Ding et al. [154] discovered that incorporating mental features, such as a
speaker's credit history, at the language level can considerably improve BERT
model performance. The history feature helps construct the relationship between
the event and the person in reality. However, these studies did not consider
any pre-processing methods.

Zhang et al. [91] presented a BERT-based domain adaptation neural network for
multimodal fake news detection (BDANN). BDANN is made up of three major
components: a multimodal feature extractor, a domain classifier, and a fake
news detector. In the multimodal feature extractor, the pre-trained BERT model
is used to extract text features, whereas the pre-trained VGG-19 model is used
to extract image features. The extracted features are then concatenated and
sent to the detector to differentiate between fake and real news.

Kaliyar et al. [92] proposed a BERT-based deep convolutional approach
(fakeBERT) for fake news detection. fakeBERT combines BERT with several
parallel blocks of a one-dimensional deep convolutional neural network
(1d-CNN) using different kernel sizes and filters. These different filters help
extract information from the dataset, and the combination of BERT with the
1d-CNN handles both large-scale structured and unstructured data, which can
help in dealing with ambiguity.
G. ENSEMBLE APPROACH

Ensemble approaches are methods that generate several models and combine them
to achieve better results; ensemble models typically yield more precise
solutions than a single model. An ensemble reduces the distribution or
dispersion of predictions and improves model efficiency. Ensembling can be
applied to supervised and unsupervised learning tasks [86], and many
researchers have used an ensemble approach to boost performance [42], [133].
Agarwal and Dixit [63] combined two datasets, Liar and Kaggle, evaluated the
performance of LSTM, and achieved an accuracy of 97%. They also used various
models, including CNN, LSTM, SVM, naive Bayes (NB), and k-nearest neighbour
(KNN), for building an ensemble model. They reported the average accuracy score
of the individual algorithms but did not report the accuracy of their ensemble
model, which is a limitation of their work.
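A minimal form of combining several models is soft voting: average each model's
predicted fake-news probability and threshold the mean. The three probability
lists below are invented values standing in for the outputs of, e.g., a CNN, an
LSTM, and a naive Bayes classifier.

```python
def soft_vote(prob_lists):
    """Average the fake-news probability predicted by several models and
    threshold the mean at 0.5 -- a minimal soft-voting ensemble."""
    n_models = len(prob_lists)
    n_items = len(prob_lists[0])
    labels = []
    for i in range(n_items):
        avg = sum(p[i] for p in prob_lists) / n_models
        labels.append(1 if avg >= 0.5 else 0)   # 1 = fake, 0 = real
    return labels

# Hypothetical per-article probabilities from three models.
cnn  = [0.9, 0.2, 0.6]
lstm = [0.8, 0.4, 0.4]
nb   = [0.7, 0.1, 0.7]
labels = soft_vote([cnn, lstm, nb])
print(labels)  # [1, 0, 1]
```

Averaging probabilities dampens the variance of any single model, which is
exactly the dispersion-reducing effect the paragraph above describes.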

The CNN-LSTM ensemble approach has often been used in previous DL-based
studies. Kaliyar [67] used an ensemble of CNN and LSTM; the accuracy was
slightly lower than that of the state-of-the-art CNN model, but the precision
and recall were effectively improved.

Asghar et al. [135] increased the efficiency of their model using a Bi-LSTM.
The Bi-LSTM retains knowledge from both former and upcoming contexts before
rendering its input to the CNN model. Even though CNN and RNN typically require
huge datasets to function successfully, Ajao et al. [133] trained an LSTM-CNN
with a smaller dataset. The above-mentioned works mainly considered text-based
features for fake news classification, whereas the addition of new features may
generate more significant results. While most studies used CNN with LSTM, a
study by Amine et al. [131] merged two convolutional neural networks to
integrate metadata with text. They illustrate that integrating metadata with
text results in substantial improvements in fine-grained fake news detection;
furthermore, when tested on real-world datasets, this approach shows
improvements compared to the text-only deep learning model.

Kumar et al. [86] employed an attention layer, which helps the CNN+LSTM model
learn to pay attention to particular regions of the input sequence rather than
the full series. Utilizing the attention mechanism with CNN+LSTM was reported
to be more efficient by a small margin.
(From Deep learning for misinformation detection on online social networks: a
survey and new perspectives)

Discriminative models for detecting misinformation

Social content and context-based features have been incorporated in a range of
discriminative models for misinformation detection (MID). Several studies in
recent years have tackled the problem of disinformation, with promising
preliminary results. Here we take a quick look at three discriminative models:
CNN, RNN, and RvNN. Discriminative models have shown significant improvements
in text classification and analysis.

Convolutional Neural Network (CNN): CNN is one of the most popular and commonly
utilised models (LeCun et al. 2010), and it has recently been widely used in
the NLP community as well (Jacovi et al. 2018). Chen et al. (2017), for
example, proposed a convolutional neural network-based classifier with single-
and multi-word embeddings for detecting both rumour and stance tweets.

To tackle MID, Kumar et al. (2019) used a CNN and a bidirectional LSTM
ensembled network with an attention mechanism. Furthermore, according to Yang
et al. (2018), online social media is rising in popularity, and real users are
being targeted by a large number of fake users; they stated that fake news is
created with the purpose of deceiving users. They used the TI-CNN model to
identify explicit and latent features from text and image data, and
demonstrated that their methodology effectively solves the problem of detecting
fake news.
Recurrent Neural Network (RNN)

RNN makes use of the network's sequential information, which is critical in
many applications where the underlying structure of the data sequence conveys
useful knowledge (Alkhodair et al. 2020). One of RNN's advantages is its
capacity to better capture contextual information. Existing methods for
detecting rumours rely on handcrafted features and machine learning algorithms,
which necessitate a significant amount of manual labour. To address this
problem, Ma et al. (2016) reported the first use of RNNs for rumour detection,
while Chen et al. (2018) and Jin et al. (2017b) reported the first use of
recurrent neural networks with attention mechanisms. Different RNN
architectures, such as tanh-RNN, LSTM, and Gated Recurrent Unit (GRU), have
been proposed (Cho et al. 2014a). Among the proposed architectures, GRU
obtained the best results on both datasets considered, with 0.88 and 0.91
accuracy, respectively. Ma et al. (2016) proposed an RNN model that learns to
capture variations in relevant information in posts over time. They also
described how RNN utilises the sequential information in the network, where the
embedded structure in the data sequence conveys useful knowledge, and
demonstrated that their model can capture more information from hidden layers,
giving better results than the other models.
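The GRU update that performed well in these comparisons can be sketched as follows; the weights and input sequence below are illustrative random values, not trained parameters from any cited work.

```python
import numpy as np

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step (Cho et al. 2014): gates decide how much of the
    previous hidden state to keep versus overwrite."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h = 6, 4
W = [rng.normal(scale=0.1, size=(d_h, d_in)) for _ in range(3)]
U = [rng.normal(scale=0.1, size=(d_h, d_h)) for _ in range(3)]
h = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):  # a sequence of 5 post embeddings
    h = gru_step(x, h, W[0], U[0], W[1], U[1], W[2], U[2])
print(h.shape)  # (4,)
```

The final hidden state summarises the whole post sequence; in a rumour detector it would feed a classification layer.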

Recursive Neural Network (RvNN)


Researchers are increasingly concerned with identifying unscrupulous users in
social networks and protecting genuine users from fraudulent behaviour (Guo et
al. 2019). RvNN is one of the most widely used and successful networks for many
natural language processing (NLP) tasks (Socher et al. 2013; Zubiaga et al.
2016a). This architecture processes objects in a hierarchical structure to make
predictions and classifies the outputs using compositional vectors. To
reproduce the patterns of the input layer at the output layer, the network is
trained by auto-association. The model also analyses a text word by word and
stores the semantics of all previous text in a fixed-size hidden layer (Cho et
al. 2014b). For instance, Zubiaga et al. (2016b) proposed a RvNN architecture
for handling input of different modalities. Ma et al. (2018) proposed a model
that collects tweets from Twitter and extracts features from discriminating
information; it follows a non-sequential pattern to present a more robust
identification of the various types of rumour-related content structures.
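A minimal sketch of the bottom-up composition an RvNN performs over a tree might look like this; the tree, embeddings, and weights are hypothetical stand-ins, not values from the cited studies.

```python
import numpy as np

def compose(node, W, b):
    """Recursively compose child vectors bottom-up, as in an RvNN:
    a node is either a leaf embedding or a (left, right) pair, and
    each internal node gets a compositional vector."""
    if isinstance(node, np.ndarray):
        return node  # leaf: word embedding
    left = compose(node[0], W, b)
    right = compose(node[1], W, b)
    return np.tanh(W @ np.concatenate([left, right]) + b)

rng = np.random.default_rng(2)
d = 4
W = rng.normal(scale=0.1, size=(d, 2 * d))
b = np.zeros(d)
# Tree for a 3-word phrase: ((w1 w2) w3)
w1, w2, w3 = (rng.normal(size=d) for _ in range(3))
root = compose(((w1, w2), w3), W, b)
print(root.shape)  # (4,)
```

The root vector summarises the whole phrase; rumour detectors built on RvNNs classify from such compositional vectors.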

Generative model for detecting misinformation


Over time, online social media platforms have become the primary target of
misinformed beliefs, with misleading opinions (such as rumour, spam, troll, and
fake news) designed to sound genuine or to create uneducated opinions. Several
existing approaches for MID are based on opinion-related syntactic and lexical
patterns. Accordingly, this part presents the successful usage of five
generative models, namely RBM, DBN, DBM, GAN, and VAE, on diverse
classification applications.

Restricted Boltzmann Machine (RBM)

An RBM is a generative stochastic artificial neural network that can learn a
probability distribution over its collection of inputs (Liao et al. 2016).
Although learning is impractical in general Boltzmann machines, it can be quite
effective in a restricted Boltzmann machine architecture, which does not permit
intra-layer connections between hidden units (Papa et al. 2015). As a result of
this strategy of stacking RBMs, it is possible to efficiently train several
layers of hidden units. RBMs have been used in a variety of applications, but
just a few studies have focused on MID.
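The restriction to no intra-layer connections is what makes sampling tractable: each layer can be sampled in parallel given the other. A single Gibbs step in a binary RBM can be sketched as below; the weights are illustrative, not drawn from any cited study.

```python
import numpy as np

def rbm_gibbs_step(v, W, b_h, b_v, rng):
    """One Gibbs step in a binary RBM: sample hidden units given the
    visible layer, then reconstruct the visible layer. With no
    intra-layer connections, each layer is sampled in parallel."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    p_h = sigmoid(W @ v + b_h)                 # P(h=1 | v)
    h = (rng.random(p_h.shape) < p_h) * 1.0    # sample hidden units
    p_v = sigmoid(W.T @ h + b_v)               # P(v=1 | h)
    return (rng.random(p_v.shape) < p_v) * 1.0 # reconstructed visibles

rng = np.random.default_rng(3)
n_v, n_h = 6, 3
W = rng.normal(scale=0.1, size=(n_h, n_v))
v = (rng.random(n_v) < 0.5) * 1.0              # a binary input vector
v_recon = rbm_gibbs_step(v, W, np.zeros(n_h), np.zeros(n_v), rng)
print(v_recon.shape)  # (6,)
```

Training procedures such as contrastive divergence compare statistics of the data and of such reconstructions to update W.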

However, in recent years researchers have attempted to fit this method to
identifying fake news, rumour, spam, etc. on social media platforms. For
instance, Da Silva et al. (2018), da Silva et al. (2016), and Silva et al.
(2015) applied RBMs to automatically extract features related to spam.

Deep Belief Network (DBN)

DBN is a generative graphical model composed of multiple layers of latent
variables (hidden units), with connections between the layers but not between
units within each layer. DBNs can be viewed as a composition of simple,
unsupervised networks such as restricted Boltzmann machines (RBMs) or
autoencoders, where each subnetwork's hidden layer serves as the visible layer
for the next. Many works have already used this network (Li et al. 2018b;
Yepes et al. 2014; Alom et al. 2015; Selvaganapathy et al. 2018). For example,
Tzortzis and Likas (2007) characterised spam as an unexpected message
containing inappropriate information and were the first to fit DBNs for spam
detection. In another paper, Wei et al. (2018) proposed a DBN-based method to
identify false data injection attacks in the smart grid, demonstrating that it
achieves a better result than the traditional SVM-based approach.

Deep Boltzmann Machine (DBM)

DBM is a type of binary pairwise Markov random field with multiple layers of
hidden random variables. It is a network of symmetrically coupled stochastic
binary units which has been used to detect malicious activities (Zhang et al.
2012; Dandekar et al. 2017). For example, Jindal et al. used a multimodal
benchmark dataset for fake news detection and presented results from a Deep
Boltzmann Machine-based multimodal DL model (Srivastava and Salakhutdinov
2012). Zhang et al. (2012) built a DBM-based model to detect spoken queries
and reported a 10.3% improvement over the previous Gaussian model.

Generative Adversarial Network (GAN)

GAN is a class of ML systems (Goodfellow et al. 2014). Given a training set,
this technique learns to generate new data with the same statistics as the
training set. Earlier studies suggest that widespread rumours usually result
from the deliberate dissemination of information, generally aimed at forming a
consensus on rumour news events.

Ma et al. (2019) proposed a generative adversarial network model to make
automated rumour detection more robust and efficient; it is designed to
identify powerful features related to uncertain or conflicting voices and
rumours.

Variational Autoencoder (VAE)

VAE models make strong assumptions concerning the distribution of latent
variables. The use of a variational approach for latent representation
learning results in an additional loss component and a specific estimator for
the training algorithm, called the stochastic gradient variational Bayes
(SGVB) estimator. Qian et al. (2018) proposed a generative conditional VAE
model to extract new patterns by analysing a user's past meaningful responses
to true and false news articles, which played a vital role in detecting
misinformation on social media. Wu et al. (2017) explored whether knowledge
from historical data analysis can benefit rumour detection, finding that
similar rumours tend to produce similar patterns.
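The SGVB estimator mentioned above rests on the reparameterisation trick: sampling the latent code as a deterministic function of noise so gradients flow through. A minimal sketch, with hypothetical encoder outputs, is:

```python
import numpy as np

def vae_latent_and_kl(mu, log_var, rng):
    """Reparameterisation trick used in SGVB: sample z = mu + sigma*eps
    with eps ~ N(0, I), and compute the closed-form KL divergence from
    N(mu, sigma^2) to the N(0, I) prior (the extra VAE loss term)."""
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return z, kl

rng = np.random.default_rng(4)
mu = np.zeros(3)       # hypothetical encoder mean for one input
log_var = np.zeros(3)  # hypothetical encoder log-variance
z, kl = vae_latent_and_kl(mu, log_var, rng)
print(z.shape, kl)  # (3,) 0.0 -- KL is zero when posterior equals the prior
```

In a full VAE, this KL term is added to a reconstruction loss and both are minimised jointly over encoder and decoder parameters.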

Hybrid model for detecting misinformation


The tasks of detecting misinformation (such as fake news, rumour, spam, troll,
false information, and disinformation) have been tackled in a variety of ways.
Much research has been done using various DL models separately; however, to
improve on the performance of individual models, the need for hybrid models is
immense. Therefore, over the last few decades, hybrid DL has been considered
an emerging technique for various purposes. In this section, we review some
related works on MID based on deep hybrid models, covering CRNN, CRBM, EBF,
and LSTM-based models.

Convolutional Recurrent Neural Network (CRNN)


Currently, researchers are actively combining CNN and RNN models in a hybrid
fashion to gain higher performance in diverse applications, since real-world
data are structured sequences with spatio-temporal regularities. Several
publications, for example, used a combination of CNN and RNN to model spatial
and temporal regularities (Lin et al. 2019; Xu et al. 2019; Wang et al. 2019).
Their models can analyse time-varying visual inputs to make predictions of
variable length. In these neural network architectures, a CNN for visual
feature extraction is combined with an RNN for sequence learning. Furthermore,
such models have been successfully used to detect fake news, rumour,
misleading information, and spammers.

Lin et al. (2019), for example, introduced a novel rumour identification
approach based on a hierarchical recurrent convolutional neural network to
recognise rumours for events on social media platforms. They learn contextual
information using the RCNN model and time-period information using a
bidirectional GRU network. For rumour identification on Sina Weibo, Xu et al.
(2019) developed a CRNN model to extract data from textual overlays, such as
captions, key ideas, or scene-level summaries; the model generates training
data for textual overlays that occur frequently on the Sina Weibo platform.
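A toy version of the CNN-then-RNN pipeline these hybrids share might look like the following; the frame inputs and weights are random placeholders, not values from the cited systems.

```python
import numpy as np

def crnn_encode(frames, conv_w, rnn_w, rnn_u):
    """CNN-then-RNN hybrid in miniature: a convolutional stage extracts
    features from each input frame, and a simple tanh RNN aggregates
    the resulting feature sequence into one vector."""
    # convolutional stage: ReLU features per frame
    feats = np.maximum(0.0, frames @ conv_w)   # (time, d_feat)
    # recurrent stage: run a plain RNN over the feature sequence
    h = np.zeros(rnn_u.shape[0])
    for f in feats:
        h = np.tanh(rnn_w @ f + rnn_u @ h)
    return h

rng = np.random.default_rng(5)
frames = rng.normal(size=(7, 10))  # 7 time steps, 10-dim raw input
conv_w = rng.normal(scale=0.1, size=(10, 5))
rnn_w = rng.normal(scale=0.1, size=(4, 5))
rnn_u = rng.normal(scale=0.1, size=(4, 4))
h = crnn_encode(frames, conv_w, rnn_w, rnn_u)
print(h.shape)  # (4,)
```

The design point is the division of labour: the convolutional stage handles local spatial structure, the recurrent stage the temporal ordering.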

Convolutional Restricted Boltzman Machine (CRBM)


An extension of the RBM model, called the convolutional RBM (CRBM), was
developed by Norouzi (2009). CRBM, like the RBM, is a two-layer model in which
visible and hidden random variables are organised as matrices. He presented a
method for combining a Boltzmann machine and a convolutional restricted
Boltzmann machine into a deep network for image processing and feature
extraction, along with a simple and intuitive training strategy that optimises
all RBMs in the network at the same time, which has proven effective in
practice. Norouzi et al. (2009) also suggested a CRBM model for learning
object class-specific properties, in which the spatial structure of images is
used to build a layered hierarchy of filtering and pooling where connections
are local and weights are shared.

Ensemble-Based Fusion (EBF)


To study profile information, Wang (2017) proposed a hybrid model in which
speaker profiles are used as part of the input data. He also created the first
large-scale fake news detection benchmark dataset, which included speaker
attributes such as speech location, affiliation, job description, credit
history, and topic. Tschiatschek et al. (2018) looked at the crucial problem
of detecting fake news using crowd signals. They used novel techniques
designed to perform Bayesian inference to detect fake news by analysing user
flagging habits, and their tests were successful in detecting real users'
flagging behaviour. Zhang et al. (2018b) investigated a novel method for
detecting fake news on social media and identified certain deceptive words
that online fake users could employ to harm society.

Roy et al. (2018) studied misinformation and applied various DL techniques
such as CNN, Bi-LSTM, and MLP to detect fake news, claiming that the rate of
misinformation is increasing rapidly.

LSTM Density Mixture Model


Although earlier methods for automatically detecting fake news have relied on
lexical features, the hybrid deep neural network has gotten a lot of attention
throughout the world. For example, according to Ruchansky et al. (2017), fake
news identification has received a lot of interest from both the research and
academic worlds. In their work, they identified three characteristics of fake
news:

(1) the text of an article,


(2) the user response it receives, and
(3) the source on which users promote it.
They discovered that fake news had a significant impact on public sentiment.
Existing research has primarily focused on tailoring solutions to a single
problem, with mixed results. Ruchansky et al. (2017), on the other hand,
suggested a hybrid model that combines all of these properties to predict a
more accurate and automated result. Similarly, Long et al. (2017) suggested a
unique method for detecting fake news that incorporates speaker profiles into
an attention-based LSTM model. Furthermore, multiple studies have shown that
an LSTM-based hybrid model works better for longer sentences, and attention
models have been proposed to rate the value of various words in their context
(Tang et al. 2015; Prova et al. 2019). According to Kudugunta and Ferrara
(2018), bots have been used to influence political elections by distorting
online debate, as well as to manipulate the stock market and potentially cause
public health scares. To detect bots, they used an LSTM-based architecture
that takes advantage of both content and metadata, and stated that their model
can achieve an accuracy of more than 0.96 AUC. The automatic detection and
filtering of unsuitable messages or remarks, according to Yenala et al.
(2018), has become a significant problem for improving the quality of
conversations with users and virtual agents; they suggested a new hybrid DL
model to detect inappropriate language automatically.

Zhao et al. (2015) created a hybrid model, C-LSTM, combining CNN and LSTM for
sentiment analysis of movie reviews and question-type classification.
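The attention weighting these LSTM hybrids use to rate the value of words in context can be sketched as simple dot-product attention over hidden states; all inputs below are illustrative random values.

```python
import numpy as np

def attention_pool(hidden_states, query):
    """Dot-product attention over LSTM hidden states: score each word's
    state against a query vector, softmax the scores, and return the
    attention-weighted sum as the sentence representation."""
    scores = hidden_states @ query
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ hidden_states, weights

rng = np.random.default_rng(6)
states = rng.normal(size=(5, 4))  # 5 words, 4-dim hidden states
query = rng.normal(size=4)        # learned context vector in real models
context, weights = attention_pool(states, query)
print(context.shape, round(float(weights.sum()), 6))  # (4,) 1.0
```

The weights form a distribution over words, which is also what allows such models to highlight which words drove a prediction.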
OPEN ISSUES AND CHALLENGES

from Deep learning for misinformation detection on online social networks:
a survey and new perspectives

Here we summarise the limitations identified in the various proposed methods:

The semantics of misinformation, which is written to mislead people and
programs, are difficult to understand, and existing studies for MID (Shu et
al. 2019a; Braşoveanu and Andonie 2019) must cover various language styles.

Existing studies (Wang et al. 2018; Farajtabar et al. 2017; Jin et al. 2017b)
state that misinformed news can take the form of images, videos, and text, and
such different modalities can provide clues for MID. Extracting the features
that are most prominent and important, however, is very difficult.

Just detecting fake news is not enough to solve this problem; rather, we
should find the source and the ways through which the fake news pipeline was
built. Deeply understanding the role of nodes in information spreading and
their epidemic-spreading capability is very important.

Most research works (a) tend to focus on alerting users but give no
explanation as to why something is misinformation, and (b) focus on directly
engaged users for the detection of misinformation. But even when users are not
directly related, some of them play an effective role in spreading
misinformation on online social media, and because they are not directly
related, identifying them is very difficult.
Anomalous and Normal User Identification

As the number of people who depend on online social media grows, dishonest
users try to exploit this opportunity, in most cases for their own benefit
(Zhao et al. 2014; Feng and Hirst 2013). Although researchers have used many
methods to identify dishonest users, many more approaches can be investigated;
for example, a new technique or a modified version of an existing technique
could be developed.

Advanced and new deep learning methods such as reinforcement learning can be
used to solve this problem, explore more information, and better detect,
dissect, and understand disinformation.

Most studies on fake news-related problems analyse the effects of static data
but have not analysed the effects of topology on real-time data (Wei and Wan
2017; Kim et al. 2018). Therefore, we need to consider dynamic models and
data-capturing methods that capture the uncertainty of user behaviour to
reduce the spread of fake news and misinformation.
CHALLENGE RESOLUTION AND FUTURE RESEARCH DIRECTIONS

Just like anything in nature, everything has good and bad aspects, and the
spread of misinformation is one such case. There has been a lot of research
work on MID, resulting in various new techniques. Hence, in this paper we have
analysed effective ways to eliminate misinformation for the well-being of
people. Misinformation can be detected in online media using DL and SN
techniques. In summary, these are the findings and directions we think will
shape the future of this domain of study:

● DL can work on large-scale data, but finding and processing such datasets
and training a model is difficult and time-consuming. Further research is
needed to train models on dynamic datasets in less time.
● Analysing dynamic data is very important.
● If MID systems gained a new function that shows and explains why a piece of
news is fake, it might help us build better research products that are
acceptable to people.
● Reinforcement learning is gaining pace as a research domain, in which an
agent learns how to interact with an environment and produce positive results
determined from positive feedback. Hence, combining DL and RL can be the
future of this domain.
● Tackling the over-fitting problem in DL, which will help the model work in
real-life situations.
● Feature and classifier selection greatly affects the efficiency of a model.
Long textual features require the use of sequence models (RNNs), but limited
research works have taken this into account. We believe that studies that
concentrate on the selection of features and classifiers might potentially
improve performance.
● User behaviour, user profiles, and social network behaviour need to be
explored. Political and religious biases of people can also help in building
profiles. Fusion of deep text features and statistical features may result in
better accuracy.
● Transfer learning approaches for training a neural network with online data
streams.
● Existing algorithms make critical decisions without providing precise
information about the reasoning that leads to specific decisions, predictions,
recommendations, or actions [161]. Explainable Artificial Intelligence (XAI)
is a study field that tries to make the outcomes of AI systems more
understandable to humans [162]. XAI can be a valuable approach to start making
progress in this area.
CONCLUSIONS
Recently, fake news has emerged as one of the most threatening harms on
social media. Fake news can be used by people in power and malicious actors
to plant wrong ideas in people's minds, affecting decisions in important
daily activities such as stock markets, health-care options, online shopping,
education, and even presidential elections. Automatic detection of online
fake news is an extremely significant but challenging task for both industry
and academia (Ruchansky et al., 2017). In this survey, we present a
comprehensive overview of online fake news detection. The key contributions
of this paper can be summarized as follows.

(1) We have provided an in-depth understanding of the important aspects of
online fake news, such as the news creator/spreader, news targets, news
content, and social context. A clear understanding and characterization of
online fake news or fraud can play a significant role in social communication
data analysis and anomalous information detection.

(2) We compared the existing detection approaches, listed an exhaustive set
of hand-crafted features, and evaluated the existing datasets for training
supervised models, providing a fundamental review of fake news detection. As
a detailed guideline, our survey can bring valuable knowledge and practical
convenience for both researchers and practitioners.

(3) Some potential research focuses are proposed in order to address the open
issues, improve the existing detection frameworks, and establish an effective
online fake news monitoring and detection system.
REFERENCES

1. Islam, M.R., Liu, S., Wang, X. et al. Deep learning for misinformation
detection on online social networks: a survey and new perspectives. Soc.
Netw. Anal. Min. 10, 82 (2020).
https://doi.org/10.1007/s13278-020-00696-x

2. de Beer, D., Matthee, M. (2021). Approaches to Identify Fake News: A


Systematic Literature Review. In: Antipova, T. (eds) Integrated Science in
Digital Age 2020. ICIS 2020. Lecture Notes in Networks and Systems, vol
136. Springer, Cham. https://doi.org/10.1007/978-3-030-49264-9_2

3. M. F. Mridha, A. J. Keya, M. A. Hamid, M. M. Monowar and M. S.


Rahman, "A Comprehensive Review on Fake News Detection With Deep
Learning," in IEEE Access, vol. 9, pp. 156151-156170, 2021, doi:
10.1109/ACCESS.2021.3129329.

4. Georgios Gravanis, Athena Vakali, Konstantinos Diamantaras, Panagiotis


Karadais, Behind the cues: A benchmarking study for fake news detection,
Expert Systems with Applications, Volume 128, 2019, Pages 201-213, ISSN
0957-4174, https://doi.org/10.1016/j.eswa.2019.03.036.

5. Alessandro Bondielli, Francesco Marcelloni, A survey on fake news and


rumour detection techniques, Information Sciences, Volume 497, 2019,
Pages 38-55, ISSN 0020-0255, https://doi.org/10.1016/j.ins.2019.05.035