2022 - IJKL False News Spreading

Int. J. Knowledge and Learning, Vol. 15, No. 4, 2022
The impact on society of false news spreading on
social media with the help of predictive modelling
Riktesh Srivastava
City University College of Ajman,
Sheikh Ammar Road – Al Tallah 2, Ajman, UAE
Email: r.srivastava@cuca.ae
Jitendra Singh Rathore

Faculty of Management Studies,
Banasthali University,
Vanasthali Road, Dist., Vanasthali,
Rajasthan 304022, India
Email: jitendrasinghrathore@banasthali.in
Sachin Kumar Srivastava*

IILM Academy of Higher Learning,
Lucknow, India
Email: drsachinksrivastava@gmail.com
*Corresponding author
Khushboo Agnihotri
Amity Business School,
Amity University,
Amity Rd., Sector 125, Noida,
Uttar Pradesh 201301, India
Email: agnihotrikhushboo@gmail.com
Email: kagnihotri@lko.amity.ac.edu
Abstract: Nowadays, social media is an excellent source of information on the
latest news. Much of the news we read online appears authentic at first sight,
but its authenticity cannot be assured, because it is not always genuine.
According to a report published by Gartner, by 2022 most mature economies
will receive more fake information than correct information, mainly through
social media. Fake news is one of the prevalent threats in our digitally
linked world. This paper proposes a model for recognising fake news using a
dataset from Kaggle. The dataset contains 3,000 news items collected from
various social media sources, of which 2,725 form the training dataset and
275 the test dataset. The fake and real news is classified using five machine
learning classification algorithms, and the results are compared and analysed.
The five classification algorithms are support vector machine (SVM), naïve
Bayes, logistic regression, random forest, and neural networks.
Keywords: support vector machine; SVM; naïve Bayes; logistic regression;
random forest; neural networks; classification accuracy; CA; precision; recall;
F-1 score.
Copyright © 2022 Inderscience Enterprises Ltd.

Reference to this paper should be made as follows: Srivastava, R.,
Rathore, J.S., Srivastava, S.K. and Agnihotri, K. (2022) ‘The impact on society
of false news spreading on social media with the help of predictive modelling’,
Int. J. Knowledge and Learning, Vol. 15, No. 4, pp.307–318.
Biographical notes: Riktesh Srivastava received his PhD in Electronics
Engineering and obtained his Master’s in Management from the Indian Institute
of Management, Ahmedabad (IIMA). He also holds certifications in Marketing
and Customer Analytics from Wharton School, University of Pennsylvania,
USA; Marketing Analytics, ESSEC Business School, France; Fundamentals of
Project Planning and Management, University of Virginia, USA; Electronic
Commerce from Nanyang Technological University, Singapore; Innovation and
Information Technology Management at IIMB, India; and Six Sigma, Council
of Sigma. He has authored 34 research articles, 24 conference papers and three
book papers. He has received the best researcher award six times and also
won the most innovative research award from the UAE government in 2014.
Currently, he is an Associate Professor, MBA at City University College of
Ajman, UAE.
Jitendra Singh Rathore is an Assistant Professor at the Faculty of Management
Studies-WISDOM, Banasthali Vidyapith, India. He has over 16 years of
experience in both industry and academics and teaches marketing management,
sales and distribution management and e-business to undergraduate and
postgraduate students. Prior to joining academics, he worked with corporates
like SBI cards, Kotak Securities and Bajaj Allianz in various capacities
primarily responsible for training, channel management and generating sales.
His areas of research interest include multichannel sales, e-commerce and
customer relationship management (CRM).
Sachin Kumar Srivastava is an accomplished professional and Professor in the
field of Marketing and Sales. He received his BE and MTech in Mechanical
Engineering, an MBA with a major in Marketing, and a PhD in Public
Administration. He started his career in 2005 with Bharti Tele venture Services.
Later, he moved to academics as a lecturer. In his 16-year career, he has worked
as a lecturer, senior lecturer, associate professor, professor and dean of
management. Currently, he is working as a Professor at IILM Academy of Higher
Learning, Lucknow, India. He has authored 24 research articles, ten
conference papers and four books. He has twice received the best faculty
award from well-known associations.
Khushboo Agnihotri is working as a Senior Assistant Professor in the field of
Economics and Marketing. She received her BCom, MCom and PhD in
Applied Economics. She started her career in academics in 2006. In her 15-year
career, she has worked as a Lecturer, Senior Lecturer, Assistant Professor, and
Senior Assistant Professor. Currently, she is working as a Senior Assistant
Professor of Economics at Amity Business School, Amity University Uttar
Pradesh, Lucknow Campus.
1 Introduction
Web 3.0 has prompted the rise of user-generated content through social media, which
empowers users to examine any form of news and believe it (Srivastava and Agnihotri,
2020). The main objective of social media was to socialise with friends and colleagues
and to use it for different purposes such as education and business. Unfortunately, it has
become a platform for performing unlawful activities and spreading fake news (Islam
et al., 2020). Whether the fake news concerns a product a customer recently purchased
online or expresses political views, spreading it through social media has become a trend
(Lazer et al., 2018). Social media helps people communicate and spread the news (Duffy
et al., 2020). Unfortunately, it was observed that 62% of the news spread on social media
is fake (Statista, 2020). Fake news spread through social media is often presented
sensationally; thus, it is rapidly picked up and circulated (Bergström and Belfrage, 2018;
Harper, 2010; Li et al., 2017). The spread of fake news through social media has also
damaged different domains of society, as reported by several authors. These domains
include financial markets (Kogan et al., 2020), online retailing (Martens and Maalej,
2019), and healthcare (Lara-Navarra et al., 2020; Smaldone et al., 2020). Manzoor et al.
(2019) and Shu et al. (2017) state that fake news has a negative impact on individuals’
lives.
The fake news created through social media aims to misguide readers (Fernandez and
Alani, 2018; Zhang et al., 2018) via false accounts (Kumar and Shah, 2018; Shu et al.,
2019). The fake news spread through social media is usually well-written, long, and
well-referenced (Collins et al., 2020; Pennycook and Rand, 2021). Over the past years,
researchers have applied various techniques to distinguish between fake and real news or
between real and fraudulent accounts. However, it has been challenging for conventional
methods to analyse and predict all types of fake news (Saxena and Srivastava, 2017).
This paper recommends a methodology for creating a predictive model that detects
whether news spread on social media is fake or real based on its words, phrases, sources,
and titles. The suggested predictive models apply supervised machine learning algorithms
to an annotated dataset. Feature selection then picks the best-fit features to obtain the
highest precision. The predictive models are trained on the training data, and the one with
the highest precision is selected. The selected model is then applied to the unseen test
data for further analysis.
2 Literature review
Research to detect fake news spread through social media using various machine learning
algorithms is available. However, current research focuses primarily on using social
features and keywords using a specific classification algorithm.
Thota et al. (2018) proposed a deep learning algorithm using binary classification to
detect fake news from social media, reporting 94.21% accuracy.
Liu and Wu (2018) demonstrated the detection of fake news on social media through
a convolution algorithm using time series data. The authors studied reports on Twitter
and Sina Weibo, with 85% and 92% correct classification, respectively.
Aldwairi and Alwahedi (2018) used a logistic classification algorithm for the detection
of fake news on social media, claiming 99.2% accuracy.
Cardoso Durier da Silva et al. (2019) studied neural networks for detecting the spread of
fake news on social media. The authors claimed that fake news spreads because of a
lack of consensual evidence.
Ahmad et al. (2020) proposed an ensemble machine learning approach to detect fake
news on social media using four performance metrics: accuracy, precision, recall, and
F-1 score.
Oliveira et al. (2020) proposed a computational analysis using natural language
processing to distinguish between fake and authentic news. The authors reported accuracy
and precision of 86% and 94%, respectively.
Collins et al. (2020) used natural language processing to build a hybrid model to
detect fake news from social media. The authors classified fake news into four categories
(clickbait, propaganda, satire, and hoaxes) and claimed that all fake news falls into these
categories.
Kaliyar et al. (2021) proposed FakeBERT, an algorithm that combines bidirectional
encoder representations from transformers (BERT) with convolutional neural networks.
The authors stated that the algorithm is 98.90% accurate in detecting fake news.
This research also adopts classification algorithms used by other researchers to
differentiate between fake and authentic news, but carries out a comparative analysis of
them. The classification algorithms are support vector machine (SVM), naïve Bayes,
logistic regression, random forest, and neural networks. For the comparative analysis, the
evaluation metrics used are classification accuracy (CA), precision, recall, and F-1 score.
3 Research methodology
The research follows the steps of the team data science process (TDSP) framework
(Martinez et al., 2021) and includes the following phases:
• download and collect the dataset, which includes both the train and test observations
(Section 3.1)
• perform the data cleaning for the train and test datasets (Section 3.2)
• [data modelling] propose the predictive model to classify the news as fake or
authentic (Section 4)
• [performance analysis] use performance metrics to identify the machine learning
models with the best results (Section 5).
3.1 Train and test data sets

The data were collected from a Kaggle dataset of 3,000 news items from social media.
These news items are randomly divided into train and test data sets, with 2,725 (≈ 90%)
used as train data and 275 (≈ 10%) as test data. Figure 1 gives the variables for the test
data.
Figure 1 Data description for the test data
Variable  Type      Role
label     discrete  class
text      string    meta
title     string    meta
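The random 2,725/275 division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_train_test`, the fixed seed, and the integer stand-ins for news items are all assumptions made for the example.

```python
import random

def split_train_test(items, n_test, seed=0):
    """Randomly split items into (train, test) with exactly n_test test items."""
    rng = random.Random(seed)      # fixed seed (an assumption) so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

news_ids = range(3000)             # stand-in identifiers for the 3,000 news items
train, test = split_train_test(news_ids, n_test=275)

print(len(train), len(test))       # sizes of the two parts
print(len(train) / 3000)           # share of the data used for training
```

With 3,000 items and 275 held out, the training share is 2,725/3,000, slightly above 90%, which the paper rounds to a 90:10 ratio.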
3.2 Data cleaning

Data cleaning involves pre-processing the text before it is used as an input to train
the models. In this study, news items with an unwanted title variable are filtered out.
4 Data modelling
The modelling process consists of choosing among the different predictive models
used in the research. The study’s five predictive models are logistic regression, SVM,
random forest, naïve Bayes, and neural networks. The accuracy of predictive models
generally increases with the amount of data available during training. The dataset is
divided into two parts in a 90:10 ratio, one used for training and the other for testing
(see Figure 2).
Figure 2 Fake news detection using predictive models (see online version for colours)
[Flow diagram: raw data → data cleaning → document embedding → training data and test data → model training (support vector machine, naïve Bayes, logistic regression, random forest, neural networks) → trained model → models evaluation (CA, precision, recall, F-1 score) → optimal model selection → prediction]
Once the relevant attributes are selected after data cleaning, the next step is
document embedding: the text of each document is mapped to word embeddings, which
are aggregated by calculating their mean.
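Mean aggregation of word embeddings can be sketched as follows. This is an illustration under stated assumptions: the two-dimensional vectors in `toy_vectors` are invented for the example, and the paper does not say which pretrained embedding or tokeniser was used.

```python
def document_embedding(tokens, word_vectors):
    """Average the vectors of the tokens that have a known embedding."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None                          # no token of the document is in the vocabulary
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Hypothetical 2-dimensional embeddings, for illustration only
toy_vectors = {"fake": [1.0, 0.0], "news": [0.0, 1.0], "spreads": [1.0, 1.0]}

# "fast" has no embedding here, so it is skipped in the average
doc_vec = document_embedding(["fake", "news", "spreads", "fast"], toy_vectors)
print(doc_vec)
```

Every document, whatever its length, ends up as one fixed-size vector, which is what the classifiers in the following subsections take as input.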
The fake news posted on social media is identified and tested using the five proposed
predictive models. All five perform a binary classification of the news as fake or real.
This section describes the mathematical formulation of each selected predictive model
and its classification mechanism. The model is selected based on the receiver operating
characteristic (ROC) curve.
4.1 Logistic regression

Logistic regression is a predictive model that can classify the news as fake or real in
binary format. With the two features in the downloaded dataset, the logistic regression
hypothesis is given in equation (1).
$$h_\theta(X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}} \tag{1}$$
For the study, Y is the dependent variable, with independent variables
$X \in \{x_1, x_2\}$.
$$Y = w_0 + w_1 x_1 \tag{2}$$
$Y \in \{0, 1\}$ for the binary classification problem, as fake and real.

The function in equation (1) is called the sigmoid function. It gives the output as a
probability value, and its parameters are chosen to minimise a cost function. The cost
function is shown in equation (3).
$$\mathrm{Cost}(h_\theta(X), y) = \begin{cases} -\log(h_\theta(X)), & y = 1 \\ -\log(1 - h_\theta(X)), & y = 0 \end{cases} \tag{3}$$
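A short sketch of the sigmoid hypothesis and the standard cross-entropy cost for a single observation may help make the two cases concrete. The parameter values passed below are arbitrary illustrations, not fitted coefficients from the study.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(x, beta0, beta1):
    """h(x) = sigmoid(beta0 + beta1 * x) for one scalar feature x."""
    return sigmoid(beta0 + beta1 * x)

def cost(h, y):
    """Cross-entropy cost for one example: -log(h) if y == 1, else -log(1 - h)."""
    return -math.log(h) if y == 1 else -math.log(1.0 - h)

h = hypothesis(x=0.0, beta0=0.0, beta1=1.0)   # sigmoid(0) = 0.5
print(h, cost(h, 1), cost(h, 0))
print(cost(0.9, 1), cost(0.1, 1))             # confident-and-right vs confident-and-wrong
```

The cost is small when the predicted probability agrees with the label and grows without bound as a confident prediction turns out to be wrong.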
4.2 Support vector machine

SVM finds the optimal hyperplane over the feature set to divide the data points
into two classes. Several hyperplanes can separate the data points into different
categories, but SVM helps find the optimal one, which directly affects the model’s
performance. The model is trained using the data points closest to the optimal
hyperplane; these points lie on two hyperplanes (one for each class, fake and real) on
either side of it and are known as support vectors. The cost function for the SVM
classifier is given mathematically in equation (4).
$$J(\theta) = \frac{1}{2}\sum_{j=1}^{n}\theta_j^2 \tag{4}$$
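The quantity in equation (4), half the squared norm of the weight vector, is linked to the width of the margin between the two classes. The sketch below assumes the usual hard-margin convention, in which the closest points satisfy |θ·x + b| = 1 so that the margin width is 2/‖θ‖; that convention is an assumption here and is not stated in the paper.

```python
import math

def svm_objective(theta):
    """J(theta) = 0.5 * sum_j theta_j^2, the term an SVM minimises."""
    return 0.5 * sum(t * t for t in theta)

def margin_width(theta):
    """Distance between the two supporting hyperplanes, 2 / ||theta||."""
    return 2.0 / math.sqrt(sum(t * t for t in theta))

theta = [3.0, 4.0]                 # arbitrary weight vector with norm 5
print(svm_objective(theta))        # 0.5 * 25
print(margin_width(theta))         # 2 / 5
```

Minimising J(θ) therefore amounts to maximising the margin: halving the norm of θ doubles the margin width.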
4.3 Naïve Bayes

Naïve Bayes is a probabilistic predictive model based on Bayes’ theorem, as given in
equation (5).
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{5}$$
The text data of the news is converted into vectors. Encoding the text as numbers helps
decide whether the vector representation of a news item belongs to the fake or the real
class. A naïve Bayes classifier is trained to automatically categorise news as fake or real
using the probabilities defined in Bayes’ theorem. Replacing A and B in equation (5)
with X and y, the feature matrix and response vector respectively, gives equation (6):
$$P(X \mid y) = \frac{P(y \mid X)\,P(X)}{P(y)} \tag{6}$$
Rearranged, the probability of a class y (fake or real) given the feature matrix X is
proportional to the probability of X given that class times the probability of belonging
to that class.
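As a worked illustration of Bayes' theorem applied to the two classes, the sketch below computes the posterior probability of each class from a class likelihood and a prior. The likelihood and prior values are invented for the example and are not taken from the dataset.

```python
def posterior(likelihoods, priors):
    """Return P(class | X) for each class from P(X | class) and P(class)."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())           # P(X), by the law of total probability
    return {c: joint[c] / evidence for c in joint}

# Hypothetical numbers: equal priors, and the observed words are four times
# as likely under the fake class as under the real class
priors = {"fake": 0.5, "real": 0.5}
likelihoods = {"fake": 0.08, "real": 0.02}

post = posterior(likelihoods, priors)
print(post)
predicted = max(post, key=post.get)          # class with the highest posterior
print(predicted)
```

With equal priors the posterior is driven entirely by the likelihoods, so the fake class receives four fifths of the probability mass in this example.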
4.4 Artificial neural network

The artificial neural network (ANN) experimental setup for classifying news has two
components: a logistic function for the activation layer and stochastic gradient descent
(SGD) to minimise the cost.
Based on the two features, the news is classified as fake or real. The data are grouped
into a matrix $X \in \mathbb{R}^{m \times n}$, in which each column corresponds to a
feature and each row represents a single data point. A vector then contains the outputs,
either fake or real. For the ANN, we also define the weight vector
$w \in \mathbb{R}^{n}$, whose values are adjusted according to the cost function. The
weighted sum is characterised as:
$$w_1 x_1 + w_2 x_2 + \dots + w_n x_n \tag{7}$$
which can be written as:
$$\sum_{i=1}^{n} w_i x_i = w^T x \tag{8}$$
where $w^T$ is the transpose of the weight vector w.
The logistic function is applied to the weighted sum of equation (8):
$$P(w^T x) = \frac{1}{1 + e^{-w^T x}} \tag{9}$$
Based on equation (9), if the output for a data point is close to 1, the data point is
predicted to belong to class 1, and to class 0 otherwise:
$$P(\hat{y} = 1 \mid x; w) = \frac{1}{1 + e^{-w^T x}}, \quad \text{for class 1} \tag{10}$$
For correct classification, stochastic gradient descent (SGD) iteratively reduces the cost
function
$$C = \frac{1}{2}\sum (\hat{y} - y)^2 \tag{11}$$
where initially y = 0, and $\hat{y}$ is given by equation (10).
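The interplay of the weighted sum, the logistic activation, and SGD on a squared-error cost can be sketched for a single logistic unit. This is a simplified stand-in for a full network: the four toy data points, the learning rate, the number of passes, and the constant first input acting as a bias term are all assumptions made for illustration.

```python
import math

def predict(w, x):
    """Logistic activation applied to the weighted sum w^T x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(w, x, y, lr=0.5):
    """One SGD update for C = 0.5 * (y_hat - y)^2 on a single example."""
    y_hat = predict(w, x)
    # Chain rule: dC/dw_i = (y_hat - y) * y_hat * (1 - y_hat) * x_i
    scale = (y_hat - y) * y_hat * (1.0 - y_hat)
    return [wi - lr * scale * xi for wi, xi in zip(w, x)]

def cost(w, data):
    """Squared-error cost summed over all examples."""
    return 0.5 * sum((predict(w, x) - y) ** 2 for x, y in data)

# Toy data: first input is a constant 1.0 (bias), second is a single feature
data = [([1.0, 2.0], 1), ([1.0, -1.5], 0), ([1.0, 1.0], 1), ([1.0, -2.0], 0)]

w = [0.0, 0.0]
initial_cost = cost(w, data)
for _ in range(200):                 # 200 passes over the data
    for x, y in data:
        w = sgd_step(w, x, y)
final_cost = cost(w, data)
print(initial_cost, final_cost)
```

Each update moves the weights a small step against the gradient of the cost for one example, so the cost on this separable toy set falls as training proceeds.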
4.5 Random forest

The random forest predictive model fits multiple decision trees and averages their
predictions to improve predictive accuracy and control over-fitting. It uses the Gini
index to measure the probability of a randomly chosen variable being wrongly
classified. The Gini index varies between 0 and 1 and is denoted as:
$$\text{Gini Index} = 1 - \sum_{i=1}^{c} (p_i)^2$$
where $p_i$ is the probability of an object being classified to a particular class, and c
refers to the number of classes (here two, fake or real).
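The Gini index of a tree node can be computed directly from the class counts in that node. The counts used below are illustrative and do not come from the study.

```python
def gini_index(class_counts):
    """Gini index 1 - sum(p_i^2) for the class counts observed in a node."""
    total = sum(class_counts)
    return 1.0 - sum((n / total) ** 2 for n in class_counts)

print(gini_index([10, 0]))   # pure node: only one class present
print(gini_index([5, 5]))    # evenly mixed node
print(gini_index([8, 2]))    # mostly one class
```

For two classes the index ranges from 0 for a pure node to 0.5 for an even mix; the upper bound approaches 1 only as the number of classes grows.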
The analysis and adoption of the model are based on the ROC curve (see Figure 3).
The ROC curve is considered an accurate and straightforward way of assessing how well
a model separates the classes fake and real. The higher the area under the curve (AUC),
the better the predictive model is at differentiating between fake and real news on social
media.
Figure 3 ROC curve of five predictive models (see online version for colours)
The AUC values for the five predictive models are given in Table 1.
Based on Table 1, the logistic regression predictive model has the highest AUC value
and is thus suitable for predicting the news as fake or real. For a detailed analysis, the
following section uses four accuracy standards for all five predictive models and then
selects the most suitable one.
Table 1 AUC values for five predictive models
Predictive models     AUC
SVM                   0.867
Random forest         0.868
Neural network        0.539
Naïve Bayes           0.808
Logistic regression   0.888
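One way to read an AUC value is as the probability that a randomly chosen positive item receives a higher score than a randomly chosen negative one. The sketch below computes AUC through that pairwise definition; the scores and labels are made up for illustration, and the paper does not state how its AUC values were computed.

```python
def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for three positive and three negative items
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]

value = auc(scores, labels)
print(value)                  # 7 of the 9 pairs are ranked correctly
```

Under this reading, an AUC near 0.5 means the scores rank the two classes no better than chance, while values approaching 1 indicate a model that ranks the classes almost perfectly.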
5 Results evaluation
The data modelling process involves selecting machine learning techniques for predictive
modelling. The logistic regression model is used to recognise news as fake or real. The
accuracy standards from the confusion matrix are classification accuracy (CA), precision,
recall, and F-1 score (see Table 2).
Table 2 Accuracy standards to evaluate fake news on social media
Accuracy standards and evaluation criteria
Classification accuracy: the ratio of the number of correct predictions to the total number of data points. $\text{Classification Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
Evaluation criterion: Classification Accuracy > 0.5
Precision: the ratio between the true positives (TP) and all predicted positives. $\text{Precision} = \frac{TP}{TP + FP}$
Evaluation criterion: Precision > 0.5
Recall: the measure of the model correctly identifying true positives (TP). $\text{Recall} = \frac{TP}{TP + FN}$
Evaluation criterion: Recall > 0.5
F1-score: the harmonic mean of precision and recall, which creates a balance between the two. $F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
Evaluation criterion: F1 > 0.5
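The four accuracy standards can be computed from the confusion-matrix counts as sketched below, using the standard definition of precision as TP / (TP + FP). The counts passed in are hypothetical; the paper does not report the raw confusion matrix.

```python
def evaluate(tp, tn, fp, fn):
    """Compute CA, precision, recall and F-1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)        # correct among predicted positives
    recall = tp / (tp + fn)           # correct among actual positives
    f1 = 2 * precision * recall / (precision + recall)
    return {"CA": accuracy, "precision": precision, "recall": recall, "F1": f1}

# Hypothetical counts for 100 test items
metrics = evaluate(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))

# The evaluation criterion in Table 2: every metric should exceed 0.5
passes = all(v > 0.5 for v in metrics.values())
print(passes)
```

Precision and recall differ only in the denominator (false positives versus false negatives), which is why the F-1 score is useful when the two diverge.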
This phase assesses the predictive models’ abilities using four accuracy criteria
(classification accuracy, precision, recall, and F-1 score) from the confusion matrix,
summarised in Table 3. The outcomes are for the 90% of the dataset used to train the
predictive models.
As shown in Table 3, two predictive models, namely logistic regression and
random forest, score almost 0.80 on all four accuracy criteria. Nevertheless, we
selected logistic regression because its AUC (0.888) is higher than that of random
forest (0.868).
Table 3 Evaluation metrics
                      CA      Precision  Recall  F-1 score
SVM                   0.578   0.490      0.772   0.578
Random forest         0.796   0.798      0.796   0.796
Neural network        0.505   0.545      0.505   0.389
Naïve Bayes           0.735   0.735      0.735   0.735
Logistic regression   0.796   0.797      0.796   0.796
Table 4 reports the metrics derived from the confusion matrix of the logistic regression
predictive model, which classified the news as fake or real with almost 70% accuracy on
the 10% of the data held out for testing.
Table 4 Logistic regression predictive model outcomes for test data
                      CA      Precision  Recall  F-1 score
Logistic regression   0.687   0.688      0.687   0.687
The logistic regression predictive model obtained:

• a true-negative rate (TNR) of 67.8%, the share of negative-class news on social
media that the model classified correctly
• a true-positive rate (TPR) of 69.8%; overall, the proposed model classified 189 of
the 275 news items in the test data correctly.
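As a consistency check on these figures, 189 correct classifications out of 275 test items can be converted to a classification accuracy and compared with Table 4. The symbols P and N below, the numbers of positive and negative items in the test data, are introduced only for this check; the paper does not report them.

```latex
\[
\mathrm{CA} = \frac{189}{275} \approx 0.687,
\qquad
\mathrm{CA} = \frac{P \cdot \mathrm{TPR} + N \cdot \mathrm{TNR}}{P + N}
\;\Rightarrow\;
0.678 \le \mathrm{CA} \le 0.698
\]
```

The value 0.687 matches the CA reported for logistic regression in Table 4. Because accuracy is a weighted average of the true-positive and true-negative rates, it must also lie between 67.8% and 69.8%, which it does.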
This paper used different predictive models to categorise the news as fake or real for
the downloaded Kaggle dataset. The results were investigated using four performance
metrics for assessing the suggested models: classification accuracy (CA), precision,
recall, and F-1 score. The experiment found that logistic regression provides tolerable
outcomes on the test data (CA = 0.687, precision = 0.688, recall = 0.687 and
F-1 score = 0.687).
Potential future work for this study is further development using other predictive
models such as k-NN, AdaBoost, or tree-based predictive models. The available data are
also limited in the facts and timeline they cover, which restricts what they reveal about
the behaviour of fake news obtainable from social media.
References
Ahmad, I., Yousaf, M., Yousaf, S. and Ahmad, M. O. (2020) ‘Fake news detection using machine
learning ensemble methods’, Complexity, Hindawi, 17 October, https://DOI.org/10.1155/
2020/8885861.
Aldwairi, M. and Alwahedi, A. (2018) ‘Detecting fake news in social media networks’, Procedia
Computer Science, Vol. 141, pp.215–222.
Bergström, A. and Belfrage, M.J. (2018) ‘News in social media’, Digital Journalism, Vol. 6, No. 5,
pp.583–598, https://DOI.org/10.1080/21670811.2018.1423625.
Cardoso Durier da Silva, F., Vieira, R. and Garcia, A.C. (2019) ‘Can machines learn to detect fake
news? A survey focused on social media’, Proceedings of the 52nd Hawaii International
Conference on System Sciences, 8 January, https://DOI.org/10.24251/HICSS.2019.332.
Collins, B., Hoang, D.T., Nguyen, N.T. and Hwang, D. (2020) ‘Trends in combating fake news on
social media – a survey’, Journal of Information and Telecommunication, pp.1–20,
https://doi.org/10.1080/24751839.2020.1847379.
Duffy, A., Tandoc, E. and Ling, R. (2020) ‘Too good to be true, too good not to share: the social
utility of fake news’, Information, Communication & Society, Vol. 23, No. 13, pp.1965–1979,
https://DOI.org/10.1080/1369118X.2019.1623904.
Fernandez, M. and Alani, H. (2018) ‘Online misinformation: challenges and future directions’,
Companion Proceedings of The Web Conference, pp.595–602, https://DOI.org/10.1145/
3184558.3188730.
Harper, R.A. (2010) ‘The social media revolution: exploring the impact on journalism
and news media organizations’, Inquiries Journal, Vol. 2, No. 3 [online]
http://www.inquiriesjournal.com/articles/202/the-social-media-revolution-exploring-the-
impact-on-journalism-and-news-media-organizations (accessed 20 October 2021).
Islam, M.R., Liu, S., Wang, X. and Xu, G. (2020) ‘Deep learning for misinformation detection on
online social networks: a survey and new perspectives’, Social Network Analysis and Mining,
Vol. 10, No. 1, p.82, https://doi.org/10.1007/s13278-020-00696-x.
Kaliyar, R., Goswami, A. and Narang, P. (2021) ‘FakeBERT: fake news detection in social media
with a BERT-based deep learning approach’, Multimedia Tools and Applications,
https://DOI.org/10.1007/s11042-020-10183-2.
Kogan, S., Moskowitz, T.J. and Niessner, M. (2020) ‘Fake news in financial markets’, SSRN
Scholarly Paper ID 3237763, Social Science Research Network, https://DOI.org/10.2139/
ssrn.3237763.
Kumar, S. and Shah, N. (2018) ‘False information on web and social media: a survey’, 23 April,
Vol. 1, No. 1, pp.1–35.
Lara-Navarra, P., Falciani, H., Sánchez-Pérez, E.A. and Ferrer-Sapena, A. (2020) ‘Information
management in healthcare and environment: towards an automatic system for fake news
detection’, International Journal of Environmental Research and Public Health, Vol. 17,
No. 3, p.1066, https://DOI.org/10.3390/ijerph17031066.
Lazer, D.M.J., Baum, M.A., Benkler, Y., Berinsky, A.J., Greenhill, K.M., Menczer, F.,
Metzger, M.J., Nyhan, B., Pennycook, G., Rothschild, D., Schudson, M., Sloman, S.A.,
Sunstein, C.R., Thorson, E.A., Watts, D.J. and Zittrain, J.L. (2018) ‘The science of fake
news’, Science, Vol. 359, No. 6380, pp.1094–1096, https://DOI.org/10.1126/science.aao2998.
Li, B., Stokowski, S., Dittmore, S.W. and Scott, O.K.M. (2017) ‘For better or for worse: the impact
of social media on Chinese sports journalists’, Communication & Sport, Vol. 5, No. 3,
pp.311–330, https://DOI.org/10.1177/2167479515617279.
Liu, Y. and Wu, Y-F. (2018) ‘Early detection of fake news on social media through propagation
path classification with recurrent and convolutional networks’.
Manzoor, S.I., Singla, J. and Nikita (2019) ‘Fake news detection using machine learning
approaches a systematic review’, 3rd International Conference on Trends in Electronics and
Informatics (ICOEI), pp.230–234, https://DOI.org/10.1109/ICOEI.2019.8862770.
Martens, D. and Maalej, W. (2019) ‘Towards understanding and detecting fake reviews in app
stores’, Empirical Software Engineering, Vol. 24, No. 6, pp.3316–3355, https://DOI.org/
10.1007/s10664-019-09706-9.
Martinez, I., Viles, E. and Olaizola, I.G. (2021) ‘Data science methodologies: current challenges
and future approaches’, Big Data Research, Vol. 24, p.100183, https://DOI.org/10.1016/
j.bdr.2020.100183.
Oliveira, N.R. de, Medeiros, D.S.V. and Mattos, D.M.F. (2020) ‘A sensitive stylistic approach to
identify fake news on social networking’, IEEE Signal Processing Letters, Vol. 27,
pp.1250–1254, https://doi.org/10.1109/LSP.2020.3008087.
Pennycook, G. and Rand, D.G. (2021) ‘The psychology of fake news’, Trends in Cognitive
Sciences, Vol. 25, No. 5, pp.388–402, https://DOI.org/10.1016/j.tics.2021.02.007.
Saxena, A. and Srivastava, S. K. (2017) ‘Online to offline platform: a case study of Firstcry.com’,
International Journal of Economic Perspectives, Vol. 11, No. 3, pp.424–430.
Shu, K., Sliva, A., Wang, S., Tang, J. and Liu, H. (2017) ‘Fake news detection on social media: a
data mining perspective’, ACM SIGKDD Explorations Newsletter, Vol. 19, No. 1, pp.22–36,
https://doi.org/10.1145/3137597.3137600.
Shu, K., Zhou, X., Wang, S., Zafarani, R. and Liu, H. (2019) ‘The role of user-profiles for fake
news detection’, Proceedings of the IEEE/ACM International Conference on Advances in
Social Networks Analysis and Mining, pp.436–439, https://DOI.org/10.1145/3341161.
3342927.
Smaldone, F., Ippolito, A. and Ruberto, M. (2020) ‘The shadows know me: exploring the dark side
of social media in the healthcare field’, European Management Journal, Vol. 38, No. 1,
pp.19–32, https://doi.org/10.1016/j.emj.2019.12.001.
Srivastava, S.K. and Agnihotri, K. (2020) ‘Relational study between significance level of frontline
executives and their happiness level in an organisational setup: a critical analysis’, Int. J. Work
Organisation and Emotion, Vol. 11, No. 1, pp.62–76.
Statista (2020) ‘Media sources are believed to contain fake news worldwide in 2019’, Statista
[online] https://www.statista.com/statistics/1112026/fake-news-prevalence-attitudes-
worldwide/ (accessed 24 October 2021).
Thota, A., Tilak, P., Ahluwalia, S. and Lohia, N. (2018) ‘Fake news detection: a deep learning
approach’, SMU Data Science Review, Vol. 1, No. 3 [online] https://scholar.smu.edu/
datasciencereview/vol1/iss3/10 (accessed 25 October 2021).
Zhang, H., Kuhnle, A., Smith, J. D. and Thai, M. T. (2018) ‘Fight under uncertainty: restraining
misinformation and pushing out the truth’, pp.266–273, https://DOI.org/10.1109/
ASONAM.2018.8508402.
