
A PROJECT REPORT

ON
“FAKE NEWS DETECTION
USING LSTM”

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD


OF
DIPLOMA IN
Computer Engineering

SUBMITTED TO
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION,
MUMBAI
SUBMITTED BY

Name of Student           Enrollment No.

1. Amol Borude            2111620075
2. Sarthak Deshmukh       2111620080
3. Vishal Ugale           2111620129
4. Piyush Chaudhari       2111620076

GUIDED BY
Mr. S. M. Bankar

Government Polytechnic Ambad


2023-24
Government Polytechnic Ambad

CERTIFICATE
This is to certify that the project report entitled "FAKE NEWS DETECTION USING
LSTM" was successfully completed by the following students of Sixth Semester, DIPLOMA IN
COMPUTER ENGINEERING:

Sr. No.   Name                  Roll No.

1.        Amol Borude           08
2.        Sarthak Deshmukh      11
3.        Vishal Ugale          37
4.        Piyush Chaudhari      50

in partial fulfillment of the requirements for the award of the Diploma in Computer
Engineering, submitted to the Department of Computer Engineering of
Government Polytechnic Ambad. The work was carried out during the academic
year 2023-24 as per the curriculum.

Mr. S. M. Bankar              Mr. B. S. Pawar                Dr. A. M. Agarkar

Guide                         H.O.D.                         Principal
                              (Computer Engineering)         (G.P. Ambad)
ACKNOWLEDGEMENT

I would like to express my special thanks of gratitude to my project guide, Mr. S. M. Bankar, as
well as our principal, Dr. A. M. Agarkar, who gave me the golden opportunity to work on this
wonderful project on the topic "Fake News Detection Using LSTM". The project also helped me
in doing a lot of research, through which I came to know about so many new things, and I am
really thankful to them.

Secondly, I would also like to thank my parents and friends, who helped me a lot in finalizing
this project within the limited time frame.

Sr. No.   Name                  Roll No.

1.        Amol Borude           08
2.        Sarthak Deshmukh      11
3.        Vishal Ugale          37
4.        Piyush Chaudhari      50

Date:

Place: Ambad.
ABSTRACT

Designed to deceive readers and manipulate public opinion, fake news can be created for a variety
of reasons ranging from political propaganda to generating revenue through clickbait. Another
significant challenge in combating fake news is the difficult balance between curbing
misinformation and preserving free speech, though some argue for stricter regulations to control the
spread of fake news. Thus, the purpose of this study is to identify fake news using Long
Short-Term Memory (LSTM) networks. LSTM models are often used to analyze the linguistic
features of news articles or social media posts. The dataset we used is a fake news dataset
published on Kaggle. The proposed method identifies fake news with average precision, recall,
accuracy, and F-measure values of 0.99, 0.99, 0.99, and 0.99, respectively. The results showed
that LSTM provides superior performance compared to the Support Vector Classifier, Logistic
Regression, and Multinomial Naive Bayes methods.
Keywords: Fake News Classification, LSTM, Deep Learning
Index

Sr. No.   Title
1.        Introduction
2.        Editor, Browser & Server Used
3.        Research Method
              1. Data Collection
              2. Data Preprocessing
              3. Model Training
              4. Evaluation
              5. Analysis and Result
4.        Areas of Improvement
5.        Hardware Requirement Specification
6.        RAM and Disk Used at Time of Model Training
7.        Conclusion
8.        References
1. INTRODUCTION

The proliferation of fake news in recent years has become a significant challenge,
undermining the trustworthiness of information and disrupting societal discourse. Fake
news, deliberately crafted to deceive and manipulate public opinion, poses substantial
threats to democratic processes, social cohesion, and individual decision-making.
Motivated by various factors such as political agendas or financial incentives from
clickbait, creators of fake news exploit the accessibility and reach of social media
platforms, often prioritizing virality over accuracy.

The consequences of fake news extend beyond mere misinformation, eroding trust in
established sources of information and exacerbating societal polarization. Social media
platforms, designed for rapid information dissemination, have inadvertently facilitated the
spread of misinformation, overshadowing legitimate news sources and complicating the
task of distinguishing truth from falsehood. Moreover, addressing the challenge of fake
news is complicated by the need to balance efforts to combat misinformation with the
protection of free speech rights.

In response to these pressing challenges, researchers have increasingly turned to cutting-
edge technologies such as Long Short-Term Memory (LSTM) networks for fake news
detection. LSTM networks, a type of recurrent neural network (RNN), exhibit the
capability to analyze linguistic patterns and contextual cues in textual data, enabling the
identification of subtle indicators of falsehood. Leveraging LSTM's ability to capture long-
term dependencies in sequential data, researchers aim to develop more accurate and
effective models for detecting fake news.

This study advances existing research by proposing a novel approach to fake news
detection using LSTM classification models. By analyzing linguistic features extracted
from news articles and social media posts, the proposed LSTM-based system endeavors
to differentiate between trustworthy information and fake news with high precision and
recall. Through rigorous experimentation and evaluation, this research seeks to contribute
to the evolution of fake news detection techniques, ultimately fostering a more informed
and resilient society.

2. EDITOR BROWSER & SERVER USED

Editor Used :-
 Google Colab

Architecture Used :-
 System RAM :- 12.7 GB
 GPU RAM :- 15.0 GB
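
The figures above correspond to the Colab runtime we used; they will vary by runtime type. As a rough, illustrative sketch (not part of the original notebook), the available system RAM and GPU can be checked from a notebook cell as follows, assuming the psutil and TensorFlow packages that Colab preinstalls:

```python
import psutil
import tensorflow as tf

# Total system RAM of the Colab runtime, in gigabytes.
print(f"System RAM: {psutil.virtual_memory().total / 1e9:.1f} GB")

# Whether a GPU runtime is attached.
print("GPU devices:", tf.config.list_physical_devices("GPU"))

# Detailed GPU memory (e.g. the 15.0 GB figure above) can be read with:
# !nvidia-smi
```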

3. RESEARCH METHOD

1. Data Collection
2. Data Preprocessing
3. Model Training
4. Evaluation
5. Analysis and Result

1. Data Collection

We obtained a dataset of fake news from Kaggle. The dataset used for analysis and
exploration in this research project consisted of a total of 44,898 news articles, with a clear
dichotomy between real and fake news. Of these, 21,417 articles were categorized as original
news. The dataset also includes 23,481 articles classified as fake news, which were
deliberately created and disseminated to deceive the audience and spread misinformation.
The clear separation between genuine and fake news in this dataset provides a unique
opportunity for researchers to analyze significant patterns and characteristics and to develop
accurate and robust approaches to identifying fake news.

Total :- 44,898
Real :- 21,417
Fake :- 23,481
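
As a hedged sketch of this loading step (the file names True.csv and Fake.csv assume the standard layout of the Kaggle "Fake and Real News" dataset, and the label convention 1 = real, 0 = fake is our own choice):

```python
import pandas as pd

# Assumed file names from the Kaggle "Fake and Real News" dataset.
real_df = pd.read_csv("True.csv")   # expected: 21,417 articles
fake_df = pd.read_csv("Fake.csv")   # expected: 23,481 articles

real_df["label"] = 1                # 1 = real news
fake_df["label"] = 0                # 0 = fake news

df = pd.concat([real_df, fake_df], ignore_index=True)
print(len(df))                      # expected: 44,898
print(df["label"].value_counts())
```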

Subject Count Plot of Fake News :-

Subject Count Plot of Real News :-
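
Plots of this kind can be produced with a short seaborn snippet along these lines; this is an illustrative sketch that assumes the combined DataFrame `df` from the loading example above and the `subject` column provided in the Kaggle dataset:

```python
import matplotlib.pyplot as plt
import seaborn as sns

fig, axes = plt.subplots(1, 2, figsize=(14, 4))

# Left panel: number of fake articles per subject.
sns.countplot(x="subject", data=df[df["label"] == 0], ax=axes[0])
axes[0].set_title("Subject count plot of fake news")

# Right panel: number of real articles per subject.
sns.countplot(x="subject", data=df[df["label"] == 1], ax=axes[1])
axes[1].set_title("Subject count plot of real news")

for ax in axes:
    ax.tick_params(axis="x", rotation=45)
plt.tight_layout()
plt.show()
```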

2. Data Preprocessing
NLP stands for Natural Language Processing.

Natural Language Processing (NLP) plays a crucial role in fake news detection models by
enabling computers to analyze and understand textual content for identifying
misinformation. Here's how NLP techniques have been applied in our project:

1. Text Preprocessing:

 Tokenization: Break down the text into individual words or tokens.
 Stopword Removal: Remove common words (stopwords) that do not carry
significant meaning.
 Lowercasing: Convert all text to lowercase to ensure consistency.
 Stemming/Lemmatization: Reduce words to their base forms to handle
variations of words.
 Removing Punctuation and Special Characters: Clean the text by removing
unnecessary symbols and characters.
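
A minimal sketch of these preprocessing steps, assuming NLTK; the exact pipeline in the notebook may differ slightly in ordering and tools:

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download("stopwords")
nltk.download("wordnet")

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text):
    text = text.lower()                              # lowercasing
    text = re.sub(r"[^a-z\s]", " ", text)            # remove punctuation / special characters
    tokens = text.split()                            # tokenization
    return [lemmatizer.lemmatize(t)                  # lemmatization
            for t in tokens if t not in stop_words]  # stopword removal

print(preprocess("Breaking: Officials CONFIRMED the story was fabricated!"))
# ['breaking', 'official', 'confirmed', 'story', 'fabricated']
```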

2. Feature Extraction:
 Word Cloud: A word cloud is a visualization technique used to
represent text data, where the size of each word indicates its
frequency or importance within the text. In a word cloud, words
are typically arranged randomly, and the size of each word is
proportional to its frequency in the text.
 Word clouds are often used to provide a visual summary of the
most frequently occurring words in a document or a corpus of
text. They are particularly useful for identifying key themes,
topics, or trends within the text data at a glance.

In the following word clouds, we can clearly see some patterns:

 Real news seems to have a source of publication, which is not present in the
fake news set. Looking at the data:

 Most of the text contains Reuters information such as "WASHINGTON REUTERS"
 Some texts are tweets from Twitter
 A few texts do not contain any publication info
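
The word clouds shown below can be generated with the wordcloud package. The following is an illustrative sketch (reusing `df` from the data collection step), not necessarily the exact plotting code from the notebook:

```python
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Concatenate all fake-news text and render it as a word cloud;
# using label == 1 instead gives the corresponding plot for real news.
fake_text = " ".join(df[df["label"] == 0]["text"].astype(str))
wc = WordCloud(width=800, height=400, background_color="white",
               max_words=200).generate(fake_text)

plt.figure(figsize=(10, 5))
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.title("Word Cloud of Fake News")
plt.show()
```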
Word Cloud of Fake News :-

Word Cloud of Real News :-

3. Word To Vector Model:

Word to Vector (Word2Vec) is a popular model used in Natural Language
Processing (NLP) to represent words as dense vectors in a continuous vector
space. This model was introduced by Tomas Mikolov and his colleagues at
Google in 2013. The key idea behind Word2Vec is to learn distributed
representations of words in such a way that similar words have similar vector
representations, capturing semantic relationships between words.

Vectorization is used in various fields, including natural language processing
(NLP), machine learning, computer vision, and numerical computing. In the
context of NLP, vectorization is particularly important for representing
textual data in a format that can be processed by machine learning algorithms.

Total vocab length :- 228264
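
A sketch of how such a model can be trained with gensim on the preprocessed articles. The 100-dimensional vectors match the embedding size used later in the model; the window size and other settings are illustrative assumptions:

```python
from gensim.models import Word2Vec

# Token lists per article, using the preprocess() sketch from the
# preprocessing section above.
sentences = df["text"].astype(str).apply(preprocess).tolist()

w2v_model = Word2Vec(sentences=sentences, vector_size=100,
                     window=5, min_count=1, workers=4)

print("Vocabulary size:", len(w2v_model.wv))        # ~228,264 in our run
print(w2v_model.wv.most_similar("president", topn=3))
```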

4. Tokenization:

In the context of Natural Language Processing (NLP), tokenization refers to
the process of breaking down a piece of text into smaller units, typically
words or subwords, called tokens. These tokens are the basic units of analysis
in NLP tasks and serve as the building blocks for further processing.

Tokenization is a crucial preprocessing step in many NLP tasks because it
enables computers to understand and manipulate human language. Here's
how tokenization works:

1. Word Tokenization:
- Word tokenization involves splitting the text into individual words based
on whitespace or punctuation boundaries.
- For example, the sentence "Natural Language Processing is fascinating!"
would be tokenized into the following tokens: ["Natural", "Language",
"Processing", "is", "fascinating", "!"].

2. Sentence Tokenization:
- Sentence tokenization involves splitting the text into individual sentences.
- For example, the paragraph "NLP is fascinating. It involves analyzing text
data." would be tokenized into the following sentences: ["NLP is
fascinating.", "It involves analyzing text data."].

3. Subword Tokenization:
- Subword tokenization involves breaking down words into smaller
linguistic units, such as prefixes, suffixes, or root words.
- This approach is particularly useful for handling out-of-vocabulary words
and morphologically rich languages.
- For example, the word "unhappiness" might be tokenized into ["un",
"happi", "ness"] using subword tokenization techniques like Byte-Pair
Encoding (BPE) or WordPiece.

Tokenization serves as the first step in many NLP tasks, including text
classification, sentiment analysis, machine translation, and named entity
recognition. Once the text has been tokenized, further processing steps such
as stopword removal, stemming, lemmatization, and feature extraction can
be applied to the tokens to extract meaningful information and patterns from
the text data.
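
A sketch of word-level tokenization and padding with Keras utilities, consistent with the input shape (None, 1000, 100) reported in the model summary later in this report; the maximum length of 1000 is taken from that summary, while the other details are assumptions:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

MAX_LEN = 1000  # each article is truncated/padded to 1000 tokens

tokenizer = Tokenizer()
tokenizer.fit_on_texts(df["text"].astype(str))

sequences = tokenizer.texts_to_sequences(df["text"].astype(str))
X = pad_sequences(sequences, maxlen=MAX_LEN)

print("Distinct tokens:", len(tokenizer.word_index))
print("Padded input shape:", X.shape)   # expected: (44898, 1000)
```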

Hist plot of Number of words per record :-

3. Model Training
Training a model using Long Short-Term Memory (LSTM) networks involves several
steps. LSTMs are a type of recurrent neural network (RNN) architecture that are well-
suited for sequence prediction tasks, such as time series forecasting, natural language
processing, and speech recognition. Here's a high-level overview of the process:

1. Data Preprocessing:
- Prepare your dataset for training. This typically involves tokenizing the text (if
working with text data), splitting it into sequences, and encoding it into numerical
format that the LSTM can process. For example, you might convert words into word
embeddings or use one-hot encoding for categorical variables.

2. Model Architecture:
- Define the architecture of your LSTM model. This includes specifying the number
of LSTM layers, the number of units (or neurons) in each layer, and any additional
layers such as dropout or dense layers. You'll also need to specify the input shape,
which depends on the format of your input data.

3. Compile the Model:


- Compile the LSTM model using an appropriate loss function, optimizer, and
evaluation metric. Common loss functions for sequence prediction tasks include
categorical crossentropy for classification and mean squared error for regression.
Popular optimizers include Adam, RMSprop, and SGD.

4. Training:
- Train the LSTM model on your training data using the `fit()` method. Specify the
training data, validation data (if applicable), batch size, and number of epochs. During
training, the model learns to adjust its parameters (weights and biases) to minimize the
loss function.

5. Validation:
- Monitor the model's performance on the validation data during training to detect
overfitting and adjust hyperparameters accordingly. You can visualize metrics such as
loss and accuracy over epochs using plots or callbacks.

6. Evaluation:
- Evaluate the trained LSTM model on a separate test dataset to assess its
performance on unseen data. Compute relevant metrics such as accuracy, precision,
recall, F1-score, or mean squared error depending on the task.

7. Fine-Tuning (Optional):
- Fine-tune the LSTM model by experimenting with different architectures,
hyperparameters, and preprocessing techniques to improve performance.

Throughout this process, it's essential to monitor the model's performance, iterate on
the architecture and hyperparameters, and incorporate best practices for training deep
learning models, such as regularization and early stopping, to prevent overfitting.
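
Putting these steps together, the following is a sketch of a model consistent with the summary printed below: a frozen 100-dimensional Embedding layer initialised from the Word2Vec vectors, one LSTM layer with 128 units, and a sigmoid output for binary classification. The train/test split, batch size, and random seed are assumptions, not settings recovered from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Build the embedding matrix from the Word2Vec vectors (index 0 is reserved
# for padding, hence vocab_size = number of distinct tokens + 1).
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, 100))
for word, i in tokenizer.word_index.items():
    if word in w2v_model.wv:
        embedding_matrix[i] = w2v_model.wv[word]

X_train, X_test, y_train, y_test = train_test_split(
    X, df["label"].values, test_size=0.25, random_state=42)

model = Sequential([
    Embedding(vocab_size, 100, weights=[embedding_matrix],
              input_length=MAX_LEN, trainable=False),  # frozen embeddings
    LSTM(128),
    Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["acc"])
model.summary()

history = model.fit(X_train, y_train, validation_split=0.1,
                    epochs=6, batch_size=64)
```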

Values per Epoch :-

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding (Embedding)       (None, 1000, 100)         22826500

 lstm (LSTM)                 (None, 128)               117248

 dense (Dense)               (None, 1)                 129

=================================================================
Total params: 22943877 (87.52 MB)
Trainable params: 117377 (458.50 KB)
Non-trainable params: 22826500 (87.08 MB)

Epoch 1/6
737/737 [==============================] - 35s 42ms/step - loss: 0.1423 - acc:
0.9496 - val_loss: 0.0704 - val_acc: 0.9764
Epoch 2/6
737/737 [==============================] - 30s 41ms/step - loss: 0.0723 - acc:
0.9771 - val_loss: 0.0618 - val_acc: 0.9799
Epoch 3/6
737/737 [==============================] - 30s 41ms/step - loss: 0.0448 - acc:
0.9848 - val_loss: 0.0446 - val_acc: 0.9858
Epoch 4/6
737/737 [==============================] - 29s 40ms/step - loss: 0.0408 - acc:
0.9864 - val_loss: 0.0445 - val_acc: 0.9858
Epoch 5/6
737/737 [==============================] - 30s 41ms/step - loss: 0.0304 - acc:
0.9901 - val_loss: 0.0434 - val_acc: 0.9855
Epoch 6/6
737/737 [==============================] - 30s 41ms/step - loss: 0.0277 - acc:
0.9904 - val_loss: 0.0405 - val_acc: 0.9856
<keras.src.callbacks.History at 0x7f9b70636260>

4. Evaluation
After training your Long Short-Term Memory (LSTM) model for a specific task, such
as text classification or sequence prediction, it's crucial to evaluate its performance to
assess how well it generalizes to unseen data. Here are some common evaluation
metrics and techniques for assessing the performance of an LSTM model:

1. Accuracy:
- Accuracy measures the proportion of correctly predicted labels out of all samples
in the test dataset. It's a common metric for classification tasks. However, accuracy
alone might not provide a complete picture, especially if the classes are imbalanced.

2. Precision, Recall, and F1-Score:


- Precision measures the proportion of true positive predictions out of all positive
predictions. It indicates how many of the predicted positive instances are actually
positive.
- Recall (also known as sensitivity) measures the proportion of true positive
predictions out of all actual positive instances. It indicates how many of the actual
positive instances were correctly predicted.
- F1-score is the harmonic mean of precision and recall, providing a balanced
measure of both metrics. It's useful when there is an imbalance between the classes.
- These metrics are typically used in binary or multiclass classification tasks.

3. Confusion Matrix:
- A confusion matrix provides a detailed breakdown of the model's predictions by
comparing them to the actual labels. It shows the number of true positives, true
negatives, false positives, and false negatives. From the confusion matrix, you can
derive other metrics like precision, recall, and accuracy.
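
A sketch of how these metrics can be computed for the trained model with scikit-learn, continuing from the training sketch in the previous section; the 0.5 decision threshold is an assumption:

```python
from sklearn.metrics import classification_report, confusion_matrix

# Predicted probabilities on the held-out test set; values above 0.5 are
# mapped to the positive class.
y_prob = model.predict(X_test)
y_pred = (y_prob > 0.5).astype("int32").ravel()

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```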

No. of epochs run :- 6 epochs

Classification Report :-

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      5887
           1       0.98      0.99      0.99      5338

    accuracy                           0.99     11225
   macro avg       0.99      0.99      0.99     11225
weighted avg       0.99      0.99      0.99     11225

Model Summary :-

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding (Embedding)       (None, 1000, 100)         22826500

 lstm (LSTM)                 (None, 128)               117248

 dense (Dense)               (None, 1)                 129

=================================================================
Total params: 22943877 (87.52 MB)
Trainable params: 117377 (458.50 KB)
Non-trainable params: 22826500 (87.08 MB)
_________________________________________________________________

5. Analysis and Result

Result :-

              precision    recall  f1-score   support

           0       0.99      0.99      0.99      5887
           1       0.98      0.99      0.99      5338

    accuracy                           0.99     11225
   macro avg       0.99      0.99      0.99     11225
weighted avg       0.99      0.99      0.99     11225

4. AREAS OF IMPROVEMENT

1. Data Augmentation and Expansion:


- Collect additional labeled data or augment existing data through techniques like
paraphrasing, back translation, or data synthesis to increase the diversity and coverage
of the dataset.

2. Feature Engineering and Representation:


- Experiment with different text representations, including word embeddings,
character-level representations, or contextual embeddings like BERT, to capture more
nuanced semantic information.
- Incorporate additional features such as metadata (e.g., source credibility,
publication date), social context (e.g., user interactions, propagation patterns), or
linguistic features (e.g., sentiment, readability) to enrich the input representation.

3. Model Architecture Optimization:


- Explore more complex architectures beyond LSTM, such as attention mechanisms,
transformer-based models, or hierarchical architectures, to capture long-range
dependencies and improve performance.
- Consider ensemble methods or model stacking techniques to combine predictions
from multiple models and reduce prediction variance.

4. Hyperparameter Tuning and Regularization:

- Fine-tune hyperparameters such as learning rate, dropout rate, or optimizer settings
using techniques like grid search or random search to improve model convergence and
generalization.
- Apply regularization techniques like dropout, batch normalization, or weight decay
to prevent overfitting and improve the model's robustness (see the sketch at the end of
this section).

5. Domain-specific Adaptation:
- Investigate techniques for domain adaptation or transfer learning to adapt the model
to specific domains or languages with limited labeled data.
- Pre-train the model on a large corpus of general text data and fine-tune it on the
task-specific dataset to leverage transfer learning.

6. Adversarial Robustness:
- Develop strategies for adversarial defense to mitigate the impact of adversarial
attacks on the model's predictions, such as adversarial training, input perturbation, or
robust optimization.

7. Interpretability and Explainability:


- Enhance the interpretability of the model by analyzing attention weights, feature
importances, or saliency maps to understand how the model makes predictions.
- Provide explanations or visualizations of model decisions to end-users to increase
trust and transparency in the model.

8. Continuous Monitoring and Updating:


- Implement a system for continuous monitoring of the model's performance in
production and updating it as new data becomes available or the distribution of data
changes.
- Develop mechanisms for detecting and addressing concept drift or data drift to
maintain the model's accuracy over time.

By focusing on these areas for improvement, we can enhance the effectiveness and
reliability of our fake news detection model, ultimately contributing to the fight against
misinformation.
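
As one concrete example of the regularization ideas from item 4 above, the following sketch adds a Dropout layer and early stopping to the architecture from Section 3, reusing the variables defined in the training sketch; the layer placement, dropout rate, and patience are illustrative choices, not tuned settings from this project:

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dropout, Embedding, LSTM, Dense
from tensorflow.keras.models import Sequential

regularised = Sequential([
    Embedding(vocab_size, 100, weights=[embedding_matrix],
              input_length=MAX_LEN, trainable=False),
    LSTM(128),
    Dropout(0.3),                       # randomly drop 30% of units during training
    Dense(1, activation="sigmoid"),
])
regularised.compile(loss="binary_crossentropy", optimizer="adam", metrics=["acc"])

# Stop training when the validation loss stops improving for two epochs
# and roll back to the best weights seen so far.
early_stop = EarlyStopping(monitor="val_loss", patience=2,
                           restore_best_weights=True)
regularised.fit(X_train, y_train, validation_split=0.1,
                epochs=20, batch_size=64, callbacks=[early_stop])
```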

5. HARDWARE REQUIREMENT SPECIFICATION

Hardware Component    Recommended Specification

Computer              Multi-core processor with at least 12 GB of RAM and a GPU with 15 GB of RAM

Storage               Sufficient storage space for the development environment and project files

Display               Any display

Input Devices         Keyboard and mouse or touchpad

Optional Hardware     External hard drives, printers, scanners, etc., based on project needs

Software              Google Colab

6. RAM AND DISK USED AT TIME OF MODEL TRAINING :-

7. CONCLUSION

In conclusion, the use of LSTM (Long Short-Term Memory) networks in the detection
of fake news represents a promising approach with significant potential for mitigating
the spread of misinformation in the digital era. Through the application of advanced
natural language processing techniques, such as LSTM, we can effectively analyze
textual data to discern patterns and characteristics indicative of fake news.

Throughout the course of this project, we have demonstrated the effectiveness of
LSTM networks in accurately classifying news articles as either genuine or fake based
on their textual content. By leveraging the ability of LSTM to capture long-range
dependencies in sequential data, we have achieved robust performance in
distinguishing between trustworthy information and deceptive content.

Furthermore, the development of this fake news detection system underscores the
importance of interdisciplinary collaboration between machine learning experts,
linguists, and domain specialists. By combining expertise from diverse fields, we have
been able to design a model that not only harnesses the power of deep learning but also
incorporates nuanced linguistic features essential for effective fake news detection.

Moving forward, the success of this project opens up avenues for further research and
refinement. Future efforts may focus on enhancing the model's accuracy and
scalability, exploring additional features or data sources, and addressing emerging
challenges in the ever-evolving landscape of online misinformation.

Ultimately, the deployment of LSTM-based fake news detection systems holds great
promise for promoting information integrity, fostering media literacy, and safeguarding
the public discourse against the pernicious influence of false information. Through
continued innovation and collaboration, we can strive towards a more informed and
resilient society in the digital age.

8. REFERENCES

ChatGPT
YouTube
GitHub
Research Paper 1
Google Colab
Kaggle

