Mini Project Report on

Fake News Detection using ML

Submitted in partial fulfillment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE & ENGINEERING

Submitted by:

Student Name: Sparsh Dhama University Roll No.: 2119275

Under the Mentorship of


Ms. Shruti Bhatla

Department of Computer Science and Engineering


Graphic Era Hill University
Dehradun, Uttarakhand
CANDIDATE’S DECLARATION

I hereby certify that the work which is being presented in the project report entitled “Fake
News Detection using ML” in partial fulfillment of the requirements for the award of the
Degree of Bachelor of Technology in Computer Science and Engineering of the Graphic Era
Hill University, Dehradun, has been carried out by me under the mentorship of Ms. Shruti
Bhatla, Department of Computer Science and Engineering, Graphic Era Hill University,
Dehradun.

Name: Sparsh Dhama
University Roll No.: 2119275
Table of Contents

Chapter No. Description

Chapter 1 Introduction

Chapter 2 Literature Survey

Chapter 3 Methodology

Chapter 4 Results and Discussion

Chapter 5 Conclusion and Future Work

References
Chapter 1
Introduction

1.1 Introduction

The proliferation of fake news has become a significant concern in
today's digital age. Misinformation and deceptive content spread
rapidly through social media and online platforms, leading to
potential consequences such as public manipulation, erosion of trust,
and societal polarization. To combat this issue, there is a growing
interest in developing automated systems that can effectively detect
fake news articles. Machine learning algorithms provide a promising
approach to tackle this problem by leveraging patterns and features
within the data to identify deceptive information.

1.2 Problem Statement

The objective of this project is to develop a machine learning
model that can accurately classify news articles as either genuine
or fake. The model will analyze textual features, such as the
headline and body content, together with metadata such as the
source, to estimate the likelihood that an article is fake. The goal
is to create a reliable and efficient system that can help users
distinguish reliable news from potentially misleading
information.

1.3 Objectives of the Project

The main objectives of this project are as follows:

1. Collect a comprehensive dataset of labeled news articles, consisting
of both genuine and fake examples, to train and evaluate the machine
learning model.
2. Perform exploratory data analysis to gain insights into the
characteristics and patterns present in the dataset.
3. Preprocess the textual data by applying techniques such as
tokenization, stop-word removal, and stemming to transform the text
into a suitable format for machine learning algorithms.
4. Design and implement a machine learning pipeline that includes
feature extraction, model training, and evaluation stages.
5. Evaluate the performance of various machine learning algorithms,
such as Naive Bayes, Support Vector Machines, and Random Forest,
to identify the most effective model for fake news detection.
6. Fine-tune the selected model by optimizing hyperparameters and
evaluating its performance on validation data.
7. Assess the model's performance using appropriate evaluation
metrics, such as accuracy, precision, recall, and F1-score.
8. Provide recommendations for potential improvements and future
research directions in the field of fake news detection using machine
learning.

Chapter 2
Literature Survey

In this chapter, a comprehensive review of existing literature on
fake news detection using machine learning techniques will be
presented. The survey will cover various approaches,
methodologies, and performance metrics employed by researchers
in the field. It will also highlight the strengths and limitations of
different machine learning algorithms in detecting fake news.

System Architecture

Chapter 3
Methodology
3.1 Data Collection
A diverse dataset of labeled news articles will be collected from reliable
sources, including reputable news outlets and fact-checking organizations.
The dataset will consist of both genuine and fake news examples, ensuring
a balanced representation of different categories and topics.
3.2 Data Preprocessing
The collected dataset will undergo preprocessing steps to clean and
transform the textual data. Techniques such as tokenization, stop-word
removal, stemming, and vectorization will be applied to convert the text
into numerical features that can be used by machine learning algorithms.
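
The preprocessing steps named above can be sketched in a few lines of Python. This is a minimal illustration only, not the project's actual code: the stop-word list, the suffix list, and the function names (tokenize, naive_stem, preprocess) are assumptions made for the example, and the suffix-stripping stemmer is a crude stand-in for a real stemmer such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real project would
# likely use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "that", "the", "to"}

# Suffixes removed by the naive stemmer below, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip at most one common suffix (a simplification of real stemming)."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not mangled.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem the remaining tokens."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


print(preprocess("Scientists are claiming that the vaccines caused outbreaks"))
# prints ['scientist', 'claim', 'vaccin', 'caus', 'outbreak']
```

The stems need not be dictionary words ("vaccin", "caus"); what matters for the later stages is that different inflections of a word map to the same token.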
3.3 Feature Extraction
Various features will be extracted from the preprocessed text, including
bag-of-words representations, TF-IDF scores, and word embeddings.
Additional metadata features, such as article source, publication date, and
author credibility, will also be considered to enhance the model's
performance.
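
As an illustration of one of the representations mentioned above, the sketch below computes TF-IDF scores from already tokenized documents. It assumes the plain textbook variant, tf = count / document length and idf = ln(N / df); library implementations often use smoothed or normalized variants, so their numbers may differ. The function name tf_idf and the three toy documents are invented for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumes tf = count / document length and idf = ln(N / df).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy documents, invented for illustration only.
docs = [
    ["shocking", "miracle", "cure", "found"],
    ["council", "approves", "budget"],
    ["miracle", "budget", "shocking", "shocking"],
]
for row in tf_idf(docs):
    print({term: round(weight, 3) for term, weight in row.items()})
```

A term that occurs in every document receives a weight of zero under this variant, which is the intended effect: words common to all articles carry no information for telling genuine and fake articles apart.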
3.4 Model Training and Evaluation
Several machine learning algorithms, such as Naive Bayes, Support
Vector Machines, Random Forest, and Neural Networks, will be trained
and evaluated using appropriate training and testing splits of the dataset.
The models will be assessed based on performance metrics such as
accuracy, precision, recall, and F1-score.
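
To make the training and evaluation step concrete, the following sketch trains a multinomial Naive Bayes classifier (one of the algorithms listed above) with add-one smoothing and then computes the four metrics. It is an assumption-laden illustration rather than the project's implementation: the function names, the choice of "fake" as the positive class, and the tiny hand-made training and test sets are all hypothetical and do not come from the project dataset.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes model."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {t for tokens in docs for t in tokens}
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-posterior for one document."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][t] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def evaluate(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, hand-made data for illustration only.
train_docs = [
    ["shocking", "miracle", "cure"],
    ["secret", "miracle", "exposed"],
    ["council", "approves", "budget"],
    ["minister", "announces", "budget"],
]
train_labels = ["fake", "fake", "real", "real"]
test_docs = [
    ["shocking", "secret", "cure"],
    ["council", "announces", "plan"],
    ["miracle", "budget", "exposed"],
    ["miracle", "cure", "budget"],
]
test_labels = ["fake", "real", "fake", "real"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
print(predictions)
print(evaluate(test_labels, predictions))
```

On this toy test set the last article is genuine but is predicted as fake, so precision falls below recall. This is the kind of difference that accuracy alone would hide, which is why the four metrics are reported together.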
SYSTEM REQUIREMENTS

HARDWARE REQUIREMENTS:
- System: Pentium IV
- Speed: 2.4 GHz
- Hard disk: 40 GB
- Monitor: 15" VGA color
- RAM: 512 MB
SOFTWARE REQUIREMENTS:
- Operating System: Windows XP
- Coding language: Python

Chapter 4
Results and Discussion

The results obtained from training and evaluating different
machine learning models will be presented in this chapter. The
performance metrics of each model will be compared to identify
the most effective algorithm for fake news detection. The
strengths and weaknesses of the selected model will be discussed,
along with potential reasons for its performance.

Chapter 5
Conclusion and Future Work

In this final chapter, the overall findings and conclusions of the project
will be summarized. The effectiveness of machine learning algorithms in
detecting fake news will be discussed, along with the implications and
potential applications of the developed model. Future research directions,
such as incorporating deep learning techniques and considering
multimodal features, will be suggested to improve the accuracy and
robustness of fake news detection systems.

