
Fake News Detection with Python

This document provides an overview of a project to detect fake news articles using natural language processing and machine learning techniques in Python. It introduces what fake news is and describes using scikit-learn libraries for classification. The document outlines the prerequisites, packages, and machine learning algorithms used, including Flask, NumPy, Pandas, regular expressions, stopwords, PorterStemmer, TFIDFVectorizer, train-test splitting, logistic regression, and calculating accuracy scores.

Uploaded by

Adarsh Lenin

FAKE NEWS DETECTION

BY
• ADARSH LENIN
• ATHUL P
• BIMAL MURALI
• NIDHIN PHILIP ALEX
INTRODUCTION

• What is Fake News?


• Fake news, a type of yellow journalism, consists of news items that may be hoaxes and are generally
spread through social media and other online media. It is often published to promote or impose certain
ideas, frequently with a political agenda. Such news items may contain false or exaggerated claims,
may be amplified by recommendation algorithms, and can leave users trapped in a filter
bubble.
• Fake News Detection in Python
• In this project, we use natural language processing techniques and machine learning
algorithms, with the scikit-learn library in Python, to classify news articles as fake or real.
FLOWCHART
PREREQUISITES
• PYTHON
• FLASK
• HTML
• CSS
FLASK

• Flask is a web framework: a Python module that lets you develop web
applications easily. It has a small, easy-to-extend core; it is a
microframework that does not include an ORM (Object Relational Mapper) or
similar features.
• It does provide features such as URL routing and a template engine. It is a WSGI
web application framework.
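A minimal sketch of URL routing in Flask, assuming a single JSON endpoint. The route name `/predict` and the `classify` helper are hypothetical placeholders for illustration, not the project's actual code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify(text: str) -> str:
    """Hypothetical placeholder: a real app would call its trained model here."""
    return "unknown"


# URL routing: POST requests to /predict are dispatched to this function.
@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    text = payload.get("text", "")
    return jsonify({"text": text, "label": classify(text)})


if __name__ == "__main__":
    # Starts Flask's built-in development server.
    app.run(debug=True)
```

In a full application the view would more likely render an HTML template with Flask's template engine instead of returning JSON.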
PACKAGES

NUMPY

NumPy, which stands for Numerical Python, is a library consisting of
multidimensional array objects and a collection of routines for
processing those arrays. Using NumPy, mathematical and logical
operations on arrays can be performed.
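A small sketch of mathematical and logical operations on a NumPy array. The word counts below are made-up values used only for illustration.

```python
import numpy as np

# Hypothetical word counts: rows are documents, columns are vocabulary terms.
counts = np.array([[2, 0, 1],
                   [0, 3, 1]])

# Mathematical operation along an axis: total words per document.
doc_lengths = counts.sum(axis=1)

# Logical operation, applied element-wise: does the term occur at all?
term_present = counts > 0

# Number of documents that contain each term.
doc_frequency = term_present.sum(axis=0)
```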
PANDAS

Pandas is an open-source Python package that is widely used
for data science, data analysis, and machine learning tasks. It is built
on top of another package, NumPy, which provides support
for multi-dimensional arrays. It is used for creating and storing data
frames.
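A sketch of creating a data frame and preparing a text column. The rows, the column names, and the coding 1 = fake, 0 = real are all assumptions for illustration; the actual dataset is not described in these slides.

```python
import pandas as pd

# Hypothetical rows; column names and label coding are assumptions.
news = pd.DataFrame({
    "title": ["Aliens endorse candidate", "Council approves budget", None],
    "author": ["Anon", "Staff Reporter", "Staff Reporter"],
    "label": [1, 0, 0],
})

# Replace missing values with empty strings so concatenation does not fail.
news = news.fillna("")

# Combine author and title into a single text column for later processing.
news["content"] = (news["author"] + " " + news["title"]).str.strip()

# Count how many rows carry each label.
label_counts = news["label"].value_counts()
```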
REGULAR EXPRESSION

A regular expression is a sequence of characters that
forms a search pattern. It can be used to check whether a
string contains the specified search pattern.
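A short sketch using Python's built-in `re` module. The headline is a made-up example, and stripping non-letters is one common cleaning step, assumed here rather than taken from the project.

```python
import re

headline = "BREAKING!!! 5 Shocking Facts (You Won't Believe #3)"

# re.search checks whether the pattern occurs anywhere in the string.
has_digit = re.search(r"\d", headline) is not None

# re.sub replaces every match; each run of non-letters becomes one space.
letters_only = re.sub(r"[^a-zA-Z]+", " ", headline).strip().lower()
```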
STOPWORDS

The stopwords in the NLTK library are the most common words in a language,
such as "the", "is", and "and". They add little value to a paragraph and do not
help describe the topic of the content, so they are usually removed during preprocessing.
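A sketch of stopword removal. To stay runnable offline, it uses a tiny hand-picked set, which is an assumption. With NLTK, the full English list is obtained from `nltk.corpus.stopwords.words("english")` after running `nltk.download("stopwords")`.

```python
# Tiny stand-in for NLTK's English stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "is", "of", "to", "in", "and", "that"}


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


tokens = remove_stopwords("The mayor is said to be in talks")
```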
PORTERSTEMMER

The Porter stemming algorithm (or 'Porter stemmer') is a process for
removing the commoner morphological and inflexional endings from words
in English. It reduces each word to its stem, which is not always a dictionary word.
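A sketch of stemming with NLTK's `PorterStemmer`, which needs no extra data download. The word list is chosen only for illustration.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

words = ["running", "studies", "connected", "connection"]

# Each word is reduced to its stem; related forms collapse to the same stem.
stems = [stemmer.stem(word) for word in words]
```

Note that a stem such as "studi" (from "studies") is not itself an English word. That is expected, because the goal is only to map related forms to one token.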
TFIDFVECTORIZER

Term frequency-inverse document frequency (TF-IDF) is a text vectorization
scheme that transforms text into a usable numeric vector. It combines two
concepts: term frequency (TF) and inverse document frequency (IDF). The
term frequency is the number of occurrences of a specific term in
a document. The inverse document frequency down-weights terms that
appear in many documents.
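A sketch of TF-IDF vectorization with scikit-learn's `TfidfVectorizer`, using default settings and three made-up documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical, already-cleaned documents.
docs = [
    "aliens endorse mayor",
    "council approves budget",
    "mayor approves budget",
]

vectorizer = TfidfVectorizer()

# Learn the vocabulary and IDF weights, then return a sparse matrix
# with one row per document and one column per vocabulary term.
X = vectorizer.fit_transform(docs)

# Mapping from each term to its column index.
vocab = vectorizer.vocabulary_

# "aliens" occurs in one document and "mayor" in two, so within the
# first document "aliens" receives the larger weight.
weight_aliens = X[0, vocab["aliens"]]
weight_mayor = X[0, vocab["mayor"]]
```

With the defaults, each row is scaled to unit length, so weights are comparable across documents of different sizes.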
TRAIN-TEST SPLIT

The train-test split is used to estimate the performance of machine
learning algorithms in prediction-based applications. The data is
divided into a training set, used to fit the model, and a test set, held
back until evaluation. This fast and easy procedure lets us compare the
model's predictions with the known labels of data it has not seen.
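A sketch of splitting data with scikit-learn's `train_test_split`. The 80/20 ratio, the stratification, and the `random_state` value are assumptions for illustration, and the data is synthetic.

```python
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for article texts and their labels.
texts = [f"article {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]

# Hold out 20% for testing; stratify keeps the class balance in both
# parts, and random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=2
)
```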
LOGISTIC REGRESSION

Logistic regression is a machine learning classification algorithm that is
used to predict the probability of each value of a categorical dependent
variable. In binary logistic regression, the dependent variable
contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.).
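A sketch of fitting scikit-learn's `LogisticRegression` on a toy problem. The single numeric feature is made up; in the project the inputs would be the vectorized article text.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical single feature per sample, with binary labels coded 0 or 1.
X = np.array([[0], [1], [2], [6], [7], [8]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

new_samples = np.array([[1], [7]])

# predict_proba returns one row per sample: [P(y=0), P(y=1)].
proba = model.predict_proba(new_samples)

# predict returns the class with the higher probability.
pred = model.predict(new_samples)
```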
ACCURACY SCORE

The accuracy_score function in scikit-learn returns either the
fraction or the count of correct predictions. For binary classification,
the fraction is the sum of true positives and true negatives divided by
the total number of predictions.
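A sketch of both modes of `accuracy_score`, using made-up labels. Here 3 true positives and 3 true negatives out of 8 predictions give (3 + 3) / 8.

```python
from sklearn.metrics import accuracy_score

# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Default: fraction of predictions that match the true labels.
fraction = accuracy_score(y_true, y_pred)

# normalize=False: number of correct predictions instead of the fraction.
count = accuracy_score(y_true, y_pred, normalize=False)
```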
