
ANALYSIS OF MACHINE LEARNING ALGORITHMS FOR FAKE OR REAL DISASTER TWEETS

Domain: Machine Learning

Ankit Kumar[RA1811003020467]

Praneeth[RA1811003020482]
INTRODUCTION:

• Twitter has become an important communication channel in times of emergency.
The ubiquity of smartphones enables people to announce an emergency they are
observing in real time. Because of this, more agencies (e.g. disaster relief
organizations and news agencies) are interested in programmatically monitoring Twitter.

• However, it is not always clear whether a person's words are actually announcing a
disaster. In this paper we study approaches to predicting whether a tweet is real or fake.
The mechanism involves collecting a dataset from tweets posted on Twitter; the result is
derived from the extracted information. The expected output is whether the person
tweeted fake or real news. This work uses machine learning algorithms and classifiers
such as Logistic Regression. The paper also demonstrates an example in which the Twitter
scraping tool Twint is used to detect whether a given tweet is fake or real.
PROBLEM STATEMENT

• It is not always clear whether a person's words are actually announcing a
disaster. In this paper we study approaches to predicting whether a tweet is
real or fake. The mechanism involves collecting a dataset from tweets posted on
Twitter; the result is derived from the extracted information. The expected
output is whether the person tweeted fake or real news. This work uses machine
learning algorithms and classifiers such as Logistic Regression. The paper also
demonstrates an example in which the Twitter scraping tool Twint is used to
detect whether a given tweet is fake or real.
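
As a rough illustration of the classification step described above, the following is a minimal sketch of logistic regression over bag-of-words counts, written in plain Python. The toy tweets, labels, and the function names `tokenize`, `predict_proba` and `train` are made up for this example; they are not the project's dataset or code, and a real system would more likely use a library implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def predict_proba(weights, bias, text):
    """Return the estimated probability that a tweet reports a real disaster."""
    counts = Counter(tokenize(text))
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in counts.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(tweets, labels, epochs=200, lr=0.1):
    """Fit word weights with stochastic gradient descent on the logistic loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, y in zip(tweets, labels):
            error = predict_proba(weights, bias, text) - y
            for tok, n in Counter(tokenize(text)).items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * n
            bias -= lr * error
    return weights, bias


# Hypothetical toy data: 1 = real disaster, 0 = not a disaster.
TWEETS = [
    "Forest fire spreading near the highway, residents evacuated",
    "Earthquake damaged several buildings downtown",
    "Flood waters rising, roads closed",
    "This new album is fire",
    "My exam was a total disaster lol",
    "Traffic was an earthquake of honking today",
]
LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = train(TWEETS, LABELS)
for tweet in TWEETS:
    label = "real" if predict_proba(weights, bias, tweet) > 0.5 else "fake"
    print(f"{label}: {tweet}")
```

With so few examples the model only memorizes its training set; the point is the shape of the pipeline (tokenize, weight words, squash to a probability), not the accuracy.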
LITERATURE REVIEW

Base paper
Title: Analysis of Machine Learning Algorithm for Predicting Depression
Author: Ms. Purude Vaishali Narayanrao
Year: 2020
Objective: In this paper, different approaches to predicting depression are studied in detail. The mechanisms include
collecting a dataset through questionnaires given to the person, posts on social media, text used in
verbal communication, and facial expressions. The result is derived from the extracted information.
The expected output is whether or not the person needs attention.
Techniques Used: Machine learning classifiers such as Decision Trees, SVM, Naive Bayes,
Logistic Regression and KNN.
Advantages:
1. These networks can learn from examples and apply what they learned when a similar event arises, making them able
to work through real-time events.
2. Even if a neuron is not responding or a piece of information is missing, the network can detect the fault
and still produce the output.
Limitations: Unexplained behavior of the network. This is the most important problem of ANNs. When an ANN produces
a solution, it does not give a clue as to why and how. This reduces trust in the network.

Title: Multimodal Analysis of Disaster Tweets
Authors: Ajit Kumar, Aakash Kumar Gautam, Luv Misra, Kush Misra
Year: 2020
Techniques Used:
1. Naïve Bayes
2. LSTM
3. Random Forest
Advantages:
1. Baseline results of various deep-learning-based techniques for different disasters on
CrisisMMD and CleanCrisisMMD, respectively. Using the pretrained deep learning models, it can be
concluded that the best accuracy for the image-based modality is achieved by the
InceptionV3 model.
2. The major contribution of this work is the improved performance of a logistic-regression-based decision
policy for incorporating both the text and image modalities of tweets.
Limitations:
1. The momentum variant is usually faster than simple gradient descent, since it allows higher learning
rates while maintaining stability, but it is still too slow for many practical applications.
2. Backpropagation will not always find the correct weights for the optimum solution. The network may need to be
reinitialized and retrained several times to be confident of having the best solution, which
increases the time complexity.
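
The momentum limitation above can be made concrete with a small sketch. It compares plain gradient descent with the momentum variant on a one-dimensional quadratic; the function `minimize`, the objective, and the step sizes are illustrative assumptions, not values taken from the reviewed paper.

```python
def minimize(grad, x0, lr, momentum, steps):
    """Gradient descent with an optional momentum (velocity) term."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        # The velocity accumulates a decaying sum of past gradients.
        velocity = momentum * velocity - lr * grad(x)
        x += velocity
    return x


def grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2, minimized at x = 3."""
    return 2.0 * (x - 3.0)


plain = minimize(grad, x0=0.0, lr=0.01, momentum=0.0, steps=100)
with_momentum = minimize(grad, x0=0.0, lr=0.01, momentum=0.9, steps=100)

print("distance to optimum, plain:   ", abs(plain - 3.0))
print("distance to optimum, momentum:", abs(with_momentum - 3.0))
```

With the same learning rate and step budget, the momentum run ends much closer to the minimum on this toy problem, which matches the "usually faster" claim; it says nothing about whether either variant is fast enough in practice.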

Title: Activation Functions: Comparison of Trends in Practice and Research for Deep Learning
Authors: Chigozie Enyinna Nwankpa, Winifred Ijomah, Anthony Gachagan, and Stephen Marshall
Year: 2018
Techniques Used: Sigmoid function, Hyperbolic Tangent (Tanh) function, Softmax function, Rectified Linear Unit (ReLU)
function, Exponential Linear Unit (ELU) function, Swish function, Leaky ReLU (LReLU)
Advantages:
1. Across the studies, the inputs are facial expressions, video, text, posts and comments on social
media, and behavioral features. Videos of the population are captured and used as input
for detecting depression. Facial expressions are identified from the captured video, but expression
changes depending on pose, lighting, person, angle and sensors, so it is a challenging task to
encode features to detect depression.
Limitations:
1. The information present on Twitter relating to a disaster can vary greatly. Often there is also scope for
irrelevant or misleading information being distributed across the platform. Humanitarian organizations also
do not want to deal with noisy data that is of a personal nature and does not contain any important
information. The classification provided by the proposed methodology can be used to gain situational
awareness about a particular disaster happening in the world.
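
For reference, the activation functions listed under Techniques Used for the paper above can be written in a few lines each. This is a minimal scalar sketch using their standard textbook definitions; the default `slope` and `alpha` values are common choices assumed here, not values taken from the paper.

```python
import math


def sigmoid(x):
    """Squash x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squash x into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Pass positive values, zero out negative ones."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Like ReLU, but keep a small slope for negative inputs."""
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):
    """Exponential Linear Unit: smooth saturation towards -alpha for negative inputs."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def swish(x):
    """Swish: the input scaled by its own sigmoid."""
    return x * sigmoid(x)


def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("elu", elu), ("swish", swish)]:
    print(name, [round(fn(x), 4) for x in (-2.0, 0.0, 2.0)])
print("softmax", [round(p, 4) for p in softmax([1.0, 2.0, 3.0])])
```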

Title: Categorization of Earthquake-Related Tweets Using Machine Learning Approaches
Authors: Lany L. Maceda, Jennifer L. Llovido, Arlene A. Satuito
Year: 2020
Techniques Used: SVM classifier
Advantages:
1. As described in the paper, related work investigated the real-time interaction of events such as earthquakes on Twitter
and proposed an algorithm to monitor tweets and detect a target event. To detect a target event, they
devised a classifier of tweets based on features such as the keywords in a tweet, the number of words, and
their context. Similarly, the main purpose of the study is to use Twitter as a text corpus
for attaining disaster risk reduction.
Limitations:
1. The study classifies earthquake-related tweets using supervised machine learning algorithms. The
performance evaluation showed that SVM, using a 15-fold validation method,
consistently obtained a high performance rating compared to Linear Regression. Linear logistic
regression may, however, also be considered for classifying multiple classes of small-size data, since it
also showed a good performance rating, next to SVM.
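
To make the 15-fold validation mentioned above concrete, here is a minimal sketch of how k-fold splits can be generated. It assumes plain, unshuffled, contiguous folds; whether the paper shuffled or stratified its data is not stated here, and the helper name `k_fold_indices` and the sample count of 45 are made up for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=45, k=15))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: train on {len(train)} samples, test on {test}")
```

Each sample is used for testing exactly once and for training in the other 14 folds; the reported score is then typically the average over the folds.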

Title: Research Paper on Basic of Artificial Neural Network
Authors: Ms. Sonali B. Maind, Ms. Priyanka Wankar
Year: 2014
Techniques Used:
1. Initializing the weights.
2. Supervised training: the inputs and the outputs are provided. The network then processes the inputs and
compares its resulting outputs against the desired outputs.
3. Unsupervised, or adaptive, training: the network is provided with inputs but not with desired outputs. The
system itself must then decide what features it will use to group the input data.
Advantages:
Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial
experience.
Pattern recognition: a powerful technique for harnessing the information in the data and generalizing about
it. Neural nets learn to recognize the patterns which exist in the data set.
Limitations: Hardware dependence. Artificial neural networks require processors with parallel processing power, by
their structure. For this reason, their realization depends on the available hardware.
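
The first two steps listed under Techniques Used above (initializing the weights, then supervised training that compares outputs with desired outputs) can be sketched with a single-neuron perceptron. The AND-gate data, the learning rate, and the function names are assumptions chosen for illustration; the reviewed paper does not prescribe this particular example.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Compute the neuron's output for one input vector."""
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, targets, lr=1.0, epochs=20):
    """Supervised training: nudge the weights whenever output and target differ."""
    weights = [0.0] * len(samples[0])  # step 1: initialize the weights
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            error = target - predict(weights, bias, x)  # desired minus actual
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical training set: the logical AND of two binary inputs.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TARGETS = [0, 0, 0, 1]

weights, bias = train_perceptron(AND_INPUTS, AND_TARGETS)
for x in AND_INPUTS:
    print(x, "->", predict(weights, bias, x))
```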

Title: Artificial neural networks for neuroscientists: A primer
Authors: Guangyu Robert Yang, Xiao-Jing Wang
Year: 2020
Techniques Used:
1. Backpropagation with gradient momentum
2. Activation functions (ReLU, leaky ReLU, softmax, etc.)
3. Variation and analysis of artificial, convolutional and recurrent neural networks
Advantages: Discusses a complete analysis of all types of neural networks with a proper set of examples.
Limitations: Hardware dependence. Artificial neural networks require processors with parallel processing power, by
their structure. For this reason, their realization depends on the available hardware.
ARCHITECTURE OF ARTIFICIAL NEURAL NETWORKS
Project Flow Chart.
This is the proposed flow chart of the system.

Data Flow Diagram

Proposed Model

General Architecture of the System

• A Twitter crawler component collects tweets and adds them to our database. When
we need tweets from trustworthy sources to compare with the current one, we can retrieve them
directly from the database.
• The processing module: when a user wants to know the credibility of a
new tweet, they enter the link to the tweet in our interface.
• Our algorithm then uses an NER (Named Entity Recognition) component, which splits the text
into its component parts: it brings out the entities (generally nouns, and their relative
importance in the context), the topics, the social tags, the overall tweet sentiment and the
hashtag sentiment.
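
As a rough illustration of splitting a tweet into parts, the sketch below pulls out hashtags, mentions and capitalized word runs with regular expressions. This is only a crude stand-in: it is not a real NER model, it does not compute topics or sentiment, and the function name `split_tweet` and the sample tweet are hypothetical.

```python
import re


def split_tweet(text):
    """Split a tweet into hashtags, mentions and candidate entities."""
    hashtags = re.findall(r"#(\w+)", text)
    mentions = re.findall(r"@(\w+)", text)
    # Drop hashtags, mentions and URLs before looking for entity candidates.
    cleaned = re.sub(r"#\w+|@\w+|https?://\S+", " ", text)
    # Heuristic stand-in for NER: runs of capitalized words are candidates.
    # This also picks up ordinary sentence-initial words, which real NER would not.
    entities = re.findall(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", cleaned)
    return {"hashtags": hashtags, "mentions": mentions, "entities": entities}


parts = split_tweet(
    "Flooding reported in New Orleans near Lake Pontchartrain #flood @NWS"
)
for name, values in parts.items():
    print(name, values)
```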
MAIN MODULES
Interactions between main modules
Core module: this module aggregates all components and orchestrates their behavior.
It exposes the endpoint where processing begins; behind the scenes, online and offline processing is started by
requests from users or from another module. Online processing means a decision about a tweet: given the
URL of a tweet, the application returns, in real time, a response about its trustworthiness.
Offline processing, on the other hand, is a mechanism for evaluating a user's progress.
Collector module: collecting information is vital for this application. An outdated system is useless and can
lead to a wrong decision about the trustworthiness of a user or tweet.
This module collects users and tweets and triggers the core module to start offline processing whenever
needed. At startup it is initialized with some trusted sources that increase the validity of the content.
Database module: this module handles requests to the database to add, remove, update and retrieve data.
Implementing the persistence layer in one place reduces the architectural complexity of every other module.
Moreover, using a programming language that exploits the capabilities of the chosen database gives a
generous performance bonus.
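
The interaction between the three modules can be sketched as follows. This is a hypothetical skeleton, not the project's implementation: the class and method names, the in-memory dictionary standing in for the database, and the placeholder verdict logic are all assumptions made for illustration.

```python
class Database:
    """In-memory stand-in for the persistence layer (add, remove, update, retrieve)."""

    def __init__(self):
        self._records = {}

    def add(self, url, record):
        self._records[url] = record

    def update(self, url, **fields):
        self._records[url].update(fields)

    def remove(self, url):
        self._records.pop(url, None)

    def retrieve(self, url):
        return self._records.get(url)


class Collector:
    """Collects tweets, stores them, and notifies the core to process them offline."""

    def __init__(self, database, on_collected):
        self.database = database
        self.on_collected = on_collected

    def collect(self, url, text, trusted=False):
        self.database.add(url, {"text": text, "trusted": trusted})
        self.on_collected(url)


class Core:
    """Aggregates the other modules and exposes online and offline processing."""

    def __init__(self):
        self.offline_log = []
        self.database = Database()
        self.collector = Collector(self.database, self.process_offline)

    def process_offline(self, url):
        # Placeholder: a real system would re-evaluate the user or tweet here.
        self.offline_log.append(url)

    def check_tweet(self, url):
        """Online processing: return a verdict for the tweet at this URL."""
        record = self.database.retrieve(url)
        if record is None:
            return "unknown"
        return "trusted" if record["trusted"] else "unverified"


core = Core()
core.collector.collect("https://example.com/tweet/1", "Flood warning issued", trusted=True)
print(core.check_tweet("https://example.com/tweet/1"))
print(core.check_tweet("https://example.com/tweet/2"))
```

Keeping all storage calls behind `Database` is what lets the collector and core stay unaware of how records are persisted, which is the complexity reduction the module description refers to.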
IMPLEMENTATION
THANK YOU
