
JSS MAHAVIDYAPEETHA

JSS ACADEMY OF TECHNICAL EDUCATION


Dr. Vishnuvardhan Road, Bengaluru-560060

DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING


PROJECT PHASE-2 + SEMINAR

SUBMITTED BY:
DEEPIKA .N (1JS15IS023)
PRATHIBA P. S (1JS15IS049)
VARSHA .S (1JS15IS058)
NEHA JANGHEL (1JS15IS092)

GUIDED BY:
Dr. DV Ashoka
Professor, Dept. of ISE, JSSATE
DETECTION OF FAKE HUMAN IDENTITIES USING MACHINE LEARNING TECHNIQUES
ABSTRACT

The number of people using Social Media Platforms (SMPs) is
increasing day by day. A few users may hide their identity with
malicious intentions. Relatively little research has been done on
detecting fake accounts created by humans. In contrast, previous
research has detected fake accounts created by bots using machine
learning concepts. These ML models use engineered features such as
the 'following-to-followers ratio', which can generally be derived
from information available in the accounts. In previous studies, the
same set of features that was applied to bots was also applied to a
set of human accounts in the hope of detecting fake identities on
SMPs. That research achieved only 49.75% detection of fake human
accounts. This may be because the characteristics and behaviours of
human accounts differ from those of bots. Fake accounts created by
humans are also much less frequent than fake accounts generated by
bots, so machine learning models may overlook these sparse
deceptions in the mass.
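
As an illustration of the engineered feature named above, the following is a minimal sketch of how a following-to-followers ratio might be computed. The function name and the handling of accounts with zero followers are assumptions made for this example and are not taken from the earlier studies.

```python
def following_to_followers_ratio(following_count, followers_count):
    """Return following/followers for one account."""
    if followers_count == 0:
        # Assumption: an account that follows others but has no followers
        # gets an unbounded ratio; an empty account gets 0.0.
        return float("inf") if following_count > 0 else 0.0
    return following_count / followers_count
```

A high ratio means the account follows many more users than follow it back, which is one of the signals used for bot detection.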

MODULES :
1. Data extraction and data cleaning
2. Data training and algorithm testing
3. Prediction and final output

MODULE 1
Data extraction and data cleaning
DATA EXTRACTION

➜ Create a Twitter account and log in.
➜ Go to developer.twitter.com
➜ Complete the required formalities for authentication.
➜ Create an app and obtain the consumer and access keys needed to
extract data from the Twitter database.

TWEEPY
Tweepy is an open-source library, hosted on GitHub, that enables
Python to communicate with the Twitter platform and use its API.
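
A minimal sketch of how the consumer and access keys could be used with Tweepy to obtain tweets in JSON form. It assumes Tweepy 4.x and an app whose access level permits the search endpoint. The environment variable names and both function names are hypothetical and chosen only for this example.

```python
import os

# Hypothetical environment variable names; the slides do not say how keys are stored.
KEY_NAMES = ("TWITTER_CONSUMER_KEY", "TWITTER_CONSUMER_SECRET",
             "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_SECRET")

def load_credentials(env=os.environ):
    """Return the four keys in order, failing early if any is missing."""
    missing = [name for name in KEY_NAMES if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return tuple(env[name] for name in KEY_NAMES)

def fetch_tweets_as_json(query, count=100, env=os.environ):
    """Fetch recent tweets matching `query` and return them as JSON-style dicts."""
    import tweepy  # imported lazily so load_credentials works without Tweepy installed
    consumer_key, consumer_secret, access_token, access_secret = load_credentials(env)
    auth = tweepy.OAuth1UserHandler(consumer_key, consumer_secret,
                                    access_token, access_secret)
    api = tweepy.API(auth, wait_on_rate_limit=True)
    # Each Status object keeps the raw API payload in its _json attribute.
    return [status._json for status in api.search_tweets(q=query, count=count)]
```

Keeping the keys in environment variables rather than in the script avoids committing them to the project repository.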

FLOWCHART OF PROJECT PROGRESS

1. Create a Twitter account and log in to developer.twitter.com
2. Create an app and obtain the authentication keys
3. Use the keys to obtain tweets in JSON format
4. Convert the JSON data to lower case
5. Keep only the unique tweets, ignoring retweets, and save them in
a .csv file
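
The last steps of the flowchart (lower-casing, dropping retweets, keeping unique tweets, and saving to .csv) can be sketched with the Python standard library alone. This is an illustrative sketch, not the project's actual code: the function names are hypothetical, and it assumes each tweet is a dict with a `text` field in which retweets either carry a `retweeted_status` key or begin with "RT @".

```python
import csv

def is_retweet(tweet):
    """Assumed retweet markers: a retweeted_status key or a leading 'RT @'."""
    return "retweeted_status" in tweet or tweet.get("text", "").startswith("RT @")

def unique_lowercased_texts(tweets):
    """Lower-case tweet texts, skip retweets, and drop duplicates (order preserved)."""
    seen = set()
    result = []
    for tweet in tweets:
        if is_retweet(tweet):
            continue
        text = tweet.get("text", "").lower()
        if text and text not in seen:
            seen.add(text)
            result.append(text)
    return result

def save_to_csv(texts, path):
    """Write one tweet text per row under a single 'text' column."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text"])
        writer.writerows([text] for text in texts)
```

The retweet check runs before lower-casing so that the "RT @" prefix is still recognisable.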
SNIPPETS OF CODE

FUTURE ENHANCEMENTS:

➜ Data cleaning using regular expressions: this consists of the
removal of special characters (e.g. *, #, /, \, :, @)
➜ Module 2 would consist of 3 main sub-parts:
1) Training of the cleaned corpus using NLP
2) Implementation of algorithms, i.e. Naive Bayes and logistic
regression
3) Sentiment
➜ Module 3 would consist of:
1) Prediction rates of fake accounts detected w.r.t. both the
algorithms
2) Display of the prediction charts using the Tableau or Power BI
visualization tool
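
The planned cleaning step and one of the two planned algorithms can be sketched as follows. This is only an illustration under stated assumptions: the regular expression covers just the special characters listed above, and the `NaiveBayes` class is a small from-scratch multinomial Naive Bayes with add-one smoothing, written here as a stand-in rather than the implementation the project will use. All names are hypothetical, and logistic regression is not shown.

```python
import math
import re
from collections import Counter, defaultdict

# Special characters named in the enhancement list: * # / \ : @
SPECIAL_CHARS = re.compile(r"[*#/\\:@]")

def clean_text(text):
    """Replace the listed special characters with spaces and collapse whitespace."""
    return " ".join(SPECIAL_CHARS.sub(" ", text).split())

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(clean_text(text).split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(doc_count / total_docs)
            for word in clean_text(text).split():
                score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.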

Thank You!!
