ON
IDENTIFICATION OF FAKE PROFILES USING DEEP
LEARNING TECHNIQUES
Sh. Sanjeevini
Associate Professor
CERTIFICATE
EXTERNAL EXAMINER
MALLA REDDY ENGINEERING COLLEGE FOR WOMEN
Autonomous Institution, UGC, Govt. of India
Accredited by NBA & NAAC with ‘A’ Grade
NIRF Indian Ranking, Accepted by MHRD, Govt. of India
Affiliated to JNTUH, Approved by AICTE, ISO 9001:2015 Certified Institution Maisammaguda,
Dullapally(post), Secunderabad, TELANGANA
DECLARATION
We ‘E. Mahalakshmi (21RH5A6201), L. Savithri (21RH5A6203) and M. Amulya
(21RH5A6204)’, are students of ‘Bachelor of Technology in CSE (Cyber Security)’, session: 2022
- 2023, Malla Reddy Engineering College for Women, Maisammaguda, Secunderabad, hereby declare
that the work presented in this Innovative Product Development entitled ‘IDENTIFICATION OF
FAKE PROFILES USING DEEP LEARNING TECHNIQUES’ is the outcome of our own
bonafide work and is correct to the best of our knowledge and this work has been undertaken taking
care of Engineering Ethics. It contains no material previously published or written by another person
nor material which has been accepted for the award of any other degree or diploma of the university
or other institute of higher learning, except where due acknowledgment has been made in the text.
Date:
E. Mahalakshmi (21RH5A6201)
L. Savithri (21RH5A6203)
M. Amulya (21RH5A6204)
ACKNOWLEDGEMENT
We feel ourselves honoured and privileged to place our warm salutation to our college
Malla Reddy Engineering College for Women and Department of CSE- CYBER
SECURITY which gave us the opportunity to have expertise in engineering and profound
technical knowledge.
We would like to deeply thank our Honourable Minister of Telangana State, Sri. Ch.
Malla Reddy Garu, founder chairman of MRGI, the largest cluster of institutions in the state
of Telangana, for providing us with all the resources in the college to make our project a success.
We wish to convey our gratitude to our Principal, Dr. Y. Madhavee Latha, for providing
us with the environment and means to enrich our skills, for motivating us in our endeavour,
and for helping us to realize our full potential.
We express our sincere gratitude to Mrs. A. Radha Rani, Director and Professor of
Computer Science and Engineering for inspiring us to take up a project on this subject and
successfully guiding us towards its completion.
We express our sincere gratitude to Dr. Putta Srivani, Associate Professor and
Head of the Department of Cyber Security for inspiring us to take up a project on this
subject and successfully guiding us towards its completion.
We would like to thank our internal guide Sh. Sanjeevini, and all the faculty members for
their valuable guidance and encouragement towards the completion of our project work
Innovative Development – 3.
Social networking platforms are now a common aspect of daily life for most
people. Every day, a large number of people create profiles on social networking sites
and interact with others, regardless of their location or the time of day. Social networking
platforms not only benefit users but also put their security and personal information
at risk. To find out who is spreading threats on social media, we must classify user
profiles. Classification allows us to distinguish between legitimate and fake profiles on
social networks. A range of methods is generally employed for classifying fraudulent
profiles, and the accuracy of fake profile identification systems still needs to be
improved. In this work, we propose machine learning and natural language processing
(NLP) approaches for fraudulent profile detection. Both the Naive Bayes algorithm and
the Support Vector Machine (SVM) can be employed.
CONTENTS
1. INTRODUCTION
2. SYSTEM ANALYSIS
2.1. EXISTING SYSTEM AND DISADVANTAGES
2.2. PROPOSED SYSTEM AND ADVANTAGES
3. SYSTEM STUDY
3.1. ECONOMIC FEASIBILITY
3.2. TECHNICAL FEASIBILITY
3.3. SOCIAL FEASIBILITY
4. SYSTEM REQUIREMENTS AND SPECIFICATIONS
4.1. FUNCTIONAL REQUIREMENTS
4.2. SOFTWARE REQUIREMENTS
4.4. HARDWARE REQUIREMENTS
5. SYSTEM DESIGN
5.1. UML DIAGRAMS
5.1.1. CLASS DIAGRAM
5.1.2. USE CASE DIAGRAM
5.1.3. SEQUENCE DIAGRAM
5.1.4. ACTIVITY DIAGRAM
6. SYSTEM ENVIRONMENT
6.1. PYTHON
6.2. DJANGO-ORM
7. IMPLEMENTATION
8. SYSTEM TESTING
8.1. UNIT TESTING
8.2. FUNCTIONAL TESTING
8.3. SYSTEM TESTING
8.4. WHITE BOX TESTING
8.5. BLACK BOX TESTING
8.6. INTEGRATION TESTING
8.7. ACCEPTANCE TESTING
9. OUTPUT SCREENS
10. FUTURE SCOPE
11. CONCLUSION
LIST OF FIGURES
1 Class Diagram
3 Sequence Diagram
4 Activity Diagram
1.INTRODUCTION
In today's digital age, the ever-increasing dependency on computer technology has left the
average citizen vulnerable to crimes such as data breaches and identity theft. Nearly
everyone in the present generation is associated with social media, and these platforms have
changed the way we pursue our social lives. Online social networks have an impact on
science, education, employment, business, and more, and researchers have been studying them
to understand the effect they have on people. Because social networking sites are now used
by almost everyone, they also attract spam and abuse: some users create fake accounts and
fake profiles or carry out fake bank transactions. In this project, we use an artificial
neural network (ANN) and classification techniques to identify whether an account is fake
or genuine.
1.1 MOTIVATION
It is not appropriate to use deep learning or any other technology for creating fake profiles.
Creating fake profiles is unethical and can be harmful to individuals and society. It can also be
illegal in many cases. Deep learning is a powerful tool that can be used for many useful
purposes, such as improving image and speech recognition, language translation, and
predictive modeling. However, it is important to use it responsibly and ethically. It is not
acceptable to use it to create fake profiles or engage in any other deceptive or fraudulent
activities.
for political, financial, or personal gain.
Cyberbullying: Some people create fake profiles to harass, intimidate, or bully others
online. This can be a form of cyberbullying, which can have serious consequences for the
victims.
Phishing: Some people create fake profiles to trick others into giving them sensitive
information, such as passwords or financial information. This is known as phishing and can be
used to steal money or commit other crimes.
Marketing: Some businesses or individuals create fake profiles to promote products or
services. This can be unethical, as it may not be clear to the audience that the profile is fake
and the content is paid for.
Catfishing: Some people create fake profiles to deceive others into thinking they are someone
else, often for romantic or sexual purposes. This is known as catfishing, and it can be harmful
to both the person being deceived and the person behind the fake profile.
It's important to be aware of fake profiles and the potential risks they can pose. If you come
across a fake profile, it's a good idea to report it to the platform or service where it was
created and to be cautious about interacting with it.
2.SYSTEM ANALYSIS
2.1 INTRODUCTION
Many fake account detection approaches rely on the analysis of individual social network
profiles, with the aim of identifying the characteristics, or combinations of
characteristics, that help distinguish legitimate accounts from fake ones. In particular,
various features are extracted from the profiles and posts, and machine learning
algorithms are then used to build a classifier capable of detecting fake accounts.
For instance, Nazir et al. (2010) describe detecting and characterizing phantom profiles in
online social gaming applications. The article analyses a Facebook application, the online
game “Fighters club”, known to provide incentives and gaming advantages to users who
invite their peers into the game. The authors argue that by providing such incentives the
game motivates its players to create fake profiles. By introducing those fake profiles into
the game, a user increases the incentive value for himself or herself.
Adikari and Dutta (2014) describe the identification of fake profiles on LinkedIn. The paper
shows that fake profiles can be detected with 84% accuracy and a 2.44% false negative rate,
using limited profile data as input. Techniques such as neural networks, SVMs, and
principal component analysis are applied. Features such as the number of languages spoken,
education, skills, recommendations, interests, and awards are used, among others.
Characteristics of profiles known to be fake, posted on special websites, are used as
ground truth.
Chu et al. (2010) aim to differentiate Twitter accounts operated by humans, bots, or cyborgs
(i.e., bots and humans working in concert). As part of the detection problem formulation,
spamming accounts are identified with the help of an Orthogonal Sparse Bigram (OSB) text
classifier that uses pairs of words as features.
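To make the idea of word-pair features concrete, the following is a minimal sketch of one common formulation of orthogonal sparse bigrams: each word is paired with each of the few words before it, and the skipped positions are marked. The function name osb_features, the window size, and the "<skip>" marker are illustrative assumptions, not details taken from Chu et al.

```python
def osb_features(text, window=5):
    """Return orthogonal sparse bigram (OSB) features for a piece of text.

    Each token is paired with each of the previous (window - 1) tokens;
    the number of skipped tokens between the pair is encoded with "<skip>".
    This is an illustrative sketch, not the classifier used in the cited work.
    """
    tokens = text.lower().split()
    features = []
    for i, right in enumerate(tokens):
        for gap in range(1, window):
            j = i - gap
            if j < 0:
                break  # ran past the start of the text
            skips = " <skip>" * (gap - 1)
            features.append(f"{tokens[j]}{skips} {right}")
    return features


if __name__ == "__main__":
    for feature in osb_features("win free money now", window=3):
        print(feature)
```

Features of this kind can be counted per message and passed to any text classifier; keeping the skip distance in the feature lets the classifier tell adjacent word pairs apart from pairs that are merely close together.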
Stringhini et al. (2013) analyze Twitter follower markets. They describe the characteristics
of these markets and classify their customers. The authors argue that there are two major
kinds of accounts that follow the “customer”: fake accounts (“sybils”) and compromised
accounts, whose owners do not suspect that their followers list is growing. Customers of
follower markets may be celebrities or politicians aiming to give the appearance of a
larger fan base, or cybercriminals aiming to make their accounts look more genuine so that
they can quickly spread malware and spam.
Friend requests: A fake profile will send friend requests to many users with public profiles.
Fake profile contents: The profile contains a name, e-mail, age, address, gender, etc.
Naive Bayes algorithm: It has lower accuracy.
Some possible features that could be used to classify profiles as real or fake include:
Profile activity: The activity of a user on the social media platform can be used to
identify fake profiles. For example, if a user has a large number of followers or friends but
very little activity (e.g., no posts, comments, or likes), this could be a sign that the
profile is fake.
Network structure: The structure of a user's network of friends and followers can also
provide clues about whether a profile is real or fake. For example, if a user has a very
large network of friends and followers but very few interactions with them (e.g., no mutual
friends or comments), this could be a sign that the profile is fake.
Classification starts with the selection of the profiles that need to be classified.
Once the profiles are selected, the useful features are extracted for the purpose of
classification. The extracted features are then fed to a trained classifier, which is
retrained regularly as new data becomes available. The classifier then determines whether
the profile is genuine or fake. The result of the classification is verified, and the
feedback is fed back into the classifier. As the amount of training data increases, the
classifier becomes more accurate in predicting fake profiles.
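The loop described above (extract features, classify, verify, feed the verified label back) can be sketched as follows. This is only an illustration under stated assumptions: the profile keys followers, posts, and interactions, the two ratio features, and the small logistic-regression learner are all hypothetical choices, not the model used in this project.

```python
import math


def extract_features(profile):
    """Turn a raw profile dict into a numeric feature vector.

    The keys used here are hypothetical; a real dataset will differ.
    """
    followers = profile.get("followers", 0)
    posts = profile.get("posts", 0)
    interactions = profile.get("interactions", 0)
    # Low ratios indicate a large network with very little activity.
    activity_ratio = posts / (followers + 1)
    interaction_ratio = interactions / (followers + 1)
    return [1.0, activity_ratio, interaction_ratio]  # leading 1.0 is the bias term


class OnlineFakeProfileClassifier:
    """Logistic regression updated one verified example at a time."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        score = sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-score))

    def predict(self, features):
        return 1 if self.predict_proba(features) >= 0.5 else 0  # 1 = fake

    def feedback(self, features, true_label):
        """Feed a verified label back into the classifier (one gradient step)."""
        error = self.predict_proba(features) - true_label
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]


# Made-up labelled profiles for illustration: 1 = fake, 0 = genuine.
labelled = [
    ({"followers": 5000, "posts": 0, "interactions": 2}, 1),
    ({"followers": 150, "posts": 300, "interactions": 450}, 0),
    ({"followers": 8000, "posts": 3, "interactions": 0}, 1),
    ({"followers": 80, "posts": 120, "interactions": 200}, 0),
]

clf = OnlineFakeProfileClassifier(n_features=3)
for _ in range(200):
    for profile, label in labelled:
        clf.feedback(extract_features(profile), label)
```

Because the classifier is updated through feedback() one example at a time, newly verified profiles can be added without retraining from scratch, which mirrors the regular retraining described in the text.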
3. SYSTEM STUDY
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a
very general plan for the project and some cost estimates. During system analysis, the feasibility study
of the proposed system is carried out to ensure that the proposed system is not a burden to the
company. For feasibility analysis, some understanding of the major requirements for the system is
essential. The feasibility analysis covers three considerations:
• ECONOMIC FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY
3.1 ECONOMIC FEASIBILITY
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can put into the research and development
of the system is limited, so the expenditures must be justified. The developed system is well
within the budget, and this was achieved because most of the technologies used are freely
available.
3.2 TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the
available technical resources, as this would in turn place high demands on the client. The
developed system must have modest requirements, as only minimal or no changes are required
for implementing this system.
3.3 SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, but must instead accept it as a necessity. The level of acceptance by
users depends solely on the methods that are employed to educate them about the system
and to make them familiar with it. Their level of confidence must be raised so that they are
also able to offer constructive criticism, which is welcomed, as they are the final users of
the system.
4. SYSTEM REQUIREMENTS AND SPECIFICATIONS
• Front-End - Python
• Back-End - Django-ORM
• RAM - 4GB
5. SYSTEM DESIGN
5.1.2 Use Case Diagram- A use case diagram is used to represent the dynamic behavior of a
system. The main purpose of a use case diagram is to portray the dynamic aspect of a system.
5.1.3 Sequence Diagram- The sequence diagram represents the flow of messages in
the system and is also termed an event diagram. It helps in envisioning several
dynamic scenarios.
5.1.4 Activity Diagram- An activity diagram is basically a flowchart that represents the flow
from one activity to another. An activity can be described as an operation of the
system. The basic purpose of activity diagrams is similar to that of the other diagrams.
6. SYSTEM ENVIRONMENT
6.1 PYTHON
Python is a computer programming language often used to build websites and software,
automate tasks, and conduct data analysis. Python is a general-purpose language, meaning
it can be used to create a variety of different programs and isn't specialized for any
specific problem.
Here is an example of how you could use a supervised learning approach with a
convolutional neural network (CNN) to identify fake profiles on a social media network
in Python:
First, you would need to gather a dataset of real and fake profiles. This could include
features such as the profile's bio, the number and types of friends, the content of posts, and
patterns of activity. You should also label each profile as either real or fake.
Next, you would need to pre-process the data by converting any categorical features into
numerical format and splitting the dataset into training, validation, and test sets.
Then, you would need to define the CNN model and compile it with an appropriate loss
function and optimizer.
After that, you can train the model on the training data and use the validation data to tune
the model's hyperparameters and evaluate its performance.
Once the model is trained and tuned, you can use it to make predictions on the test data
and evaluate its overall performance in terms of metrics such as accuracy, precision, and
recall.
Finally, you can use the trained model to classify new profiles as either real or fake as they
are created on the social media network.
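The pre-processing and evaluation steps in this workflow do not depend on the choice of model, so they can be sketched with the standard library alone. The helper names one_hot, train_val_test_split, and evaluate, the split fractions, and the toy labels below are illustrative assumptions; the CNN itself would be defined with whichever deep learning library is chosen.

```python
import random


def one_hot(values):
    """Convert a list of categorical values into one-hot vectors.

    Categories are sorted so the column order is stable between runs.
    """
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle the rows and split them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall; `positive` marks the fake class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


if __name__ == "__main__":
    print(one_hot(["public", "private", "public"]))
    train, val, test = train_val_test_split(range(10))
    print(len(train), len(val), len(test))
    print(evaluate([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

Reporting precision and recall alongside accuracy matters here because fake profiles are usually a minority class, so a model that labels every profile as real can still score a high accuracy.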
6.2 Django-ORM
Django
Django is a free and open-source, Python-based web framework that follows the model–
template–views architectural pattern. It is maintained by the Django Software Foundation,
an independent organization established in the US.
ORM
ORM stands for Object Relational Mapper. The main goal of an ORM is to move data between
a database and the models in an application. It maps a relation between the database and a
model, so that object attributes correspond to the fields of a table. The main advantage of
using an ORM is that it makes development faster and less error-prone. Essentially, it
largely removes the need to write SQL code by hand.
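The mapping between object attributes and table fields can be illustrated with a small hand-written mapper built on Python's sqlite3 module. This is a teaching sketch only and not how Django's ORM is implemented; the Profile fields and the TinyMapper class are hypothetical names introduced for the example.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Profile:
    # Hypothetical fields chosen for illustration.
    username: str
    friend_count: int
    label: str  # "real" or "fake"


class TinyMapper:
    """Maps a dataclass to a table: one attribute per column."""

    def __init__(self, connection, model):
        self.conn = connection
        self.model = model
        self.table = model.__name__.lower()
        self.columns = [f.name for f in fields(model)]
        self.conn.execute(
            f"CREATE TABLE IF NOT EXISTS {self.table} ({', '.join(self.columns)})"
        )

    def save(self, obj):
        """Insert one object as one row."""
        placeholders = ", ".join("?" for _ in self.columns)
        self.conn.execute(
            f"INSERT INTO {self.table} VALUES ({placeholders})", astuple(obj)
        )

    def filter(self, **conditions):
        """Return objects whose columns equal the given values.

        Column names come from the caller's code, values are bound as parameters.
        """
        query = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if conditions:
            query += " WHERE " + " AND ".join(f"{name} = ?" for name in conditions)
        rows = self.conn.execute(query, tuple(conditions.values()))
        return [self.model(*row) for row in rows]


conn = sqlite3.connect(":memory:")
profiles = TinyMapper(conn, Profile)
profiles.save(Profile("alice", 120, "real"))
profiles.save(Profile("bot_4821", 5000, "fake"))
fakes = profiles.filter(label="fake")
```

The calling code works only with Profile objects and keyword filters; the SQL is generated inside the mapper, which is the convenience a full ORM provides on a much larger scale.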
It is possible to use Django's object-relational mapper (ORM) to store and manipulate data
for a deep learning model that is used to identify fake profiles on a social media network.
First, you would need to define your models in Django using the models.Model class. For
example, you might define a Profile model that includes fields for the profile's bio, the
number and types of friends, the content of posts, and patterns of activity, as well as a field
for the label indicating whether the profile is real or fake.
Next, you would need to create a Django project and app, and configure the database
settings in your Django project's settings.py file.
Once you have your Django models and project set up, you can use Django's ORM to
query and manipulate the data in your database. For example, you can use the
Profile.objects.all() method to retrieve all of the profiles in the database, or use filters such
as Profile.objects.filter(label='real') to retrieve only specific subsets of the data.
Technologies used
• Front-end: Python
• Back-end: Django-ORM
7. IMPLEMENTATION
IMPORT LIBRARIES
import os
import numpy as np
import pandas
import category_encoders as ce
import matplotlib.pyplot as plt  # needed for plt.figure / plt.show below
import seaborn as sns  # needed for sns.countplot below
data.info()
To define the data:
data.columns
target:
plt.figure(figsize=(15, 6))
sns.countplot(x=data["target"])
plt.show()
input_data_as_np_array = np.asarray(input_data)
# Reshape the array because we are predicting a single instance
input_data_reshaped = input_data_as_np_array.reshape(1, -1)
prediction = gbc.predict(input_data_reshaped)
if (prediction == 0):
8. SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies, and/or a finished product. It
is the process of exercising software with the intent of ensuring that the software
system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a
specific testing requirement.
TYPES OF TESTS
Integration testing
Integration tests are designed to test integrated software
components to determine if they actually run as one program. Testing is event driven
and is more concerned with the basic outcome of screens or fields. Integration tests
Unit Testing
Unit testing is usually conducted as part of a combined code and unit
test phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
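As an illustration of what a unit test looks like in this setting, the sketch below tests a single feature-extraction function with Python's built-in unittest module. The function activity_ratio and its expected behaviour are hypothetical and stand in for whichever unit of the project is under test.

```python
import unittest


def activity_ratio(posts, followers):
    """Hypothetical feature: posts per follower, 0.0 when there are no followers."""
    if posts < 0 or followers < 0:
        raise ValueError("counts must be non-negative")
    return posts / followers if followers else 0.0


class ActivityRatioTests(unittest.TestCase):
    def test_typical_profile(self):
        self.assertAlmostEqual(activity_ratio(50, 200), 0.25)

    def test_no_followers(self):
        # Division by zero must be avoided for brand-new profiles.
        self.assertEqual(activity_ratio(10, 0), 0.0)

    def test_negative_counts_rejected(self):
        with self.assertRaises(ValueError):
            activity_ratio(-1, 10)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ActivityRatioTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Each test method checks one behaviour of the unit in isolation, so a failure points directly at the faulty case rather than at the system as a whole.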
Test objectives
Features to be tested
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
9. OUTPUT SCREENS
10. FUTURE SCOPE
This technique can also be used, with minor changes, for other social networking sites such
as Facebook, Twitter, and LinkedIn.
The accuracy of the proposed technique can also be improved by using different feature
selection techniques.
Improving the accuracy of the model: One potential area for improvement is to work on
increasing the accuracy of the model.
11. CONCLUSION
Deep learning techniques can be a useful tool for identifying fake profiles on social
media or online dating websites. These techniques can analyze various aspects of a profile,
such as the language used, the social connections, the images, and the activity, to detect
patterns or characteristics that may indicate a fake profile. Fake profiles can be used for
various purposes, such as spamming, phishing, or impersonation, and can pose a risk to
users' privacy and security. It is important to protect your personal information online and
to be wary of suspicious or inappropriate messages or requests.
We use machine learning, namely an artificial neural network, to estimate the likelihood
that a friend request is authentic.
However, it is important to note that deep learning techniques are not foolproof and may
not always identify fake profiles accurately. It is always a good idea to use
multiple methods and approaches to verify the authenticity of a profile, and to be cautious
when interacting with people online.