Professional Documents
Culture Documents
On
DETECTION OF CYBERBULLYING ON SOCIAL
MEDIA USING MACHINE LEARNING
Submitted by,
DECEMBER-2023
MALLA REDDY ENGINEERING COLLEGE
(AUTONOMOUS)
Maisammaguda, Secunderabad, Telangana, India 500100
BONAFIDE CERTIFICATE
SIGNATURE SIGNATURE
Dr. B. HARI KRISHNA Dr. P. S. R. C. MURTY
SUPERVISOR HOD
Associate Professor Professor
Computer Science Engineering Computer Science Engineering
Malla Reddy Engineering College, Malla Reddy Engineering College,
Secunderabad, 500 100 Secunderabad, 500 100
i
MALLA REDDY ENGINEERING COLLEGE
Maisammaguda, Secunderabad, Telangana, India 500100
ACKNOWLEDGEMENT
We express our sincere thanks to our Principal. Dr. A. Ramaswami Reddy, who took
keen interest and encouraged us in every effort during the project work.
We express our heartfelt thanks to our Dr. P .S .R .C Murty, Professor and Head,
Department of Computer Science and Engineering, for his kind attention and valuable
guidance throughout the project work.
We are thankful to our Project Coordinator Dr. Arun Kumar Kandru , Associate
Professor , Department of Computer Science and Engineering, for his cooperation
during the project work .
We are extremely thankful to our Project Guide Dr. B. Hari Krishna, for his constant
guidance and support in completing the project work.
We also thank all the teaching and non-teaching staff of the Department for their
cooperation during the project work.
ii
ABSTRACT
One of the most harmful consequences of social media is the rise of cyberbullying,
which tends to be more sinister than traditional bullying given that online records
typically live on the internet for quite a long time and are hard to control. In this paper,
we present a three-phase algorithm, called BullyNet, for detecting cyberbullies on
Twitter social network. We exploit bullying tendencies by proposing a robust method
for constructing a cyberbullying signed network. We analyze tweets to determine their
relation to cyberbullying, while considering the context in which the tweets exist in
order to optimize their bullying score. We also propose a centrality measure to detect
cyberbullies from a cyberbullying signed network, and we show that it outperforms
other existing measures. We experimented on a dataset of 5.6 million tweets and our
results shows that the proposed approach can detect cyberbullies with high accuracy,
while being scalable with respect to the number of tweets.
Keywords:
Cyberbullying , Protection, Security, Hate, offend, malicious activities, Trolling,
Vulnerable, Social media ,Internet, Harass, Mistreat, Machine-learning.
iii
TABLE OF CONTENTS
ABSTRACT iii
LIST OF FIGURES vii
LIST OF AND ABBREVIATIONS viii
iv
6 SYSTEM REQUIREMENTS 18
6.1. Hardware Requirements 18
6.2. Software Requirements 18
v
9.8. Pie Chart Of Message Prediction Type 38
9.9.User Interface 39
9.10. Sample Output 39
REFERENCES 44-45
vi
LIST OF FIGURES
FIGURE NO. TITLE PAGE NO.
9.1 Homepage 35
vii
LIST OF ABBREVIATIONS
- Signed Network
SN
viii
CHAPTER 1
INTRODUCTION
The Internet has created never before seen opportunities for human interaction and
socialization. In the past decade, social media, in particular, has had a popularity explosion.
From Myspace to Facebook, Twitter, Flickr, and Instagram, people are connecting and
interacting in a way that was previously impossible. The widespread usage of social media
across people of all ages created a vast amount of data for several research topics, including
recommender systems [1], link predictions [2], visualization, and analysis of social
networks [3].
While the growth of social media has created an excellent platform for
communications and information sharing, it has also created a new platform for malicious
activities such as spamming [4], trolling [5], and cyber bullying [6]. According to the Cyber
bullying Research Center (CRC) [7], cyber bullying occurs when someone uses the
technology to send messages to harass, mistreat or threaten a person or a group. Unlike
traditional bullying where aggression is a short and temporary face to- face occurrence,
cyber bullying contains hurtful messages which are present online for a long time. These
messages can be accessed worldwide and are often irrevocable. Laws about cyber bullying
and how it is handled differ from one place to another. For example, in the United States,
the majority of the state’s incorporate cyberbullying into their bullying laws, and cyber
bullying is considered a criminal offense in most of them [8].
Popular social media platforms such as Face book and Twitter are very
vulnerable to cyber bullying due to the popularity of these social media sites and the
anonymity that the internet offers to the perpetrators. Although strict laws exist to punish
cyber bullying, there are very less tools available to effectively combat cyber bullying.
Social media platforms provide users with the option to self-report abusive behavior and
content in addition to providing tools to deal with bullying. For example, Twitter has
features that include locking accounts for a brief period of time or banning the accounts
when the behavior becomes unacceptable. The body of work produced by the research
community with regards to cyber bullying in social networks also needs to be expanded to
get better insights and help develop effective tools and techniques to tackle the issue.
1
To identify cyber bullies in social media, we first need to understand how social
media can be modeled. The common way of modeling relationship in social psychology
[9] is to represent it as a signed graph with a positive edge corresponding to the good intent
and negative edge corresponds to malicious intent between people. Using the signed graph,
we model the Twitter social network as a signed network to represent users’ behavior [10]
where nodes correspond to users, directed edges correspond to communications and/or
relations between the users with assigned weight in the range.
[-1, 1], as illustrated in Figure 1.
2
In this paper, we study the problem of cyber bullying in social media in an attempt to
answer the following research question: Can tweet contexts (conversations) help improve
the detection of cyber bullying in Twitter? Our intuition is that each tweet should be
evaluated not only based on its contents, but also based on the context in which it exists.
We call such a context a conversation, which is a set of tweets between two or more people
exchanging information about a certain subject. Thus, our solution consists of three parts.
First, for each conversation, a conversation graph is generated based on the sentiment and
bullying words in the tweets. Second, we compute the bullying score for each pair of users
in a conversation graph, and then combine all graphs to create an SSN called bullying
signed network (B). The inclusion of negative links can bring out information that would
otherwise be missed with only positive links [16]. Finally, we propose a centrality measure
called attitude & merit (A&M) to detect bullying users from the signed network KB. Our
main contributions are organized as follows:
3
CHAPTER 2
LITERATURE SURVEY
1. M. Di Capua, et al. [1] raises the unsupervised how to develop an online bullying model
based on a combination of features, based on traditional textual elements and other "social
features". Features were divided into 4 categories as Syntactic features, Semantic features,
Sentiment features, and Community features. The author has used the Growing
Hierarchical Self Map editing network (GHSOM), with 50 x grid 50 neurons and 20
elements as the insertion layer. M. Di Capua, and others used an integration algorithm k-
means to separate the input database and GHSOM in the Form spring database. The effects
of this hybrid the unsupervised way surpasses the previous one results. The author then
checked the YouTube database at 3 p.m. Different Machine Learning Models: Naive Bayes
Classifier, Decision Tree Classifier (C4.5), and Support Vector Machine (SVM) with
Linear Kernel. It was saw that the combined effects of hate speech turned around so that
we have lower accuracy in the YouTube database compared to Form Spring tests, as in the
text analysis and syntactical features work differently in on both sides. When this hybrid
method is used in Twitter Database, led to weak memory and F1 Score. The model
proposed by the authors can also be improved and used in building constructive mitigation
applications cyberbullying problems.
2. J. Yadav, et al. [2] suggests a new approach to this the discovery of internet
cyberbullying on social media in using a BERT model with a single line neural and was
tested on the Form spring forum and Wikipedia Database. The proposed model provided
performance 98% accuracy of spring Form databases and 96% accuracy in a relatively
comprehensive Wikipedia database previously used models. The proposed model provided
better Wikipedia database results due to its size g without the need for excessive sampling
while I The spring data form requires multiple samples.
3. R. R. Dalvi, et al. [3] suggests a way to do this detect and prevent online exploitation on
Twitter using Classified supervised machine learning algorithms. In this study, the live
Twitter API is used for compilation tweets and data sets. The proposed model tests both
Support Vector Machine and Naive Bayes on the data sets are collected. To remove a
feature, use it TFIDF vectorizer. The results show that it is accurate of an online bullying
model based on Vector Support. The machine is about 71.25% better than Naive Bayes
was almost 52.75%.
4
4. Trana R.E., et al. [4] The goal was to design a machine learning model to minimize
special events including text extracted from image memes. The author include a site that
contains approximately 19,000 text views that have been published on YouTube. This
study discusses the operation of three learning machines equipment, Uninformed Bayes,
Support Vector The machine, as well as the convolutional neural network used in on the
YouTube website and compare the results with existing details of the Form. They do not
write continuously investigate cyber bullying algorithms sub-sections within the YouTube
website. Naive Bayes beat SVM and CNN in the next four categories: race, nationality,
politics, and general. SVM passed well with the inexperienced Naïve Bayes again CNN is
in the same gender group, with all three algorithms showing equal performance with the
middle body group accuracy. The results of this study provided inaccurate data used to
distinguish between incidents of abuse and non-violence. Future work can focus on the
construction of a two-part separation system used to test text taken from photos to see that
the YouTube website provides a better context for aggression related collections.
5
be used continuously to increase system accuracy. Such a model can and used for adoption
in English and local languages if possible.
7. P. K. Roy, et al. [7] details about creating a request for hate speech on Twitter via the
help of a deep neural convolutional network. Machine learning algorithms such as Logistic
Regression (LR), Random Forest (RF), Naïve Bayes (NB), Support Vector Machine
(SVM), Decision Tree (DT), Gradient Boosting (GB), and K-nearby Neighbors (KNN)
have it used to identify tweets related to hate speech in Twitter and features have been
removed using tf-idf process. The leading ML model was SVM, but it managed to predict
53% hate speech tweets on a 3: 1 database to be tested train. The reason behind the low
forecast scale was unequal data. The model is based on the prediction of hate speech tweets.
Advanced learning-based methods on the Convolutional Neural Network (CNN), Long-
Term Memory (LSTM), and its Content LSTM combinations (CLSTM) have the same
effects as separate distributed database. 10 times cross confirmation was used in
conjunction with the proposed DCNN model once you got a very good rate of memory. It
was 0.88 hate speech and 0.99 non-hate speech. Test results confirmed that the k-fold
opposite verification process is a better resolution with unequal data. In the future, the the
current database can be expanded for better performance accuracy.
8. S. M. Kargutkar, et al. [8] had proposed a program to give the characters twice for
cyberbullying. The system uses Convolutional Neural Network (CNN) and Keras content
testing as appropriate strategies then providing them.
6
CHAPTER 3
SYSTEM ANALYSIS
The first method of determining bullying messages was done using a combination of text-
based analytics and a mix of text and user features. Zhao et al. [18] proposed a text-based
Embeddings-Enhanced Bag-of-Words (EBoW) model that utilizes a concatenation of
bullying features, bag-of-words, and latent semantic features to obtain a final
representation, which is then passed through a classifier to identify cyberbullies.
The second method was aimed at identifying the person behind the cyberbullying
incidents. Squicciarini et al. [22] used Myspace data to create a graph, which integrated
user, textual, and network features. This graph was used to detect cyberbullies and predict
the spreading of bullying behavior through node classification. Galan-García et al. [23]
used supervised machine learning to detect the real users behind troll profiles on Twitter
and demonstrated the technique in a real case of cyberbullying. In a recent paper on
aggression and bullying in Twitter, Chatzakou et al. [24] found cyberbullies and aggressors
using user, text, and network-based features.
7
3.2 EXSISTING SYSTEM DISADVANTAGES
❖ The system is less effective due to lack of Constructing bullying signed network.
❖ The system isn’t effective due to lack of training large scale datasets.
a) Built conversation.
3) Experimented on 5.6 million tweets collected over 6 months. The results show that our
approach can detect cyberbullies with high accuracy, while being scalable with respect to
the number of tweets.
8
3.4 PROPOSED SYSTEM ADVANTAGES
❖ The system is more effective due to the presence of Conversation Graph Generation
Algorithm, Bullying Signed Network Generation Algorithm, and Bully Finding
Algorithm.
❖ The system is more effective due to the techniques to analyze large number of datasets.
9
CHAPTER 4
SYSTEM STUDY
The feasibility of the project is analyzed in this phase and the business proposal is put
forth with a very general plan for the project and some cost estimates. During system
analysis the feasibility study of the proposed system is to be carried out. This is to ensure
that the proposed system is not a burden to the company. For feasibility analysis, some
understanding of the major requirements for the system is essential.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
This study is carried out to check the economic impact that the system will have
on the organization. The amount of funds that the company can pour into the research and
development of the system is limited. The expenditure must be justified. Thus, the
developed system as well within the budget and this was achieved because most of the
technologies used are freely available. Only the customized products had to be purchased.
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not have a high demand for the
available technical resources. This will lead to high demand for the available technical
resources. This will lead to high demands being placed on the client. The developed system
must have a modest requirement, as only minimal or null changes are required for
implementing this system.
10
4.4 SOCIAL FEASIBILITY
The aspect of study is to check the level of acceptance of the system by the user.
This includes the process of training the user to use the system efficiently. The user must
not feel threatened by the system, instead must accept it as a necessity. The level of
acceptance by the users solely depends on the methods that are employed to educate the
user about the system and to make him familiar with it. His level of confidence must be
raised so that he is also able to make some constructive criticism, which is welcomed, as
he is the final user of the system.
11
CHAPTER 5
SYSTEM DESIGN
12
5.1.1 INPUT DESIGN
Input Design plays a vital role in the life cycle of software development, it requires very
careful attention of developers. The input design is to feed data to the application as
accurately as possible. So inputs are supposed to be designed effectively so that the errors
occurring while feeding are minimized. According to Software Engineering Concepts, the
input forms or screens are designed to provide to have a validation control over the input
limit, range and other related validations.
This system has input screens in almost all the modules. Error messages are
developed to alert the user whenever he commits some mistakes and guides him in the right
way so that invalid entries are not made. Let us see deeply about this under module design.
Input design is the process of converting the user created input into a computer-
based format. The goal of the input design is to make the data entry logical and free from
errors. The error is in the input are controlled by the input design. The application has been
developed in user-friendly manner. The forms have been designed in such a way during
the processing the cursor is placed in the position where must be entered. The user is also
provided with in an option to select an appropriate input from various alternatives related
to the field in certain cases.
Validations are required for each data entered. Whenever a user enters an
erroneous data, error message is displayed, and the user can move on to the subsequent
pages after completing all the entries in the current page.
13
created by the administrator himself or a user can himself register as a new user but the
task of assigning projects and validating a new user rest with the administrator only.
The application starts running when it is executed for the first time. The server
has to be started and then the internet explorer in used as the browser. The project will run
on the local area network so the server machine will serve as the administrator while the
other connected systems can act as the clients. The developed system is highly user friendly
and can be easily understood by anyone using it even for the first time.
The goal is for UML to become a common language for creating models of object-
oriented computer software. In its current form UML is comprised of two major
components: a Meta-model and a notation. In the future, some form of method or process
may also be added to; or associated with, UML.
The UML represents a collection of best engineering practices that have proven
successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the
software development process. UML uses mostly graphical notations to express the design
of software projects.
14
5.3 USECASE DIAGRAM
15
5.4 CLASS DIAGRAM
16
5.5 SEQUENCE DIAGRAM
17
CHAPTER 6
SYSTEM REQUIREMENTS
18
CHAPTER 7
SOFTWARE ENVIRONMENT
7.1 PYTHON
• Python is Interactive: You can actually sit at a Python prompt and interact with
the interpreter directly to write your programs.
Invoking the interpreter without passing a script file as a parameter brings up the following
prompt −
$ python
Python 2.4.3 (#1, Nov 11, 2010, 13:34:43) [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on
linux2
>>>
19
Type the following text at the Python prompt and press the Enter −
If you are running new version of Python, then you would need to use print statement with
parenthesis as in print ("Hello, Python!");. However, in Python version 2.4.3, this produces
the following result −
Hello, Python!
Invoking the interpreter with a script parameter begins execution of the script and continues
until the script is finished. When the script is finished, the interpreter is no longer active.
Let us write a simple Python program in a script. Python files have extension .py. Type the
following source code in a test.py file −
Live Demo:
We assume that you have Python interpreter set in PATH variable. Now, try to run this
program as follows −
$ python test.py
Hello, Python!
Let us try another way to execute a Python script. Here is the modified test.py file −
Live Demo
#!/usr/bin/python
We assume that you have Python interpreter available in /usr/bin directory. Now, try to run
this program as follows −
20
$./test.py
Hello, Python!
• Easy-to-learn: Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read: Python code is more clearly defined and visible to the eyes.
• A broad standard library: Python's bulk of the library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh
• Interactive Mode: Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable: Python can run on a wide variety of hardware platforms and has the
same interface on all platforms.
• Extendable: You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more
efficient.
• GUI Programming: Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows
MFC, Macintosh, and the X Window system of Unix.
• Scalable: Python provides a better structure and support for large programs than
shell scripting.
21
Python has a big list of good features:
• It provides very high-level dynamic data types and supports dynamic type
checking.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java
7.3. DJANGO
Django is a high-level web framework that enables developers to build web applications
rapidly by providing a robust foundation of tools, patterns, and components. It follows the
"batteries included" philosophy, which means that it includes many built-in features and
utilities, allowing developers to focus on creating unique and functional applications rather
than reinventing the wheel.
Key Features:
22
3. Admin Interface: Django's admin panel is an automatic, customizable interface for
managing the application's data. It enables developers to create, read, update, and delete
records without having to write extensive code for CRUD operations.
4. URL Routing: Django's URL routing system helps in mapping URLs to specific views.
This decouples the URL structure from the actual views and enhances maintainability.
6. Form Handling: Django provides a robust form handling system that simplifies form
creation, validation, and processing. It includes built-in security features to protect against
common web vulnerabilities like Cross-Site Scripting (XSS) and Cross-Site Request
Forgery (CSRF).
Components:
1. Models: Models define the structure of the application's data and how it interacts with
the database. Django's ORM enables developers to define models using Python classes and
automatically generates the necessary database schema.
2. Views: Views handle the logic behind rendering web pages. They process data, interact
with models, and pass data to templates for rendering.
3. Templates: Templates are responsible for generating HTML dynamically. They allow
developers to mix HTML with template tags and variables, enabling the creation of
dynamic and data-driven pages.
23
4. URLs and Routing: The `urls.py` file defines the URL routing configuration for the
application. It maps URLs to views, allowing users to access different parts of the
application.
5. Forms: Django's form handling system simplifies user input validation and data
processing. It helps in creating and handling HTML forms while providing security against
common web vulnerabilities.
6. Admin: Django's admin interface is a powerful tool for managing application data. It is
automatically generated based on the defined models and allows for easy data management
without writing custom admin views.
Django's rich ecosystem, extensive documentation, and active community make it a go-to
choose for building a wide range of web applications, from simple prototypes to complex,
high-traffic sites.
Django is a high-level Python Web framework that encourages rapid development and
clean, pragmatic design. Built by experienced developers, it takes care of much of the
hassle of Web development, so you can focus on writing your app without needing to
reinvent the wheel. It’s free and open source.
Django's primary goal is to ease the creation of complex, database-driven websites. Django
emphasizes reusability and "pluggability" of components, rapid development, and the
principle of don't repeat yourself. Python is used throughout, even for settings files and
data models.
Django also provides an optional administrative create, read, update and delete interface
that is generated dynamically through introspection and configured via admin models.
24
FIGURE 7.3.1: MVT Model
Create a Project
25
Whether you are on Windows or Linux, just get a terminal or a cmd prompt and navigate
to the place you want your project to be created, then use this code −
myproject/
manage.py
myproject/
__init__.py
settings.py
urls.py
wsgi.py
The “myproject” folder is just your project container, it actually contains two elements −
manage.py − This file is kind of your project local Django-admin for interacting with your
project via command line (start the development server, sync db...). To get a full list of
command accessible via manage.py you can use the code −
The “myproject” subfolder − This folder is the actual python package of your project. It
contains four files −
urls.py − All links of your project and the function to call. A kind of ToC of your project.
26
Setting Up Your Project
Your project is set up in the subfolder myproject/settings.py. Following are some important
options you might need to set −
DEBUG = True
This option lets you set if your project is in debug mode or not. Debug mode lets you get
more information about your project's error. Never set it to ‘True’ for a live project.
However, this has to be set to ‘True’ if you want the Django light server to serve static
files. Do it only in the development mode.
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3',
'NAME': 'database.sql',
'USER': '',
'PASSWORD': '',
'HOST': '',
'PORT': '',
The database is set in the ‘Database’ dictionary. The example above is for SQLite engine.
As stated earlier, Django also supports -
MySQL (django.db.backends.mysql)
27
PostGreSQL (django.db.backends.postgresql_psycopg2)
MongoDB (django_mongodb_engine)
Before setting up any new engine, make sure you have the correct db driver installed.
INSTALLED_APPS = (
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'myapp',
28
CHAPTER 8
IMPLEMENTATION
8.1 MODULES
➢ Service Provider
➢ View and Authorize Users
➢ Remote User
In this module, the admin can view the list of users who all registered. In this,
the admin can view the user’s details such as, username, email, address and admin
authorize the users.
29
8.2 ALGORITHMS
While the Naive Bayes classifier is widely used in the research world, it is not
widespread among practitioners which want to obtain usable results. On the one hand, the
researchers found it is especially easy to program and implement it, its parameters are easy
to estimate, learning is very fast even on very large databases, its accuracy is reasonably
good in comparison to the other approaches. On the other hand, the final users do not obtain
a model that is easy to interpret and deploy, they do not understand the interest of such a
technique.
Thus, we introduce in a new presentation the results of the learning process. The
classifier is easier to understand, and its deployment is also made easier. In the first part of
this tutorial, we present some theoretical aspects of the naive bayes classifier. Then, we
implement the approach on a dataset with Tanagra. We compare the obtained results (the
parameters of the model) to those obtained with other linear approaches such as the logistic
regression, the linear discriminant analysis and the linear SVM. We note that the results
are highly consistent. This largely explains the good performance of the method in
comparison to others. In the second part, we use various tools on the same dataset.
30
8.2.2 SVM
In classification tasks a discriminant machine learning technique aims at finding, based on
an independent and identically distributed (iid) training dataset, a discriminant function
that can correctly predict labels for newly acquired instances. Unlike generative machine
learning approaches, which require computations of conditional probability distributions,
a discriminant classification function takes a data point x and assigns it to one of the
different classes that are a part of the classification task. Less powerful than generative
approaches, which are mostly used when prediction involves outlier detection, discriminant
approaches require fewer computational resources and less training data, especially for a
multidimensional feature space and when only posterior probabilities are needed. From a
geometric perspective, learning a classifier is equivalent to finding the equation for a
multidimensional surface that best separates the different classes in the feature space.
31
Logistic regression competes with discriminant analysis as a method for
analyzing categorical-response variables. Many statisticians feel that logistic regression is
more versatile and better suited for modeling most situations than is discriminant analysis.
This is because logistic regression does not assume that the independent variables are
normally distributed, as discriminant analysis does.
Decision tree classifiers are used successfully in many diverse areas. Their most important
feature is the capability of capturing descriptive decision-making knowledge from the
supplied data. Decision tree can be generated from training sets. The procedure for such
generation based on the set of objects (S), each belonging to one of the classes C1, C2, …,
Ck is as follows:
Step 1. If all the objects in S belong to the same class, for example Ci, the decision tree for
S consists of a leaf labeled with this class.
Step 2. Otherwise, let T be some test with possible outcomes O1, O2…, On. Each object
in S has one outcome for T so the test partitions S into subsets S1, S2… Sn where each
object in Si has outcome Oi for T. T becomes the root of the decision tree and for each
outcome Oi we build a subsidiary decision tree by invoking the same procedure recursively
on the set Si.
32
8.3 INPUT AND OUTPUT DESIGN
INPUT DESIGN
Input Design plays a vital role in the life cycle of software development, it requires very
careful attention from developers. The input design is to feed data to the application as
accurately as possible. So, inputs are supposed to be designed effectively so that the errors
occurring while feeding are minimized. According to Software Engineering Concepts, the
input forms or screens are designed to provide to have a validation control over the input
limit, range and other related validations.
This system has input screens in almost all the modules. Error messages are
developed to alert the user whenever he commits some mistakes and guides him in the right
way so that invalid entries are not made. Let us look deeply into this module design.
Input design is the process of converting the user created input into a computer-
based format. The goal of the input design is to make the data entry logical and free from
errors. The error is in the input are controlled by the input design. The application has been
developed in a user-friendly manner. The forms have been designed in such a way that
during the processing the cursor is placed in the position where must be entered. The user
is also provided with an option to select an appropriate input from various alternatives
related to the field in certain cases.
Validations are required for each data entered. Whenever a user enters an erroneous
data, an error message is displayed, and the user can move on to the subsequent pages after
completing all the entries in the current page.
33
OUTPUT DESIGN
The Output from the computer is required to mainly create an efficient method of
communication within the company primarily among the project leader and his team
members, in other words, the administrator and the clients.
The output of VPN is the system which allows the project leader to manage his
clients in terms of creating new clients and assigning new projects to them, maintaining a
record of the project validity and providing folder level access to each client on the user
side depending on the projects allotted to him. After completion of a project, a new project
may be assigned to the client. User authentication procedures are maintained at the initial
stages itself. A new user may be created by the administrator himself or a user can himself
register as a new user but the task of assigning projects and validating a new user rest with
the administrator only.
The application starts running when it is executed for the first time. The server
has to be started and then the internet explorer is used as the browser. The project will run
on the local area network so the server machine will serve as the administrator while the
other connected systems can act as the clients. The developed system is highly user friendly
and can be easily understood by anyone using it even for the first time.
34
CHAPTER 9
SAMPLE OUTPUT
35
9.3 ALGORITHM ACCURACY:
36
9.5 TRAINED SET ACCURACY RESULTS
37
9.7 LINE CHART OF MESSAGE PREDICTION TYPE
38
9.9 USER INTERFACE
39
CHAPTER 10
SYSTEM TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, subassemblies, assemblies and/or a finished product It is the
process of exercising software with the intent of ensuring that the Software system meets
its requirements and user expectations and does not fail in an unacceptable manner. There
are various types of test. Each test type addresses a specific testing requirement.
40
10.1.3 Functional test:
41
10.2 User Acceptance Testing
User Acceptance of a system is the key factor for the success of any system. The
system under consideration is tested for user acceptance by constantly keeping in touch
with the prospective system users at the time of development and making changes wherever
required. The system developed provides a friendly user interface that can easily be
understood even by a person who is new to the system.
After performing the validation testing, the next step is output testing of the
proposed system, since no system could be useful if it does not produce the required output
in the specified format. Asking the users about the format required by them tests the outputs
generated or displayed by the system under consideration. Hence the output format is
considered in 2 ways – one is on screen and another in printed format.
.
10.3 TESTING STRATEGY :
A strategy for system testing integrates system test cases and design techniques
into a well-planned series of steps that results in the successful construction of software.
The testing strategy must co-operate test planning, test case design, test execution, and the
resultant data collection and evaluation .A strategy for software testing must accommodate
low-level tests that are necessary to verify that a small source code segment has been
correctly implemented as well as high level tests that validate major system functions
against user requirements.
42
CHAPTER 11
CONCLUSION
Although the digital revolution and the rise of social media enabled great advances in
communication platforms and social interactions, a wider proliferation of harmful behavior
known as bullying has also emerged. This paper presents a novel framework of Bully Net
to identify bully users from the Twitter social network. We performed extensive research
on mining signed networks for better understanding of the relationships between users in
social media, to build a signed network (SN) based on bullying tendencies. We observed
that by constructing conversations based on the context as well as content, we could
effectively identify the emotions and the behavior behind bullying. In our experimental
study, the evaluation of our proposed centrality measures to detect bullies from signed
network, we achieved around 80% accuracy with 81% precision in identifying bullies for
various cases.
There are still several open questions deserving further investigation. First, our
approach focuses on extracting emotions and behavior from texts and emojis in tweets.
However, it would be interesting to investigate images and videos, given that many users
use them to bully others. Second, it does not distinguish between bullying and aggressive
users. Devising new algorithms or techniques to distinguish bullies from aggressors would
prove critical in better identification of cyber bullies. Another topic of interest would be to
study the relationship between conversation graph dynamics and geographic location and
how these dynamics are affected by the geographic dispersion of the users? Does proximity
increase bullying behavior?
43
REFERENCES
[1] J. Tang, C. Aggarwal, and H. Liu, “Recommendations in signed social. networks,” in
Proceedings of the International Conference on WWW,2016, pp. 31–40.
[2] D. Liben-Nowell and J. Kleinberg, “The link-prediction problem for social networks,”
Proceedings of the ASIS&T, vol. 58, no. 7, pp. 1019– 1031, 2007.
[3] U. Brandes and D. Wagner, “Analysis and visualization of social networks,” in Graph
drawing software, 2004, pp. 321–340.
[4] X. Hu, J. Tang, H. Gao, and H. Liu, “Social spammer detection with sentiment
information,” In Proceedings of IEEE ICDM, pp. 180––189,2014.
[5] E. E. Buckels, P. D. Trapnell, and D. L. Paulhus, Trolls just want to have fun, 2014, pp.
67:97–102.
[6] S. Kumar, F. Spezzano, and V. Subrahmanian, “Accurately detecting. trolls in Slashdot
zoo via decluttering,” in Proceedings of IEEE/ACMASONAM, 2014, pp. 188–195.
[7] J. W. Patchin and S. Hinduja, “2016 cyberbullying data,” 2017.
[8] C. R. Center, “https://cyberbullying.org/bullying-laws.”
[9] D. Cartwright and F. Harary, “Structural balance: a generalization of Heider’s theory.”
Psychological review, vol. 63, no. 5, p. 277, 1956.
[10] J. Leskovec, D. Huttenlocher, and J. Kleinberg, “Signed networks in social media,” in
Proceedings of the SIGCHI CHI, 2010, pp. 1361–1370.
[11] R. Plutchik, “A general psych evolutionary theory of emotion,” in Theories of
emotion, 1980, pp. 3–33.
[12] W. Medhat, A. Hassan, and H. Korashy, “Sentiment analysis algorithms and
applications: A survey,” Proceedings of the Ain Shams engineering journal, vol. 5, no. 4,
pp. 1093–1113, 2014.
[13] L. Tang and H. Liu, “Community detection and mining in social media, “Synthesis
lectures on data mining and knowledge discovery, vol. 2, no. 1,pp. 1–137, 2010.
[14] S. Bhagat, G. Cormode, and S. Muthukrishnan, “Node classification in social
networks,” in Social network data analytics, 2011, pp. 115–148.
[15] J. Tang, Y. Chang, C. Aggarwal, and H. Liu, “A survey of signed network mining in
social media,” In Proceedings of the ACM Comput. Surv., no. 3, pp. 42:1–42:37, 2016.
[16] J. Kunegis, J. Preusse, and F. Schwagereit, “What is the added value of negative links
in online social networks?” in Proceedings of the International Conference on WWW,
2013, pp. 727–736.
[17] Z. Wu, C. C. Aggarwal, and J. Sun, “The troll-trust model for ranking in signed
networks,” in Proceedings of the ACM International Conference on WSDM, 2016, pp.
447–456.
[18] R. Zhao, A. Zhou, and K. Mao, “Automatic detection of cyberbullying on social
networks based on bullying features,” in Proceedings of the ICDCN, 2016.
[19] V. K. Singh, Q. Huang, and P. K. Atrey, “Cyberbullying detection using. socio-textual
information fusion,” In Proceedings of the IEEE/ACM ASONAM, pp. 884—-887, 2016.
[20] H. Hosseinmardi, S. A. Mattson, R. I. Rafiq, R. Han, Q. Lv, and S. Mishra, “Detection
of cyberbullying incidents on the Instagram social network,” In Proceedings of the CoRR,
2015.
[21] J.-M. Xu, X. Zhu, and A. Bellmore, “Fast learning for sentiment analysis on bullying,”
in Proceedings of the First International WISDOM, 2012, pp. 10:1–10:6.
44
[23] P. Galan-Garcia, J. De La Puerta, C. Gómez, I. Santos, and P. Bringas, “Supervised
machine learning for the detection of troll profiles in twitter social network: Application to
a real case of cyberbullying,” vol. 24, pp. 42–53, 2014.
[24] D. Chatzakou, N. Kourtellis, J. Blackburn, E. De Cristofaro, G.Stringhini,and A.
Vakali, “Mean birds: Detecting aggression and bullying on twitter,” in Proceedings of the
ACM on WebSci, 2017, pp. 13–22.
[25] L. Cheng, J. Li, Y. N. Silva, D. L. Hall, and H. Liu, “Xbully: Cyberbullying detection
within a multi-modal context,” in Proceedings of the Twelfth ACM International
Conference on Web Search and Data Mining, 2019, pp. 339–347.
45