
SURAKSHA

An Anti-Cyberbullying System for Women's Security with Mental Health Services

Capstone Project Proposal


Submitted by:

Navya Mehta 101815071

Himmat Singh Chahal 101803160

Shubhanshu Khirwar 101803519

Manasvee Bhatia 101803675

BE Third Year- COE

CPG No. 176

Under the Mentorship of

Dr. Sushma Jain

Computer Science and Engineering Department

Thapar Institute of Engineering and Technology, Patiala

March, 2021
TABLE OF CONTENTS

• Mentor Consent Form
• Project Overview
• Need Analysis
• Literature Survey
• Objectives
• Methodology
• Work Plan
• Project Outcomes & Individual Roles
• Course Subjects
• References

Mentor Consent Form
I hereby agree to be the mentor of the following Capstone Project Team

SURAKSHA

Roll No Name Signatures

101815071 Navya Mehta

101803160 Himmat Singh Chahal

101803519 Shubhanshu Khirwar

101803675 Manasvee Bhatia

NAME of Mentor: Dr. Sushma Jain

SIGNATURE of Mentor:

Project Overview

Cyberbullying is the use of technology as a medium to bully someone. Although it has been an
issue for many years, the recognition of its impact on women has recently increased. The advent of
social media, particularly Twitter, raises many issues due to a misunderstanding regarding the
concept of freedom of speech. Social networking sites provide a fertile medium for bullies, and
teens and young adults who use these sites are vulnerable to attacks.

Therefore, detection of cyberbullying without the involvement of the victims is necessary. Through
machine learning, we can detect language patterns used by bullies and their victims, and develop
rules to automatically detect cyberbullying content. Our project will use a dataset that
contains a high percentage of bullying content. The labelled data will be used, in conjunction with
machine learning techniques, to train a computer to recognize bullying content.

Our focus in this project is to make social media safer for women and to provide mental health
services to those who are affected by cyberbullying. We are going to design a web app that will
detect harsh comments and modify them before the user sees them. Mental health services will
also be offered through a chatbot, which the user can also use to report the bully with an
automated message.

Need Analysis

According to an international study carried out by the children's rights organization Plan
International, 58% of the girls and young women surveyed reported personal experiences of online
harassment on social media platforms, and this abuse is driving girls to quit social media platforms.
Of the girls who have been harassed very frequently, 19% said they use the social media platform
less as a result and 12% stopped using it altogether. In one year alone, cyberbullying of Indian
women and teenagers rose by 36%. Nearly half of the girls targeted had been threatened with
physical or sexual violence, according to the poll. Many said the abuse took a mental toll, and a
quarter felt physically unsafe.

One in five girls and young women has abandoned or cut down on using a social media platform
after being targeted, with some saying harassment started when they were as young as 8. Driving
girls out of online spaces is hugely disempowering in an increasingly digital world, and damages
their ability to be seen, heard and become leaders.

The abuse was suppressing girls' voices at a time when the COVID-19 pandemic was increasing
the importance of communicating online. The report called on social media companies to take urgent
action to address the issue and urged governments to pass laws to deal with online harassment. As
this global pandemic moves our lives online, women are more at risk than ever.

Women often don’t seek help for cyberbullying because they feel embarrassed or ashamed to be a
target, and they don’t want to be seen as losing even more social status. They feel that it’s their
responsibility to deal with it, and they don’t recognize it as bullying or as something serious.

In addition, cyberbullying has been associated with adverse mental health effects, including
depression, anxiety, self-harm, suicidal thoughts, attempted suicide, and social and emotional
difficulties.

So, we need an efficient anti-cyberbullying system to prevent this from happening. A web app
that modifies comments beforehand will help to halt cyberbullying. Moreover, if a comment is
changed before the user reads it, it will not affect the user's mental health. Our chatbot, with its
automated message, will make it easier for victims to report the bully. The chatbot will also provide
mental health assistance, which is essential if the user is struggling during this whole process.

Literature Survey

The existing research papers on anti-cyberbullying systems are summarized below:

• Cyberbullying Detection System on Twitter

In this paper, the Cyberbullying Detection System on Twitter is implemented in PHP and HTML
with MySQL and the Twitter API. The system detects cyberbullying-related tweets that match
keywords stored in the database. Through the Twitter API, it retrieves tweets that match the
identified cyberbullying keywords, together with the users involved (cyberbullies and/or
victims). The system first fetches the matching tweets and the information about the users
involved from the database. The results are then sent to the User Interface (UI), which
displays the users' information and the tweets themselves. Thereafter, the system's users
(NGOs) can interact with the users involved by giving advice, warnings, or counselling, and
can monitor cyberbullying activities. They can also view a tweet's location on a map, which
allows them to identify the location more precisely. This helps in generating the topography
and statistics of cyberbullying events for a specific location. [1]

• Modeling the Detection of Textual Cyberbullying

From this paper we gained insight into how to decompose the overall detection problem into
the detection of sensitive topics, which lends itself to text classification sub-problems. The
authors experimented with a corpus of 4500 YouTube comments, applying a range of binary
and multiclass classifiers. They found that binary classifiers for individual labels
outperform multiclass classifiers. Their findings show that the detection of textual
cyberbullying can be tackled by building individual topic-sensitive classifiers. [2]

• Cyberbullying Detection by Using Artificial Neural Network Models

In this research paper, the authors designed eight different artificial neural network models
to automatically detect cyberbullying in Turkish social media content. These models, along
with various text mining techniques, were tested on a dataset of 3000 Turkish tweets.
According to the evaluation results, the most successful model for predicting cyberbullying
content is YSA2, with a 91% F-measure score. Additionally, YSA2 performs better than the
machine learning classifiers tested in the authors' previous study. [3]

• Cyberbullying Detection: A Survey on Multilingual Techniques

This research surveys cyberbullying detection across languages, including the state of
Arabic-language cyberbullying detection up to the time of writing. Many techniques are
utilized in the area of cyberbullying detection, mainly Machine Learning (ML) and Natural
Language Processing (NLP). The paper presents a brief background on cyberbullying and the
technologies used in this field, an extensive survey of the techniques and advancements in
multilingual cyberbullying detection, and finally a plan for a solution to the problem of
Arabic cyberbullying. [4]

• Cyberbullying Detection using Pre-Trained BERT Model

Google researchers recently developed a language model called BERT, which is capable of
generating contextual embeddings and can also produce task-specific embeddings for
classification. This paper proposes a new approach to cyberbullying detection on social
media platforms that uses the pre-trained BERT model with a single linear neural network
layer on top as a classifier, which improves on existing results. The model is trained and
evaluated on two social media datasets, one small and the other relatively larger. [5]

• Online Social Network Bullying Detection Using Intelligence Techniques


In this research paper, an effective method to detect cyberbullying activities on social media
is proposed. The detection method can identify the presence of cyberbullying terms and
classify cyberbullying activities in social networks, such as Flaming, Harassment, Racism
and Terrorism, using fuzzy logic and a genetic algorithm. The effectiveness of the system is
increased by using a fuzzy rule set to retrieve relevant data for classification from the
input. [6]

• A Comprehensive Study on Cyberbullying Detection Using Machine Learning Approach

Automatically identifying bully words, emojis and audio/video features on online social
platforms, especially micro-blogging sites such as Twitter and video-sharing platforms such
as YouTube, is an important research area. This paper presents a collective and structured
study that explores and brings together research done in the field of cyberbullying
detection, and it also sets out the research gaps. The study provides a comprehensive,
systematic literature review of the strategies proposed for text-based and video-based
cyberbullying detection. [7]

• Using Machine Learning to Detect Cyberbullying

In this research paper, machine learning is used to detect language patterns used by bullies
and their victims and to develop rules to automatically detect cyberbullying content. The
data was collected from the website Formspring.me, a question-and-answer formatted website
that contains a high percentage of bullying content, and was labeled using a web service,
Amazon’s Mechanical Turk. The labeled data was used, in conjunction with machine learning
techniques provided by the Weka tool kit, to train a computer to recognize bullying content.
Both a C4.5 decision tree learner and an instance-based learner were able to identify the
true positives with 78.5% accuracy. [8]

• Cyberbullying severity detection: A Machine Learning Approach


In this research paper, the authors developed a supervised machine learning solution for
cyberbullying detection and multi-class categorization of its severity on Twitter. The study
applied Embedding, Sentiment, and Lexicon features along with PMI-semantic orientation.
The extracted features were used with Naïve Bayes, KNN, Decision Tree, Random Forest, and
Support Vector Machine algorithms. Results from experiments with the proposed framework
in a multi-class setting are promising with respect to Kappa, classifier accuracy and
F-measure metrics, as well as in a binary setting. These results indicate that the proposed
framework provides a feasible solution for detecting cyberbullying behavior and its severity
in online social networks. Finally, the authors compared the results of the proposed and
baseline features with other machine learning algorithms. The comparison indicates the
significance of the proposed features in cyberbullying detection. [9]

• Bully Identification with Machine Learning Algorithms

This research paper collected surveys from school and college students and, on the basis of
that information, informed the concerned authorities so that they could think of ways to
eradicate bullying. The paper applies data mining techniques to the survey results to
convert them into knowledge, following a five-step process of Data Selection,
Pre-processing/Cleaning, Transformation, Data Mining and Interpretation/Evaluation. Along
with this, the paper uses three approaches: Internal Labelling, Synthetic Labelling and Data
Programming. To identify the data patterns effectively, suitable machine learning
algorithms were also used. [10]

Objectives
1. To design a website that will detect the severity of harsh comments on the social media app
Twitter, using a machine learning model trained on our dataset.
2. To modify the harsh comments before the user sees them.
3. To provide mental health facilities through a chatbot, through which an appointment can be
fixed with a counsellor.
4. To report the bully.

Methodology

1. Data Collection step


The dataset will be collected from the internet. It is already categorized into different topics of
harassment content: i) sexual, ii) racial, iii) appearance-related, iv) intelligence, and v) political.
For severity assessment, we will categorize the annotated cyberbullying tweets into four levels:
low, medium, high and non-cyberbullying. Sexual and appearance-related tweets will be treated as
high-level cyberbullying severity, political and racial tweets as medium-level, and intelligence
tweets as low-level. All tweets labelled ‘non-cyberbullying’ in each category will be consolidated
into a single non-cyberbullying category.
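
The severity mapping described above can be sketched as a simple lookup. This is only an illustration: the label strings and the function name `severity_label` are assumptions, since the proposal lists the topics and levels but not how they are encoded in the dataset.

```python
# Hypothetical label strings; the proposal names the five topics and the
# four severity levels but does not fix their exact spelling in the data.
TOPIC_TO_SEVERITY = {
    "sexual": "high",
    "appearance": "high",
    "political": "medium",
    "racial": "medium",
    "intelligence": "low",
}


def severity_label(topic: str, is_bullying: bool) -> str:
    """Map one annotated tweet to one of the four severity levels."""
    if not is_bullying:
        # Non-bullying tweets from every topic share a single category.
        return "non-cyberbullying"
    return TOPIC_TO_SEVERITY[topic]


# Example: a few (topic, is_bullying) annotations.
annotations = [("sexual", True), ("racial", True), ("intelligence", True), ("political", False)]
levels = [severity_label(topic, flag) for topic, flag in annotations]
```

Keeping the mapping in one dictionary makes it easy to revise the severity assigned to a topic without touching the rest of the pipeline.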

2. Pre-processing step


This step is also known as the data cleaning step. Tweets will be converted to lower case to avoid
sparsity issues, repeated letters will be reduced, and URLs and @user mentions will be standardized
to remove noise from the tweets. Tokenization will be applied with a Twitter-specific tokenizer
based on the CMU TweetNLP library. Tokenization is the process of breaking a text corpus up into
words, phrases, or other meaningful elements, which are then called tokens. Finally, stop-word
removal and stemming will be performed before feature extraction.
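
A minimal sketch of these cleaning steps is given below, using only the Python standard library. It does not use the CMU TweetNLP tokenizer: the regular-expression tokenizer, the tiny stop-word list, the `<url>` and `<user>` placeholders and the `crude_stem` helper are all simplifying assumptions that stand in for the real tools.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "of", "so"}


def crude_stem(token: str) -> str:
    """Rough suffix stripping, standing in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet: str) -> List[str]:
    """Lower-case, standardize URLs and mentions, squeeze repeats, tokenize, filter, stem."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", "<url>", text)   # standardize URLs
    text = re.sub(r"@\w+", "<user>", text)          # standardize @user mentions
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)      # "sooooo" -> "soo"
    tokens = re.findall(r"<url>|<user>|[a-z0-9']+", text)
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


tokens = preprocess("@Anna You are sooooo STUPID!!! see https://t.co/abc")
```

Replacing URLs and mentions with fixed placeholders keeps the information that a link or a user was present while preventing every distinct URL or username from becoming its own feature.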

3. Data Validation
In this approach, text will be represented by a set of words, and each word is treated as an
independent feature. We will apply part-of-speech (POS) tagging with a Twitter-specific tagger
based on the CMU TweetNLP library for word sense disambiguation. The POS tagger will assign a
part-of-speech tag (for instance, noun, verb or adjective) to each word of the given text, in the
form of (word, tag) tuples.
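
The bag-of-words representation mentioned above can be illustrated as follows. This is a sketch only: the helper names `build_vocabulary` and `bag_of_words` are hypothetical, and POS tagging is left out because it needs a trained tagger such as the one in CMU TweetNLP.

```python
from collections import Counter
from typing import Dict, List


def build_vocabulary(docs: List[List[str]]) -> Dict[str, int]:
    """Assign a column index to every distinct token, in sorted order."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Count vector for one document; each word is an independent feature."""
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]


docs = [["you", "are", "ugly"], ["you", "are", "kind", "kind"]]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(doc, vocab) for doc in docs]
```

Word order is discarded in this representation, which is why the text describes each word as an independent feature.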

4. Classification
Machine learning classification will be done using a Naïve Bayes classifier. Features will be
weighted with TF-IDF, and the model will be validated using 10-fold cross-validation. The classifier
will assign content that contains bullying to the bullying class and content that does not to the
negative class. Bullying content will further be grouped by type; for example, cyberbullying that
is related to psychology will be placed in the class related to psychology, and so on.
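
The classification step can be sketched in plain Python as below. Several things here are assumptions rather than the project's actual design: the proposal does not say which Naïve Bayes variant will be used, so this sketch uses a multinomial model over TF-IDF weights with add-one smoothing; the toy documents, the class names and all function names are hypothetical; and the example uses 4 folds only because the toy set has 8 documents, whereas the proposal calls for 10.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Doc = List[str]
Model = Dict[str, Tuple[float, Dict[str, float]]]


def fit_idf(docs: List[Doc]) -> Dict[str, float]:
    """Smoothed inverse document frequency for each training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc: Doc, idf: Dict[str, float]) -> Dict[str, float]:
    """TF-IDF weights of one document; terms unseen in training are ignored."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}


def train_nb(docs: List[Doc], labels: List[str]) -> Tuple[Dict[str, float], Model]:
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""
    idf = fit_idf(docs)
    weights = defaultdict(lambda: defaultdict(float))  # class -> term -> summed weight
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weights[label][term] += w
    model: Model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(weights[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_lik = {t: math.log((weights[label][t] + 1.0) / (total + len(idf))) for t in idf}
        model[label] = (log_prior, log_lik)
    return idf, model


def predict(doc: Doc, idf: Dict[str, float], model: Model) -> str:
    """Return the class with the highest log-posterior score."""
    feats = tfidf(doc, idf)
    scores = {label: prior + sum(w * lik[t] for t, w in feats.items())
              for label, (prior, lik) in model.items()}
    return max(scores, key=scores.get)


def cross_validate(docs: List[Doc], labels: List[str], k: int = 10) -> float:
    """Mean accuracy over k folds; sample i is held out in fold i % k."""
    accuracies = []
    for fold in range(k):
        train = [i for i in range(len(docs)) if i % k != fold]
        test = [i for i in range(len(docs)) if i % k == fold]
        idf, model = train_nb([docs[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(docs[i], idf, model) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


# Toy, hand-written data for illustration only.
DOCS = [
    ["you", "are", "ugly"], ["you", "are", "stupid"],
    ["ugly", "stupid", "loser"], ["nobody", "likes", "you", "loser"],
    ["have", "a", "nice", "day"], ["you", "are", "kind"],
    ["great", "photo", "nice", "smile"], ["love", "this", "photo"],
]
LABELS = ["bullying"] * 4 + ["negative"] * 4

idf, model = train_nb(DOCS, LABELS)
prediction = predict(["stupid", "ugly", "loser"], idf, model)
mean_accuracy = cross_validate(DOCS, LABELS, k=4)
```

The IDF values are refitted inside every fold so that the held-out documents never influence the weights used to classify them.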

5. Modification
Once the above process has been executed, harsh words will be masked with asterisks (*).
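
A minimal sketch of the masking step follows. The word list `HARSH_WORDS` and the function name `mask_harsh_words` are hypothetical; in the proposed system the flagged words would come from the classifier rather than from a fixed list.

```python
import re

# Hypothetical list of flagged words, used here only for illustration.
HARSH_WORDS = {"stupid", "ugly", "loser"}


def mask_harsh_words(comment: str, harsh_words=HARSH_WORDS) -> str:
    """Replace every flagged word with asterisks of the same length."""
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if word.lower() in harsh_words else word

    return re.sub(r"[A-Za-z']+", mask, comment)


masked = mask_harsh_words("You are so STUPID and ugly!")
```

Masking each word with the same number of asterisks keeps the length and punctuation of the comment unchanged, so the rest of the message stays readable.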

6. Chatbot
When cyberbullying is detected, the chatbot will pop up. By answering a few questions, the user
can seek help, i.e. an appointment can be fixed with a counsellor.

Work Plan

Project Outcomes & Individual Roles
Outcomes:
• An effective anti-cyberbullying system, i.e. a website that will detect harsh comments
and modify them before the user sees them.
• Mental health facilities will be provided through a chatbot: by answering a few
questions, the user can fix an appointment with a counsellor, as this project focuses on
making social media safer for women.
• The user will also be able to report the bully.

Individual Roles:
1. Navya will work on the backend and will implement the project through machine
learning (NLP).
2. Himmat will work on the backend of the website and on designing the chatbot, and will
also implement machine learning in the project.
3. Shubhanshu will work on the frontend, designing the website with a proper GUI, and will
establish connections for mental health services.
4. Manasvee will also work on the frontend, and will help in deploying the project and
increasing its accuracy.

Course Subjects
• NLP in Machine Learning
• Web Development with proper GUI
• Python language

References
1. Amgad Muneer and Suliman Mohamed Fati. 9 October 2020; Accepted: 20 October 2020;
Published: 29 October 2020.
2. Karthik Dinakar, Roi Reichart and Henry Lieberman, "Modeling the Detection of Textual
Cyberbullying."
3. A. Bozyiğit, S. Utku and E. Nasiboğlu, "Cyberbullying Detection by Using Artificial Neural
Network Models," 2019 4th International Conference on Computer Science and Engineering
(UBMK), Samsun, Turkey, 2019, pp. 520-524, doi: 10.1109/UBMK.2019.8907118.
4. B. Haidar, M. Chamoun and F. Yamout, "Cyberbullying Detection: A Survey on Multilingual
Techniques," 2016 European Modelling Symposium (EMS), Pisa, Italy, 2016, pp. 165-171, doi:
10.1109/EMS.2016.037.
5. J. Yadav, D. Kumar and D. Chauhan, "Cyberbullying Detection using Pre-Trained BERT
Model," 2020 International Conference on Electronics and Sustainable Communication Systems
(ICESC), Coimbatore, India, 2020, pp. 1096-1100, doi: 10.1109/ICESC48915.2020.9155700.
6. B. Sri Nandhini and J. I. Sheeba, Department of Computer Science and Engineering,
Pondicherry Engineering College, Pondicherry-605014.
7. Vaishali Malpe (Research Scholar) and Shubhangi Vaikole (Research Guide), Computer
Engineering Department, Datta Meghe College of Engineering, University of Mumbai, India.
8. Edwards, A. Kontostathis and K. Reynolds, "Using Machine Learning to Detect Cyberbullying,"
in Machine Learning and Applications, Fourth International Conference on, Honolulu, Hawaii
USA, 2011, pp. 241-244, doi: 10.1109/ICMLA.2011.152.
9. Talpur BA, O'Sullivan D. Cyberbullying severity detection: A machine learning approach. PLoS
One. 2020 Oct 27;15(10):e0240924. doi: 10.1371/journal.pone.0240924. PMID: 33108392;
PMCID: PMC7591033.
10. Masoom Patel, Pranav Sharma, Aswathy K Cherian. BULLY IDENTIFICATION WITH
MACHINE LEARNING ALGORITHMS. JCR. 2020; 7(6): 417-425. doi:10.31838/jcr.07.06.

THE END
