ZEROTH REVIEW
PROJECT MEMBERS:
ROSHINI T S
LALITHAVANI V
SERAPHIN JAYANTHI SANTHAMARY S
LITERATURE REVIEW:
Data Preprocessing:
Tokenization: Dividing a string of text into its component words.
Stop word removal: Removing common words such as “of”, “by”, and “on”.
Case conversion: Converting capital letters to lowercase.
Special symbol removal: Removing characters such as “!” and “@”.
Feature Extraction: Extracting the text features on which classification is based.
Output data: Assigning labels to the classified words.
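The preprocessing steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function name `preprocess`, the tiny stop-word list, and the order of the steps are assumptions. Note that this sketch also strips emoji, which a real system for this project would need to keep or map to text.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"of", "by", "on", "the", "a", "an", "is", "to"}

def preprocess(text):
    """Return the cleaned tokens of a post."""
    lowered = text.lower()                            # case conversion
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)    # special symbol removal
    tokens = cleaned.split()                          # tokenization on whitespace
    return [t for t in tokens if t not in STOP_WORDS] # stop word removal

tokens = preprocess("You are SO dumb!! Get out of the chat @user")
print(tokens)
# ['you', 'are', 'so', 'dumb', 'get', 'out', 'chat', 'user']
```

Case conversion runs first here so that capitalised stop words such as "Of" are still matched by the lowercase list.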
LIMITATIONS:
Although it addresses some limitations of previous work, it still falls short in some areas:
Certain data collection methods limit the prediction model to specified keywords.
The ratio of bullied to non-bullied posts is highly skewed, which leads to an imbalanced
class distribution in the dataset.
New slang, including new words, spellings, and acronyms, demands regular updates to
the training dataset.
Social media platforms do not restrict bullying content before it is posted.
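To make the class imbalance limitation concrete, the sketch below computes per-class weights that are inversely proportional to class frequency, using the common heuristic n_samples / (n_classes * class_count). The review does not say how imbalance is handled, so this is only one possible mitigation; the function name and the 90/10 split are made up for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical dataset: 90 non-bullied posts and 10 bullied posts.
labels = ["non-bullied"] * 90 + ["bullied"] * 10
weights = balanced_class_weights(labels)
print(weights["bullied"])  # 5.0
```

With such weights, an error on a rare bullied post costs more during training than an error on a common non-bullied post.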
The objective of this project is to classify posts containing text and emoji as bullied or
non-bullied using a Support Vector Machine (SVM), and to detect phishing links shared with
social media users using a Convolutional Neural Network (CNN). Phishing links are used to
capture a user’s identity; to prevent this, the user can choose to block the sender.
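As a rough illustration of the text classification step, the sketch below trains a linear SVM on bag-of-words counts by subgradient descent on the L2-regularised hinge loss. The review does not specify the kernel, features, or training procedure, so the linear model, the hyperparameters, the function names, and the four toy posts are all assumptions made for this example.

```python
def featurize(tokens, index):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    x = [0.0] * len(index)
    for tok in tokens:
        if tok in index:
            x[index[tok]] += 1.0
    return x

def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """samples: list of token lists; labels: +1 (bullied) or -1 (non-bullied)."""
    vocab = sorted({tok for s in samples for tok in s})
    index = {tok: i for i, tok in enumerate(vocab)}
    w, b = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        for tokens, y in zip(samples, labels):
            x = featurize(tokens, index)
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            violated = margin < 1.0  # hinge loss is active only inside the margin
            for i, xi in enumerate(x):
                grad = lam * w[i] - (y * xi if violated else 0.0)
                w[i] -= lr * grad
            if violated:
                b += lr * y
    return index, w, b

def predict(model, tokens):
    index, w, b = model
    x = featurize(tokens, index)
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "bullied" if score > 0 else "non-bullied"

# Toy, hand-written training posts (already preprocessed into tokens).
samples = [["you", "are", "stupid"], ["nobody", "likes", "you"],
           ["great", "game", "today"], ["see", "you", "at", "lunch"]]
labels = [1, 1, -1, -1]
model = train_linear_svm(samples, labels)
print(predict(model, ["you", "are", "so", "stupid"]))
```

A production system would more likely use an established SVM library and a much larger labelled dataset; the loop above only shows the shape of the computation.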
Example:
HASHTAGS: #Chatroom
LINKS: https://iplogger.org/
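Separating hashtags and links from a post, as in the example above, could be done with two regular expressions. This is a minimal sketch: the patterns, the function name, and the surrounding words of the sample post are assumptions, while the hashtag and the link are taken from the example.

```python
import re

HASHTAG_RE = re.compile(r"#\w+")
LINK_RE = re.compile(r"https?://\S+")

def extract_hashtags_and_links(post):
    """Return the hashtags and links found in a post."""
    links = LINK_RE.findall(post)
    # Remove links first so URL fragments such as "#section" are not counted as hashtags.
    text_only = LINK_RE.sub(" ", post)
    hashtags = HASHTAG_RE.findall(text_only)
    return {"hashtags": hashtags, "links": links}

found = extract_hashtags_and_links("Join us in #Chatroom now https://iplogger.org/")
print(found)
# {'hashtags': ['#Chatroom'], 'links': ['https://iplogger.org/']}
```

The extracted links would then go to the link classifier and the remaining text to the text classifier.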
PROBLEM DEFINITION:
Cyberbullying can occur through SMS, text, and apps, or online in social media, forums, or
gaming where people can view, participate in, or share content. Cyberbullying includes
sending, posting, or sharing negative, harmful, false, or mean content about someone else. It
can include sharing personal or private information about someone else causing
embarrassment or humiliation. Some cyberbullying crosses the line into unlawful or criminal
behavior. Recent research studies have revealed that cyberbullying and online harassment are
considerable problems for users of social media platforms, especially young people. To
address these issues, bullying text must be identified and users must be warned early about
traps set through phishing links. To achieve this, text and links are classified using
Support Vector Machine (SVM) and Convolutional Neural Network (CNN) algorithms,
respectively.
INPUT:
The input to the system is a set of posts containing emoji, text, and links, obtained from
various social media platforms.
OUTPUT:
The system labels each post as bullied or non-bullied; for links, it specifies whether each
link is malicious or not.
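The input and output described above might be represented as follows. This is only a sketch of the data shape: `PostResult`, `label_post`, and the two stand-in classifiers are hypothetical names, and the stand-ins are simple keyword checks used in place of the SVM and CNN models.

```python
from dataclasses import dataclass, field

@dataclass
class PostResult:
    text_label: str                                   # "bullied" or "non-bullied"
    link_labels: dict = field(default_factory=dict)   # link -> "malicious" / "non-malicious"

def label_post(text, links, classify_text, classify_link):
    """Combine the text and link classifier outputs for one post."""
    return PostResult(
        text_label="bullied" if classify_text(text) else "non-bullied",
        link_labels={
            link: "malicious" if classify_link(link) else "non-malicious"
            for link in links
        },
    )

# Stand-in classifiers for demonstration only; not real SVM or CNN models.
def toy_text_classifier(text):
    return "stupid" in text.lower()

def toy_link_classifier(link):
    return "iplogger" in link

result = label_post("you are stupid", ["https://iplogger.org/"],
                    toy_text_classifier, toy_link_classifier)
print(result.text_label, result.link_labels)
```

Passing the classifiers in as functions keeps the labelling logic independent of how the two models are trained.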
HARDWARE REQUIREMENTS:
RAM: Minimum 2 GB to 4 GB
ROM: 256 GB
SOFTWARE REQUIREMENTS:
Operating System: 64-bit (Windows 7/8/10)
Platform: ASP.NET
Front End: Visual Studio (Dot Net)
Back End: SQL Server
CONCLUSION:
This study classifies text as bullied or non-bullied and links as malicious or
non-malicious. Text classification is performed by a Support Vector Machine (SVM), and
link classification is performed by a Convolutional Neural Network (CNN).
PLAN OF ACTION: