Proceedings of the Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV 2021).

IEEE Xplore Part Number: CFP21ONG-ART; 978-0-7381-1183-4

Hate Speech Detection in Twitter using Natural Language Processing

Bhavesh Pariyani, MTech Data Science, Nirma University, Ahmedabad, Gujarat 380060, India, 19mced09@nirmauni.ac.in
Krish Shah, MTech Data Science, Nirma University, Ahmedabad, Gujarat 380060, India, 19mced13@nirmauni.ac.in
Meet Shah, MTech Data Science, Nirma University, Ahmedabad, Gujarat 380060, India, 19mced14@nirmauni.ac.in
Tarjni Vyas, Assistant Professor, Nirma University, Ahmedabad, Gujarat 380060, India, tarjni.vyas@nirmauni.ac.in
Sheshang Degadwala, Associate Professor & Head of Department, Computer Engineering, Sigma Institute of Engineering, sheshang13@gmail.com

Abstract—Twitter's central goal is to enable everyone to create and share ideas and information, and to express their opinions and beliefs without barriers. Twitter's job is to serve the public conversation, which requires representing a diverse range of perspectives. However, it does not permit promoting violence against, directly attacking, or threatening others on the basis of race, ethnicity, national origin, caste, sexual orientation, age, disability, or serious disease. Hate speech can hurt a person or a community, so its use is not acceptable. Now, due to the increase in social media usage, hate speech is very common on these platforms, and it is not possible to identify hate speech manually. It is therefore essential to develop an automated hate speech detection model, and this research work shows different approaches of Natural Language Processing for the classification of hate speech through machine learning algorithms.

Keywords—Logistic Regression; SVM; Tf-Idf; Random Forest; Hate Speech; Bag of Words

I. INTRODUCTION

Due to the increasing scale of social media, people are using social media platforms to post their views. Giving opinions that are harsh or rude to someone directly to their face is difficult, so people feel it is safe to abuse or post something offensive to others over the internet, and they feel secure posting such content there. Because of this, the use of hate speech on social media is increasing daily. To handle such a large volume of user data on social media, automatic hate speech detection methods are required. In this paper we use machine learning methods to classify whether a tweet is hate speech or not [13].

There are a number of machine learning applications; one of them is text-based classification. Each instance, here each tweet, is represented using the same set of features used by the machine learning algorithms. There are two types of problems solved by machine learning algorithms: supervised and unsupervised. Supervised learning is the task of training a model on a given dataset containing both a set of features and labels, whereas unsupervised learning is the training setting in which the dataset is neither categorized nor labeled [3].

Supervised learning is further divided into two types, regression and classification, based on the labels of the dataset. Here we are concerned only with classification. Classification algorithms use categorical datasets and are used to predict the class/category of an unknown instance. Many machine learning applications involve tasks that can be set up as supervised problems. We aim to do this task by applying supervised classification methods, namely Support Vector Machine, Logistic Regression, and Random Forest, on a labeled hate speech dataset. Each instance is represented as a vector whose length depends on the method used for representing tweets. In this paper we use two approaches for the vector representation of a tweet: term frequency-inverse document frequency (tf-idf) and bag of words.

[Fig. 1: Types of hate content removed from platforms that have signed the EU Code of Conduct (January 2018): anti-migrant hatred 17.80%, specific religion hatred 17.70%, ethnic origin 15.80%, sexual orientation 12.70%, national origin 9.10%, anti-semitism 8.70%, race 8.70%, religion 4.50%, gender 2.80%, other 2.20%]


A. Hate Speech

In earlier times, hate speech was limited to face-to-face conversations, but now, due to the growth of social media platforms, its usage is increasing, as people feel they are hidden on the internet. Because of this perceived anonymity, people feel safe using hate speech, and identifying it on social media manually is a labor-intensive task, so automated techniques to detect hate speech are needed.

On the other side, individuals are more likely to share their views online, thereby leading to the dissemination of hate speech. Given that this sort of prejudiced communication may be particularly detrimental to society, policymakers and social networking sites may profit from monitoring and prevention tools [6].

Hate speech is generally described as any communication that disparages an individual or a community on the basis of characteristics such as color, ethnicity, gender, sexual preference, nationality, or religion [8].

B. Definition of hate speech

According to Paula Fortuna and Sérgio Nunes, "Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity or other, and it can occur with different linguistic styles, even in subtle forms or when humor is used" [6].

II. RELATED WORK

There are many approaches for the detection of hate speech, but they differ from each other based on the output they produce. In Ref. [8] hate speech was classified into three classes: race, nationality, and religion. Ref. [8] uses a sentiment analysis technique that not only detects hate speech but also assigns it to one of the three classes and rates the polarity of the speech.

We found two survey papers on automatic hate speech detection [6],[14]. Ref. [6] presents the motivation for hate speech detection and explains why it became necessary to develop more robust and accurate models for automatic detection.

A problem in hate speech detection research is that researchers often keep the data they collect private, and little open-source code is available, which makes comparative study difficult [6]. This slows progress in the field. Different features related to hate speech are described in Ref. [14], such as simple surface features, which include bag of words, unigrams, or n-grams. Both the training set and the testing set need to contain the same predictive words, which is problematic because hate speech detection is applied to very small pieces of text; to overcome this issue, word generalization is applied [14].

The knowledge of annotators of hate speech was examined in Ref. [15]. The authors obtained very good results from amateur annotation in comparison to expert annotation. Waseem also provides his own dataset and its evaluation. To penalize misclassification of the minority classes, the weighted F1-score is suggested as an evaluation measure.

Nowadays, with developments in deep learning, CNNs can be used for hate speech detection [1],[2]. Word vectors, also known as word embeddings, can be trained on a corpus relevant to the domain, and these pretrained word vectors are then used in a CNN [2]. Most machine learning models use bag-of-words features, which fail to capture patterns and sequences. This can be understood from the example in Ref. [2]: if a tweet ends with "if you know what I mean", no individual word is hateful, yet the sentence as a whole is most likely hate speech. Such patterns cannot be handled by bag of words, which degrades the performance of traditional machine learning algorithms.

III. DATASET DESCRIPTION

The dataset was obtained from the online social media platform Twitter. It can easily be found on GitHub, where previous researchers have uploaded different datasets for hate speech detection.

Table 1. Classwise distribution of dataset
Class                        No. of Instances
Class 0 (non hate speech)    29720
Class 1 (hate speech)        2242

20% of the data from each class is used for testing and 80% for training; a sketch of this split is given after Table 4.

Table 2. Classwise distribution of training dataset
Class      No. of Instances
Class 0    23775
Class 1    1794

Table 3. Classwise distribution of testing dataset
Class      No. of Instances
Class 0    5945
Class 1    448

Table 4. Top 11 words with highest frequency in hate speech
amp, hate, trump, women, white, might, libtard, allahsoil, black, sjw, racist
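As an illustration, the split can be reproduced with scikit-learn. A minimal sketch, assuming the dataset is loaded as a pandas DataFrame; the file path and the column names "tweet" and "label" are hypothetical, not taken from the paper:

import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and column names; only the 80/20 stratified split
# itself is taken from the paper.
df = pd.read_csv("hate_speech_tweets.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["tweet"], df["label"],
    test_size=0.20,          # 20% of each class held out (Tables 2 and 3)
    stratify=df["label"],    # preserve the class-wise ratio in both splits
    random_state=42)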


[Table 5: Term frequency]
[Table 6: Inverse document frequency]
[Table 7: Product of tf and idf]
[Table 8: Normalized tf-idf]

IV. FEATURE EXTRACTION

Feature extraction is a method to convert each tweet into a fixed set of attributes so that it can easily be interpreted by machine learning models. In feature extraction, a vocabulary of words is generated; the vocabulary depends upon the method we use for feature extraction. This vocabulary is used to convert each tweet into a vector [9].

It is an important stage in text-based classification, providing significant details about the text, such as the term frequencies for each tweet. Feature extraction is one of the key preprocessing techniques used in data mining and text classification that measures the context of documents [16].

A. Tf-idf

TF-IDF is widely recognized and is often used as a weighting strategy, and its effectiveness is comparable to modern techniques. Documents are treated as variables in word weighting. Selecting features via a feature selection procedure is considered the key preprocessing step necessary for indexing documents [11].

TF: Term frequency measures how often a word appears in a text. Since documents differ in length, a word is likely to occur far more often in longer documents than in shorter ones:

tf(t) = (count of term t in a document) / (total number of terms in the document)    (1)

IDF: Inverse document frequency measures the importance of a term. When calculating TF alone, all terms are considered equally important; however, certain terms such as "is", "of", and "the" may occur many times yet carry little importance. Thus it is necessary to weight each term:

idf(t) = log_e((total number of documents) / (number of documents containing term t))    (2)

If we break tf-idf into steps, we get the following three steps:
Step 1: Derive the term frequency.
Step 2: Derive the inverse document frequency.
Step 3: Aggregate the two values using multiplication and normalization.

The following corpus is used as an example for tf-idf:


corpus = ["I am the trees the forest my home", We need a method to interpret textual data for the machine
"I am the pollen and the broken leaves", learning algorithm, and the bag-of-word model allows us to
"I am canopy","I am the breeze", accomplish that target.
"I am here to please" The bag of words model is easy to understand and apply. In
] this method we create tokens of each sentence and the calculate
frequency of each token [12].
Step 1: Calculate the term frequency For example.
Calculating term frequency is a pretty straight forward it is “i am thankful for having a paneer today”
calculated as the number of times the word or a term appear in “i get to see my daddy today!”
a document.
First step:

Each sentence is assumed to be a separate document and by


making a vocabulary from this two documents excluding
punctuation we get,

‘I’, ‘am’, ‘thankful, ‘for’, ‘having’, ‘a’, ‘paneer’, ‘today’,


‘get’, ‘to’, ‘see’, ‘my’, ‘daddy’
Table 5. term frequency matrix was generated using
CountVectorizer in python.
Second step:

In this step we create vectors, obtained from text which are


Step 2:Inverse Document Frequency
used by the machine learning algorithms. for example — “i am
If we only use the term frequency as measure than there is no
thankful for having a paneer today” and we check the
difference between the important word like greatness and the
frequency of words from the already generated vocabulary.
common word like you. If a word is in all document or in most
“I” = 1
of documents then it play very less role in differentiating
“am” = 1
between the documents.So we need a mechanism to tone
“thankful” = 1
down the importance of the word that appear most frequently.
“for” =1
Similar to term frequency,
“having” = 1
Inverse Document frequency = total number of documents /
“a” = 1
number of documents with the term t
“paneer” = 1
“today” = 1
Table 6.shows the idf values for all the words. Note that there
“get” = 0
is only one IDF value for a word in the corpus.
“to” = 0
“see” = 0
Step 3:Multiply and normalize
“my” = 0
In Tf-IDF as the name implies it’s a combination of tf and idf
“daddy” = 0
i.e multiplying the two values. The sklearn implementation
then normalize the product of tf and idf.
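These three steps can be reproduced with the scikit-learn implementation mentioned above. A minimal sketch on the example corpus (note that TfidfVectorizer uses a smoothed idf by default, so its values differ slightly from Eq. (2)):

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["I am the trees the forest my home",
          "I am the pollen and the broken leaves",
          "I am canopy",
          "I am the breeze",
          "I am here to please"]

vectorizer = TfidfVectorizer()             # multiplies tf by idf, then L2-normalizes each row
X = vectorizer.fit_transform(corpus)       # sparse matrix, one row per document
print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray().round(3))                # normalized tf-idf vectors, cf. Table 8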
B. Bag of Words

We need a method to interpret textual data for the machine learning algorithm, and the bag-of-words model allows us to accomplish that target. The bag-of-words model is easy to understand and apply: we create tokens from each sentence and then calculate the frequency of each token [12]. For example:

"i am thankful for having a paneer today"
"i get to see my daddy today!"

First step: Each sentence is assumed to be a separate document, and by building a vocabulary from these two documents, excluding punctuation, we get:

'i', 'am', 'thankful', 'for', 'having', 'a', 'paneer', 'today', 'get', 'to', 'see', 'my', 'daddy'

Second step: In this step we create the vectors, obtained from the text, that are used by the machine learning algorithms. For example, for "i am thankful for having a paneer today" we check the frequency of each word from the already generated vocabulary:

"i" = 1, "am" = 1, "thankful" = 1, "for" = 1, "having" = 1, "a" = 1, "paneer" = 1, "today" = 1, "get" = 0, "to" = 0, "see" = 0, "my" = 0, "daddy" = 0

The two documents as vectors are then:

"i am thankful for having a paneer today" = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
"i get to see my daddy today!" = [1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
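A minimal sketch of the same two steps using scikit-learn's CountVectorizer; the token pattern is overridden here (an implementation detail, not from the paper) so that one-letter words such as "i" and "a" are kept, matching the vocabulary built by hand above:

from sklearn.feature_extraction.text import CountVectorizer

docs = ["i am thankful for having a paneer today",
        "i get to see my daddy today!"]

# The default token pattern drops one-letter tokens, so it is overridden
# to keep "i" and "a".
vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
bow = vectorizer.fit_transform(docs)
print(vectorizer.get_feature_names_out())
print(bow.toarray())  # one count vector per sentence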
V. MACHINE LEARNING MODEL

A. Support Vector Machine (SVM)

Using either a nonlinear or a linear mapping, the SVM converts the original lower-dimensional data into a higher dimension. Within this new dimension, it looks for the linear optimal separating hyperplane to distinguish the tuples of the two classes [5]. With a sufficiently nonlinear mapping to an appropriately high dimension, data from two classes can always be separated by a hyperplane.


The SVM finds this hyperplane with the help of support vectors, which are the instances that lie nearest to the margins. An unlimited number of separating lines could be drawn here; the target is to choose the one with the largest margin, i.e., the least classification error on previously unseen tuples [4].

[Fig. 2: SVM]
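A minimal sketch of training this classifier on the tf-idf vectors; the variable names X_train, y_train, and X_test follow the split sketch in Section III and are assumptions, as is the choice of a linear kernel (which happens to match the parameter later selected by grid search in Table 17):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# X_train, X_test, y_train come from the split sketched in Section III.
tfidf = TfidfVectorizer()
X_tr = tfidf.fit_transform(X_train)  # learn the vocabulary on training tweets only
X_te = tfidf.transform(X_test)       # reuse the same vocabulary for the test tweets

svm = SVC(kernel="linear", C=1.0)    # illustrative parameters; cf. Table 17
svm.fit(X_tr, y_train)
predictions = svm.predict(X_te)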
B. Logistic Regression (LR)

Logistic regression is a supervised machine learning method which is similar to linear regression, but instead of a linear equation it uses the sigmoid function (Eq. (3)), which maps the output into the range [0, 1]; it is also used for text classification [7].

f(n) = 1 / (1 + e^(-n))    (3)

[Fig. 3: Sigmoid function]

For a feature vector

X = (X_1, X_2, ..., X_n)    (4)

the logistic regression equation is

y = β_0 + β_1·X_1 + β_2·X_2 + ... + β_n·X_n    (5)

If we substitute Eq. (5) into Eq. (3) we get

f(n) = 1 / (1 + e^(-(β_0 + β_1·X_1 + β_2·X_2 + ... + β_n·X_n)))    (6)

where Eq. (6) gives the probability of the data point belonging to one of the classes. A cutoff point (usually 0.5) is then used to divide between the two classes. The cutoff point is not fixed; it can be changed according to the dataset.
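A minimal sketch, reusing the tf-idf features (X_tr, X_te) from the SVM sketch above; predict_proba exposes the probability of Eq. (6), and the 0.5 cutoff is applied explicitly so it can be changed:

from sklearn.linear_model import LogisticRegression

# X_tr, X_te, y_train: the tf-idf features from the SVM sketch above.
lr = LogisticRegression(max_iter=100)
lr.fit(X_tr, y_train)
probs = lr.predict_proba(X_te)[:, 1]  # Eq. (6): P(hate speech) for each tweet
labels = (probs >= 0.5).astype(int)   # changing the cutoff only changes this line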
C. Random Forest (RF)

Classification is a major part of machine learning. There are various techniques for classification, such as logistic regression, decision trees, artificial neural networks, support vector machines, etc. A collection of trees is known as a forest; similarly, a random forest is a collection of decision trees, and it is called random because it is a collection of relatively uncorrelated trees operating as a single model. The basic principle of a random forest is that each tree speaks out its prediction, and the forest makes its decision based on the majority vote. So, in our application of detecting hate speech, the trees classify a statement as hate speech or not, and based on the majority decision the random forest predicts whether the statement is hate speech.
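A minimal sketch of the majority-voting forest on the same features; the number of trees here is an assumption (Table 17 later reports the values found by grid search):

from sklearn.ensemble import RandomForestClassifier

# X_tr, X_te, y_train: the same features as in the sketches above;
# n_estimators is illustrative, not the paper's tuned value.
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_tr, y_train)
rf_pred = rf.predict(X_te)  # majority vote of the individual trees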

VI. PREPROCESSING

Various preprocessing methods are applied so that the machine learning models achieve higher evaluation scores. The methods used are explained below.

A. Tokenizing

Tokenizing is the process in which each sentence is divided into words. It is used to create the vocabulary for our dataset; this vocabulary is then used to represent each tweet, where the representation depends on the chosen method, tf-idf or bag of words.

B. Stop Words

Stop words are words that carry little meaning, e.g., "a", "the", "of", and occur in most sentences. These stop words should be removed, otherwise they can cause misclassification.

C. Stemming

Stemming is the process in which the prefix or suffix of a word is removed to reduce similar words to a common form.

For example, "processing", "process", and "processed" have basically the same meaning if we ignore the tense, so it is necessary to convert all such words into a common form. For this, the well-known English stemmer, the Porter stemmer, is used here.

D. Case Folding

In case folding, all words are changed to lowercase. This is used to keep the vocabulary as small as possible.
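A minimal sketch of this preprocessing chain using NLTK (assuming the "punkt" and "stopwords" resources are available; the example sentence is illustrative):

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

# One-time setup: nltk.download("punkt"); nltk.download("stopwords")
stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def preprocess(tweet):
    tokens = word_tokenize(tweet.lower())                # case folding + tokenizing
    tokens = [t for t in tokens if t not in stop_words]  # stop-word removal
    return " ".join(stemmer.stem(t) for t in tokens)     # Porter stemming

print(preprocess("Processing and processed have the same meaning"))
# -> "process process mean"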
VII. RESULTS

First, SVM, logistic regression, and random forest are used with default parameters, with the bag-of-words and tf-idf representations, without any preprocessing. The size of the feature vector in each representation, with and without preprocessing, is shown in Table 9. We can see that preprocessing decreases the size of the vectors, which decreases the computation time.

Table 9. Length of vector
Representation   Without preprocessing   With preprocessing
Tf-idf           (1, 588381)             (1, 402289)
Bag of Words     (1, 12514)              (1, 10013)

A. Data Without Preprocessing

1) Confusion Matrix

In the confusion matrices below, rows give the actual class and columns the predicted class.

Table 10. SVM with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5941              4
Hate Speech       300               148

Table 11. SVM with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5939              6
Hate Speech       275               173

Table 12. LR with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5939              6
Hate Speech       377               71

Table 13. LR with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5903              42
Hate Speech       210               238

Table 14. RF with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5942              3
Hate Speech       278               170

Table 15. RF with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5928              17
Hate Speech       220               228

2) Accuracy Score & F1-Score

Table 16. Accuracy score and F1 score
Model             F1-Score   Accuracy-Score
SVM with tf-idf   0.4933     0.9524
SVM with BOW      0.5518     0.9560
LR with tf-idf    0.2704     0.9401
LR with BOW       0.6538     0.9605
RF with tf-idf    0.5475     0.9560
RF with BOW       0.6580     0.9629
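A minimal sketch of how such a confusion matrix and the two scores can be computed with scikit-learn, assuming y_test and a model's predictions from the earlier sketches:

from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# "predictions" comes from one of the model sketches above.
print(confusion_matrix(y_test, predictions))  # rows = actual, columns = predicted
print(f1_score(y_test, predictions))          # F1 on the hate-speech class (label 1)
print(accuracy_score(y_test, predictions))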
B. Data With Preprocessing and Using GridSearchCV

1) GridSearchCV results

Table 17. Best parameters from GridSearchCV
Model             Best Parameters
SVM with tf-idf   'C': 2, 'kernel': 'linear'
SVM with BOW      'C': 1, 'kernel': 'linear'
LR with tf-idf    'C': 400, 'max_iter': 100
LR with BOW       'C': 10, 'max_iter': 50
RF with tf-idf    'max_depth': None, 'n_estimators': 5
RF with BOW       'max_depth': None, 'n_estimators': 1000
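A minimal sketch of such a search for the SVM; the candidate grids are assumptions, since the paper does not list the values searched, and scoring by F1 is likewise assumed to match the emphasis on the minority class:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# X_tr, y_train: preprocessed tf-idf training data from the earlier sketches.
param_grid = {"C": [1, 2, 10, 100], "kernel": ["linear", "rbf"]}  # assumed grid
search = GridSearchCV(SVC(), param_grid, scoring="f1", cv=5)
search.fit(X_tr, y_train)
print(search.best_params_)  # Table 17 reports {'C': 2, 'kernel': 'linear'} for tf-idf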
The confusion matrices below use the preprocessed data and the above parameters for each machine learning model.

2) Confusion Matrix

Table 18. SVM with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5865              80
Hate Speech       132               316

Table 19. SVM with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5868              77
Hate Speech       159               289

Table 20. LR with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5869              76
Hate Speech       145               303

Table 21. LR with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5872              73
Hate Speech       164               248

Table 22. RF with Tf-idf
                  Not Hate Speech   Hate Speech
Not Hate Speech   5931              32
Hate Speech       269               179


Table 23. RF with BOW
                  Not Hate Speech   Hate Speech
Not Hate Speech   5914              31
Hate Speech       199               249

3) Accuracy Score & F1-Score

Table 24. Accuracy score and F1 score
Model             F1-Score   Accuracy-Score
SVM with tf-idf   0.7488     0.9668
SVM with BOW      0.7101     0.9630
LR with tf-idf    0.7327     0.9654
LR with BOW       0.7055     0.9629
RF with tf-idf    0.5432     0.9529
RF with BOW       0.6840     0.9640

CONCLUSION

In routine life, as the usage of social media has increased, everyone seems to feel they can speak or write anything they want. Owing to this, hate speech has increased, so it becomes necessary to automate the process of classifying hate speech data. To simplify this process, we have used a machine learning approach to detect hate speech in Twitter data. For this we have used the tf-idf and bag-of-words methods to extract features from the tweets, and to classify hate speech we have implemented machine learning algorithms, namely SVM, Logistic Regression, and Random Forest. From the results obtained we can conclude that, using data without preprocessing and machine learning models with default parameters, Random Forest with bag of words gives the best performance, with an F1 score of 0.6580 and an accuracy score of 0.9629. But, as explained earlier, obtaining the highest accuracy alone is not enough when dealing with an imbalanced dataset; for that reason we have used the F1 score, which is quite low for the data without preprocessing. To improve it, we applied preprocessing steps and grid search to obtain the best parameters for the machine learning models. After preprocessing and grid search, SVM with tf-idf gives the best performance, with an F1 score of 0.7488 and an accuracy score of 0.9668. The tf-idf feature extraction model achieves better accuracy than the bag-of-words model because bag of words just counts word frequencies and uses them as a vector, while the tf-idf model uses the ratio of term frequency to document frequency. A limitation of this approach is that it has only been applied to the Twitter dataset, so detecting hate speech in big data can be a challenge.

In future work, the F1 score and accuracy can be improved, and more machine learning techniques need to be explored. Different methods also need to be applied to handle the imbalanced class distribution of the dataset.

REFERENCES

[1] Pinkesh Badjatiya, Shashank Gupta, Manish Gupta, and Vasudeva Varma. Deep learning for hate speech detection in tweets. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 759–760, 2017.
[2] Md Abul Bashar and Richi Nayak. QutNocturnal@HASOC'19: CNN for hate speech and offensive content identification in Hindi language. In Proceedings of the 11th Annual Meeting of the Forum for Information Retrieval Evaluation (December 2019), 2019.
[3] Pete Burnap and Matthew L. Williams. Cyber hate speech on Twitter: An application of machine classification and statistical modeling for policy and decision making. Policy & Internet, 7(2):223–242, 2015.
[4] Fabio Del Vigna, Andrea Cimino, Felice Dell'Orletta, Marinella Petrocchi, and Maurizio Tesconi. Hate me, hate me not: Hate speech detection on Facebook. In Proceedings of the First Italian Conference on Cybersecurity (ITASEC17), pages 86–95, 2017.
[5] Shimaa M. Abd El-Salam, Mohamed M. Ezz, Somaya Hashem, Wafaa Elakel, Rabab Salama, Hesham ElMakhzangy, and Mahmoud ElHefnawi. Performance of machine learning approaches on prediction of esophageal varices for Egyptian chronic hepatitis C patients. Informatics in Medicine Unlocked, 17:100267, 2019.
[6] Paula Fortuna and Sérgio Nunes. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR), 51(4):1–30, 2018.
[7] Purnama Sari Br Ginting, Budhi Irawan, and Casi Setianingsih. Hate speech detection on Twitter using multinomial logistic regression classification method. In 2019 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS), pages 105–111. IEEE, 2019.
[8] Njagi Dennis Gitari, Zhang Zuping, Hanyurwimfura Damien, and Jun Long. A lexicon-based approach for hate speech detection. International Journal of Multimedia and Ubiquitous Engineering, 10(4):215–230, 2015.
[9] Ammar Ismael Kadhim. Term weighting for feature extraction on Twitter: A comparison between BM25 and TF-IDF. In 2019 International Conference on Advanced Science and Engineering (ICOASE), pages 124–128. IEEE, 2019.
[10] Harpreet Kaur, Veenu Mangat, and Nidhi Krail. Dictionary-based sentiment analysis of Hinglish text and comparison with machine learning algorithms. International Journal of Metadata, Semantics and Ontologies, 12(2-3):90–102, 2017.
[11] M. Ramya and J. Alwin Pinakas. Different type of feature selection for text classification. International Journal of Computer Trends and Technology, 10(2):102–107, 2014.
[12] Joni Salminen, Maximilian Hopf, Shammur A. Chowdhury, Soon-gyo Jung, Hind Almerekhi, and Bernard J. Jansen. Developing an online hate classifier for multiple social media platforms. Human-centric Computing and Information Sciences, 10(1):1, 2020.
[13] TYSS Santosh and KVS Aravind. Hate speech detection in Hindi-English code-mixed social media text. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pages 310–313, 2019.
[14] Anna Schmidt and Michael Wiegand. A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, pages 1–10, 2017.
[15] Zeerak Waseem. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the First Workshop on NLP and Computational Social Science, pages 138–142, 2016.
[16] Tingxi Wen and Zhongnan Zhang. Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification. Medicine, 96(19), 2017.
[17] Abro, S., Sarang Shaikh, Z. A., Khan, S., Mujtaba, G., & Khand, Z. H. Automatic hate speech detection using machine learning: A comparative study. Machine Learning, 10, 6, 2020.
