
Identifying and Categorizing Offensive Language in Social Media (OffensEval)

Fine-tuning BERT has achieved state-of-the-art performance on a number of natural language understanding tasks. In this research, we improve BERT performance using cost-sensitive learning and an ensemble technique. The results beat the state of the art on Automatic Categorization of Offense Types (subtask B), rank 2nd on Offensive Language Identification (subtask A), and rank 3rd on Offense Target Identification (subtask C).

Authors and Affiliations
Fajar Muslim / 13517149@std.stei.itb.ac.id
Dr. Eng. Ayu Purwarianti, ST., MT. / ayu@informatika.org
Fariska Zakhralativa R, S.T., M.T. / fariska@informatika.org
School of Electrical Engineering and Informatics, Institut Teknologi Bandung

Introduction

Social media has become a necessity for some people. Twitter is one of them, with 353 million monthly active users (Wearesocial, Digital 2020 October Statshot Report). Freedom of expression on Twitter is misused by some people to carry out offensive actions, such as cursing, hate speech, and swearing against ethnicity, race, and religion, using offensive language.

Identifying and Categorizing Offensive Language in Social Media (OffensEval) is one of the tasks at SemEval-2019 (Task 6) that aims to solve these problems (Zampieri et al., 2019). The task uses the OLID dataset and consists of three subtasks: offensive language identification (subtask A), automatic categorization of offense types (subtask B), and offense target identification (subtask C).

Based on previous research on the OLID dataset, BERT achieved the best results for all three subtasks. In this research, we try to improve BERT performance on the OLID dataset by adapting cost-sensitive learning (Madabushi et al., 2020) and ensembles of BERT models (Risch & Krestel, 2020).

Objective

To build a system, find the best configuration, and compare system performance with previous research on identifying and categorizing offensive language in social media by adapting cost-sensitive learning techniques and ensembles of BERT models.

Methodology

- Cost-Sensitive Learning
Cost-sensitive learning is a type of learning in data mining that takes into account the costs incurred when the model misclassifies data, or other types of costs. This technique aims to tackle the imbalanced dataset problem in this research (the OLID dataset); a loss sketch is shown after this list.

- Ensemble
The ensemble in this research uses multiple BERT models and obtains the prediction result by combining the outputs of the individual models using majority voting.
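As a concrete illustration of the cost-sensitive loss, here is a minimal sketch using class-weighted cross-entropy on top of BERT. It assumes PyTorch and Hugging Face Transformers; the checkpoint name, label count, and weighting scheme are illustrative assumptions, not the exact configuration used in this research.

import torch
from torch import nn
from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Weight each class inversely to its frequency, so that errors on the
# rare (offensive) class cost more than errors on the majority class.
class_counts = torch.tensor([8.0, 1.0])            # e.g. the 8:1 skew of subtask B
class_weights = class_counts.sum() / class_counts  # -> [1.125, 9.0]
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

def training_step(batch):
    # Forward pass; outputs.logits has shape (batch_size, num_labels).
    outputs = model(input_ids=batch["input_ids"],
                    attention_mask=batch["attention_mask"])
    # Cost-sensitive loss: misclassifying the rare class is penalized more.
    return loss_fn(outputs.logits, batch["labels"])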

Architecture

Module Preprocessing
Converts emojis to English phrases, performs hashtag segmentation, changes 'url' to 'http', removes punctuation except '?' and '!', lowercases the text, and limits '@USER' to at most three consecutive occurrences. A sketch of these steps is shown below.
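Below is a minimal sketch of this preprocessing, assuming the emoji and wordsegment packages; the poster does not name the libraries used, so these choices are illustrative.

import re
import emoji
from wordsegment import load, segment

load()  # load wordsegment's corpus statistics once

def preprocess(tweet: str) -> str:
    # Convert emojis to English phrases, e.g. a smiley -> "smiling face".
    text = emoji.demojize(tweet, delimiters=(" ", " ")).replace("_", " ")
    # Segment hashtags, e.g. "#OffensiveLanguage" -> "offensive language".
    text = re.sub(r"#(\w+)", lambda m: " ".join(segment(m.group(1))), text)
    # Normalize the dataset's 'url' placeholder to 'http'.
    text = re.sub(r"\burl\b", "http", text, flags=re.IGNORECASE)
    # Keep at most three consecutive '@USER' mentions.
    text = re.sub(r"(@USER\s*){4,}", "@USER @USER @USER ", text)
    # Remove punctuation except '?' and '!' (keep '@' for '@USER'), lowercase.
    text = re.sub(r"[^\w\s?!@]", "", text)
    return text.lower().strip()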
Module BERT with Cost-Sensitive Learning
Changes the loss function of BERT to take cost-sensitive learning into account, with the following formula.
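The formula itself did not survive the poster layout. A standard form of this loss, in the spirit of Madabushi et al. (2020), is a class-weighted cross-entropy; this reconstruction is an assumption, not the poster's original equation:

\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} w_{y_i} \log p_\theta(y_i \mid x_i)

where w_{y_i} is the cost assigned to gold label y_i, typically set inversely proportional to that label's frequency in the training data.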
Module Ensemble
Aggregates the prediction results of the BERT models using majority voting, as in the sketch below.
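A minimal sketch of hard majority voting over the per-tweet predictions of several independently trained BERT models; the number of models shown is illustrative.

from collections import Counter

def hard_majority_vote(model_predictions: list[list[int]]) -> list[int]:
    # model_predictions[m][i] is model m's predicted label for tweet i.
    # For each tweet, return the label predicted by the most models.
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]

# Example: three models voting on four tweets.
preds = [[1, 0, 1, 0],
         [1, 1, 1, 0],
         [0, 0, 1, 0]]
print(hard_majority_vote(preds))  # -> [1, 0, 1, 0]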

Result

Subtask A
The best evaluation result on the test data of subtask A is the baseline model, with an F1-score of 0.823. The use of cost-sensitive learning did not improve the performance of the model on subtask A, while the ensemble technique with the hard majority voting approach increased the F1-score by 0.78% compared to the cost-sensitive learning technique.

Subtask B
The best evaluation result on the test data of subtask B is the ensemble model with the hard majority voting approach, with an F1-score of 0.777. The cost-sensitive learning technique improves the F1-score by 3.16% compared to the baseline model, while the ensemble technique with the hard majority voting approach increases the F1-score by a further 1.72%.

Subtask C
The best evaluation result on the test data of subtask C is the cost-sensitive learning model, with an F1-score of 0.657. The cost-sensitive learning technique improves the performance of the baseline model by 6.9%, while the ensemble technique with the hard majority voting approach does not improve model performance on subtask C.

Conclusion

Cost-sensitive learning improves performance on subtask B (3.16%) and subtask C (6.86%). This technique can overcome the imbalanced label distribution of subtask B (8:1) and subtask C (18:8:3).

The ensemble technique with the hard majority voting approach improves performance on subtask A (0.79%) and subtask B (1.72%).

Evaluation results on the OLID test data achieved 2nd position on subtask A (F1-score 0.8215), 1st position on subtask B (F1-score 0.7776), and 3rd position on subtask C (F1-score 0.6574).

Related literature

Wearesocial. Digital 2020 October Statshot Report.
Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., & Kumar, R. (2019). SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval). ArXiv.
Madabushi, H. T., Kochkina, E., & Castelle, M. (2020). Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data. ArXiv.
Risch, J., & Krestel, R. (2020). Bagging BERT Models for Robust Aggression Identification. (May), 55–61.
