A literature review of textual hate speech detection methods and datasets

F Alkomah, X Ma - Information, 2022 - mdpi.com
Online toxic discourses could result in conflicts between groups or harm to online
communities. Hate speech is complex and multifaceted harmful or offensive content …

Machine learning techniques for hate speech classification of Twitter data: State-of-the-art, future challenges and research directions

FE Ayo, O Folorunso, FT Ibharalu, IA Osinuga - Computer Science Review, 2020 - Elsevier
Twitter is a microblogging tool that allows the creation of big data through short digital
contents. This study provides a survey of machine learning techniques for hate speech …

HateBERT: Retraining BERT for abusive language detection in English

T Caselli, V Basile, J Mitrović, M Granitzer - arXiv preprint arXiv …, 2020 - arxiv.org
In this paper, we introduce HateBERT, a re-trained BERT model for abusive language
detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit …

SemEval-2019 Task 6: Identifying and categorizing offensive language in social media (OffensEval)

M Zampieri, S Malmasi, P Nakov, S Rosenthal… - arXiv preprint arXiv …, 2019 - arxiv.org
We present the results and the main findings of SemEval-2019 Task 6 on Identifying and
Categorizing Offensive Language in Social Media (OffensEval). The task was based on a …

BanglaHateBERT: BERT for abusive language detection in Bengali

MS Jahan, M Haque, N Arhab… - Proceedings of the …, 2022 - aclanthology.org
This paper introduces BanglaHateBERT, a retrained BERT model for abusive language
detection in Bengali. The model was trained with a large-scale Bengali offensive, abusive …

Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements

J Deng, J Cheng, H Sun, Z Zhang, M Huang - arXiv preprint arXiv …, 2023 - arxiv.org
As generative large model capabilities advance, safety concerns become more pronounced
in their outputs. To ensure the sustainable growth of the AI ecosystem, it's imperative to …

Detection of hate: speech tweets based convolutional neural network and machine learning algorithms

HA Sennary, G Abozaid, A Hemeida, A Mikhaylov… - Scientific Reports, 2024 - nature.com
There is no doubt that social media sites have provided many benefits to humanity, such as
sharing information continuously and communicating with others easily. It also seems that …

Automatic hate speech detection: A literature review

M Mohiyaddeen, S Siddiqi - Available at SSRN 3887383, 2021 - papers.ssrn.com
Hate speech has been an ongoing problem on the Internet for many years. Besides, social
media, especially Facebook, and Twitter have given it a global stage where those hate …

LOD-connected offensive language ontology and tagset enrichment

B Lewandowska-Tomaszczyk, S Žitnik… - … : Proceedings of the …, 2021 - cris.mruni.eu
The main focus of the paper is the definitional revision and enrichment of offensive language
typology, making reference to publicly available offensive language datasets and testing …

AStarTwice at SemEval-2021 Task 5: Toxic span detection using RoBERTa-CRF, domain specific pre-training and self-training

TA Suman, A Jain - … of the 15th International Workshop on …, 2021 - aclanthology.org
This paper describes our contribution to SemEval-2021 Task 5: Toxic Spans Detection. Our
solution is built upon RoBERTa language model and Conditional Random Fields (CRF). We …