Towards multidomain and multilingual abusive language detection: a survey

EW Pamungkas, V Basile, V Patti - Personal and Ubiquitous Computing, 2023 - Springer
Abusive language is an important issue in online communication across different platforms
and languages. Having a robust model to detect abusive instances automatically is a …

SafetyKit: First aid for measuring safety in open-domain conversational systems

E Dinan, G Abercrombie, SA Bergman… - Proceedings of the …, 2022 - iris.unibocconi.it
The social impact of natural language processing and its applications has received
increasing attention. In this position paper, we focus on the problem of safety for end-to-end …

On the rise of fear speech in online social media

P Saha, K Garimella, NK Kalyan… - Proceedings of the …, 2023 - National Acad Sciences
Recently, social media platforms are heavily moderated to prevent the spread of online hate
speech, which is usually fertile in toxic words and is directed toward an individual or a …

Anticipating safety issues in E2E conversational AI: Framework and tooling

E Dinan, G Abercrombie, AS Bergman, S Spruit… - arXiv preprint arXiv …, 2021 - arxiv.org
Over the last several years, end-to-end neural conversational agents have vastly improved
in their ability to carry a chit-chat conversation with humans. However, these models are …

ConvAbuse: Data, analysis, and benchmarks for nuanced abuse detection in conversational AI

AC Curry, G Abercrombie, V Rieser - arXiv preprint arXiv:2109.09483, 2021 - arxiv.org
We present the first English corpus study on abusive language towards three conversational
AI systems gathered "in the wild": an open-domain social bot, a rule-based chatbot, and a …

Generalizable implicit hate speech detection using contrastive learning

Y Kim, S Park, YS Han - … of the 29th International Conference on …, 2022 - aclanthology.org
Hate speech detection has gained increasing attention with the growing prevalence of
hateful contents. When a text contains an obvious hate word or expression, it is fairly easy to …

ToxicChat: Unveiling hidden challenges of toxicity detection in real-world user-AI conversation

Z Lin, Z Wang, Y Tong, Y Wang, Y Guo, Y Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite remarkable advances that large language models have achieved in chatbots,
maintaining a non-toxic user-AI interactive environment has become increasingly critical …

Playing the part of the sharp bully: Generating adversarial examples for implicit hate speech detection

NB Ocampo, E Cabrio, S Villata - Findings of the Association for …, 2023 - aclanthology.org
Research on abusive content detection on social media has primarily focused on explicit
forms of hate speech (HS), that are often identifiable by recognizing hateful words and …

Measuring and mitigating language model biases in abusive language detection

R Song, F Giunchiglia, Y Li, L Shi, H Xu - Information Processing & …, 2023 - Elsevier
Warning: This paper contains abusive samples that may cause discomfort to readers.
Abusive language on social media reinforces prejudice against an individual or a specific …

Guiding the release of safer E2E conversational AI through value sensitive design

AS Bergman, G Abercrombie, S Spruit… - Proceedings of the …, 2022 - iris.unibocconi.it
Over the last several years, end-to-end neural conversational agents have vastly improved
their ability to carry unrestricted, open-domain conversations with humans. However, these …