Towards robust toxic content classification

A Lees, VQ Tran, Y Tay, J Sorensen, J Gupta… - Proceedings of the 28th …, 2022 - dl.acm.org

On the world wide web, toxic content detectors are a crucial line of defense against
potentially hateful and offensive messages. As such, building highly effective classifiers that …

Cited by 184 Related articles All 5 versions

[PDF] arxiv.org

Language generation models can cause harm: So what can we do about it? an actionable survey

S Kumar, V Balachandran, L Njoo… - arXiv preprint arXiv …, 2022 - arxiv.org

Recent advances in the capacity of large language models to generate human-like text have
resulted in their increased adoption in user-facing settings. In parallel, these improvements …

Cited by 80 Related articles All 5 versions

[PDF] arxiv.org

ToxiSpanSE: An explainable toxicity detection in code review comments

J Sarker, S Sultana, SR Wilson… - 2023 ACM/IEEE …, 2023 - ieeexplore.ieee.org

Background: The existence of toxic conversations in open-source platforms can degrade
relationships among software developers and may negatively impact software product …

Cited by 16 Related articles All 6 versions

[PDF] arxiv.org

AI safety in generative AI large language models: A survey

J Chua, Y Li, S Yang, C Wang, L Yao - arXiv preprint arXiv:2407.18369, 2024 - arxiv.org

Large Language Models (LLMs) such as ChatGPT that exhibit generative AI capabilities are
facing accelerated adoption and innovation. The increased presence of Generative AI (GAI) …

Cited by 7 Related articles All 2 versions

[PDF] arxiv.org

Automated identification of toxic code reviews using ToxiCR

J Sarker, AK Turzo, M Dong, A Bosu - ACM Transactions on Software …, 2023 - dl.acm.org

Toxic conversations during software development interactions may have serious
repercussions on a Free and Open Source Software (FOSS) development project. For …

Cited by 34 Related articles All 5 versions

[PDF] arxiv.org

A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity

W Zhang, H Guo, ID Kivlichan, V Prabhakaran… - arXiv preprint arXiv …, 2023 - arxiv.org

Toxicity is an increasingly common and severe issue in online spaces. Consequently, a rich
line of machine learning research over the past decade has focused on computationally …

Cited by 3 Related articles All 2 versions

[PDF] arxiv.org

A benchmark study of the contemporary toxicity detectors on software engineering interactions

J Sarker, AK Turzo, A Bosu - 2020 27th Asia-Pacific Software …, 2020 - ieeexplore.ieee.org

Automated filtering of toxic conversations may help an Open-source software (OSS)
community to maintain healthy interactions among the project participants. Although, several …

Cited by 42 Related articles All 8 versions

Robustness of models addressing Information Disorder: A comprehensive review and benchmarking study

G Fenza, V Loia, C Stanzione, M Di Gisi - Neurocomputing, 2024 - Elsevier

Abstract Machine learning and deep learning models are increasingly susceptible to
adversarial attacks, particularly in critical areas like cybersecurity and Information Disorder …

Cited by 1 Related articles

[PDF] york.ac.uk

OCR post-correction for detecting adversarial text images

NH Imam, VG Vassilakis, D Kolovos - Journal of Information Security and …, 2022 - Elsevier

The amount of images with embedded text shared on Online Social Networks (OSNs), such
as Twitter or Facebook has been growing in recent years. It is becoming important to …

Cited by 25 Related articles All 6 versions

[PDF] smu.edu

Toxic comment classification

S Zaheri, J Leath, D Stroud - SMU Data Science Review, 2020 - scholar.smu.edu

This paper presents a novel application of Natural Language Processing techniques to
classify unstructured text into toxic and non-toxic categories. In the current century, social …

Cited by 48 Related articles All 2 versions
