Self-supervised euphemism detection and identification for content moderation

ChatGPT perpetuates gender bias in machine translation and ignores non-gendered pronouns: Findings across Bengali and five other low-resource languages

S Ghosh, A Caliskan - Proceedings of the 2023 AAAI/ACM Conference …, 2023 - dl.acm.org

In this multicultural age, language translation is one of the most performed tasks, and it is
becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims …

Cited by 81 · Related articles · All 4 versions

How Do Users Experience Moderation?: A Systematic Literature Review

R Ma, Y You, X Gui, Y Kou - Proceedings of the ACM on Human …, 2023 - dl.acm.org

Researchers across various fields have investigated how users experience moderation
through different perspectives and methodologies. At present, there is a pressing need of …

Cited by 6 · Related articles · All 4 versions

The inadequacy of reinforcement learning from human feedback-radicalizing large language models via semantic vulnerabilities

TR McIntosh, T Susnjak, T Liu, P Watters… - … on Cognitive and …, 2024 - ieeexplore.ieee.org

This study is an empirical investigation into the semantic vulnerabilities of four popular pre-
trained commercial Large Language Models (LLMs) to ideological manipulation. Using …

Cited by 36 · Related articles

"Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences

D Olszewski, A Lu, C Stillman, K Warren… - Proceedings of the …, 2023 - dl.acm.org

Reproducibility is crucial to the advancement of science; it strengthens confidence in
seemingly contradictory results and expands the boundaries of known discoveries …

Cited by 12 · Related articles

DarkBERT: A language model for the dark side of the Internet

Y Jin, E Jang, J Cui, JW Chung, Y Lee… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent research has suggested that there are clear differences in the language used in the
Dark Web compared to that of the Surface Web. As studies on the Dark Web commonly …

Cited by 24 · Related articles · All 6 versions

Generate, prune, select: A pipeline for counterspeech generation against online hate speech

W Zhu, S Bhat - arXiv preprint arXiv:2106.01625, 2021 - arxiv.org

Countermeasures to effectively fight the ever increasing hate speech online without blocking
freedom of speech is of great social interest. Natural Language Generation (NLG), is …

Cited by 48 · Related articles · All 6 versions

Cats are fuzzy pets: A corpus and analysis of potentially euphemistic terms

M Gavidia, P Lee, A Feldman, J Peng - arXiv preprint arXiv:2205.02728, 2022 - arxiv.org

Euphemisms have not received much attention in natural language processing, despite
being an important element of polite and figurative language. Euphemisms prove to be a …

Cited by 27 · Related articles · All 9 versions

Lambretta: learning to rank for Twitter soft moderation

P Paudel, J Blackburn, E De Cristofaro… - … IEEE Symposium on …, 2023 - ieeexplore.ieee.org

To curb the problem of false information, social media platforms like Twitter started adding
warning labels to content discussing debunked narratives, with the goal of providing more …

Cited by 7 · Related articles · All 13 versions

SoK: Content moderation for end-to-end encryption

S Scheffler, J Mayer - arXiv preprint arXiv:2303.03979, 2023 - arxiv.org

Popular messaging applications now enable end-to-end-encryption (E2EE) by default, and
E2EE data storage is becoming common. These important advances for security and privacy …

Cited by 31 · Related articles · All 4 versions

Euphemistic phrase detection by masked language model

W Zhu, S Bhat - arXiv preprint arXiv:2109.04666, 2021 - arxiv.org

It is a well-known approach for fringe groups and organizations to use euphemisms--
ordinary-sounding and innocent-looking words with a secret meaning--to conceal what they …

Cited by 28 · Related articles · All 6 versions
