ChatGPT perpetuates gender bias in machine translation and ignores non-gendered pronouns: Findings across Bengali and five other low-resource languages

S Ghosh, A Caliskan - Proceedings of the 2023 AAAI/ACM Conference …, 2023 - dl.acm.org
In this multicultural age, language translation is one of the most performed tasks, and it is
becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims …

How Do Users Experience Moderation?: A Systematic Literature Review

R Ma, Y You, X Gui, Y Kou - Proceedings of the ACM on Human …, 2023 - dl.acm.org
Researchers across various fields have investigated how users experience moderation
through different perspectives and methodologies. At present, there is a pressing need of …

The inadequacy of reinforcement learning from human feedback - radicalizing large language models via semantic vulnerabilities

TR McIntosh, T Susnjak, T Liu, P Watters… - … on Cognitive and …, 2024 - ieeexplore.ieee.org
This study is an empirical investigation into the semantic vulnerabilities of four popular pre-
trained commercial Large Language Models (LLMs) to ideological manipulation. Using …

"Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences

D Olszewski, A Lu, C Stillman, K Warren… - Proceedings of the …, 2023 - dl.acm.org
Reproducibility is crucial to the advancement of science; it strengthens confidence in
seemingly contradictory results and expands the boundaries of known discoveries …

DarkBERT: A language model for the dark side of the Internet

Y Jin, E Jang, J Cui, JW Chung, Y Lee… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent research has suggested that there are clear differences in the language used in the
Dark Web compared to that of the Surface Web. As studies on the Dark Web commonly …

Generate, prune, select: A pipeline for counterspeech generation against online hate speech

W Zhu, S Bhat - arXiv preprint arXiv:2106.01625, 2021 - arxiv.org
Countermeasures to effectively fight the ever increasing hate speech online without blocking
freedom of speech is of great social interest. Natural Language Generation (NLG), is …

Cats are fuzzy pets: A corpus and analysis of potentially euphemistic terms

M Gavidia, P Lee, A Feldman, J Peng - arXiv preprint arXiv:2205.02728, 2022 - arxiv.org
Euphemisms have not received much attention in natural language processing, despite
being an important element of polite and figurative language. Euphemisms prove to be a …

Lambretta: learning to rank for Twitter soft moderation

P Paudel, J Blackburn, E De Cristofaro… - … IEEE Symposium on …, 2023 - ieeexplore.ieee.org
To curb the problem of false information, social media platforms like Twitter started adding
warning labels to content discussing debunked narratives, with the goal of providing more …

SoK: Content moderation for end-to-end encryption

S Scheffler, J Mayer - arXiv preprint arXiv:2303.03979, 2023 - arxiv.org
Popular messaging applications now enable end-to-end-encryption (E2EE) by default, and
E2EE data storage is becoming common. These important advances for security and privacy …

Euphemistic phrase detection by masked language model

W Zhu, S Bhat - arXiv preprint arXiv:2109.04666, 2021 - arxiv.org
It is a well-known approach for fringe groups and organizations to use euphemisms--
ordinary-sounding and innocent-looking words with a secret meaning--to conceal what they …