Value kaleidoscope: Engaging AI with pluralistic human values, rights, and duties

T Sorensen, L Jiang, JD Hwang, S Levine… - Proceedings of the …, 2024 - ojs.aaai.org
Human values are crucial to human decision-making. Value pluralism is the view that
multiple correct values may be held in tension with one another (e.g., when considering …

Harmful speech detection by language models exhibits gender-queer dialect bias

R Dorn, L Kezar, F Morstatter, K Lerman - … of the 4th ACM Conference on …, 2024 - dl.acm.org
Trigger Warning: Profane Language, Slurs. Content moderation on social media platforms
shapes the dynamics of online discourse, influencing whose voices are amplified and …

From dogwhistles to bullhorns: Unveiling coded rhetoric with language models

J Mendelsohn, RL Bras, Y Choi, M Sap - arXiv preprint arXiv:2305.17174, 2023 - arxiv.org
Dogwhistles are coded expressions that simultaneously convey one meaning to a broad
audience and a second one, often hateful or provocative, to a narrow in-group; they are …

"Fifty Shades of Bias": Normative Ratings of Gender Bias in GPT Generated English Text

R Hada, A Seth, H Diddee, K Bali - arXiv preprint arXiv:2310.17428, 2023 - arxiv.org
Language serves as a powerful tool for the manifestation of societal belief systems. In doing
so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most …

Leveraging machine-generated rationales to facilitate social meaning detection in conversations

R Dutt, Z Wu, K Shi, D Sheth, P Gupta… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a generalizable classification approach that leverages Large Language Models
(LLMs) to facilitate the detection of implicitly encoded social meaning in conversations. We …

CoBias: Contextual reliability in bias assessment

P Govil, H Jain, VK Bonagiri, A Chadha… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) often inherit biases from the web data they are trained on,
which contains stereotypes and prejudices. Current methods for evaluating and mitigating …

Social intelligence data infrastructure: Structuring the present and navigating the future

M Li, W Shi, C Ziems, D Yang - arXiv preprint arXiv:2403.14659, 2024 - arxiv.org
As Natural Language Processing (NLP) systems become increasingly integrated into human
social life, these technologies will need to increasingly rely on social intelligence. Although …

Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting

A Yerukola, X Zhou, E Clark, M Sap - arXiv preprint arXiv:2305.14755, 2023 - arxiv.org
Most existing stylistic text rewriting methods and evaluation metrics operate on a sentence
level, but ignoring the broader context of the text can lead to preferring generic, ambiguous …

BiasX: "thinking slow" in toxic content moderation with explanations of implied social biases

Y Zhang, S Nanduri, L Jiang, T Wu, M Sap - arXiv preprint arXiv …, 2023 - arxiv.org
Toxicity annotators and content moderators often default to mental shortcuts when making
decisions. This can lead to subtle toxicity being missed, and seemingly toxic but harmless …

Polarized Opinion Detection Improves the Detection of Toxic Language

J Pavlopoulos, A Likas - Proceedings of the 18th Conference of …, 2024 - aclanthology.org
Distance from unimodality (DFU) has been found to correlate well with human judgment for
the assessment of polarized opinions. However, its un-normalized nature makes it less …