Cobra frames: Contextual reasoning about effects and harms of offensive statements

T Sorensen, L Jiang, JD Hwang, S Levine… - Proceedings of the …, 2024 - ojs.aaai.org

Human values are crucial to human decision-making. Value pluralism is the view that
multiple correct values may be held in tension with one another (e.g., when considering …

Cited by 50 · Related articles · All 3 versions

[PDF] acm.org

Harmful speech detection by language models exhibits gender-queer dialect bias

R Dorn, L Kezar, F Morstatter, K Lerman - … of the 4th ACM Conference on …, 2024 - dl.acm.org

Trigger Warning: Profane Language, Slurs. Content moderation on social media platforms
shapes the dynamics of online discourse, influencing whose voices are amplified and …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

From dogwhistles to bullhorns: Unveiling coded rhetoric with language models

J Mendelsohn, RL Bras, Y Choi, M Sap - arXiv preprint arXiv:2305.17174, 2023 - arxiv.org

Dogwhistles are coded expressions that simultaneously convey one meaning to a broad
audience and a second one, often hateful or provocative, to a narrow in-group; they are …

Cited by 21 · Related articles · All 6 versions

[PDF] arxiv.org

"Fifty Shades of Bias": Normative Ratings of Gender Bias in GPT Generated English Text

R Hada, A Seth, H Diddee, K Bali - arXiv preprint arXiv:2310.17428, 2023 - arxiv.org

Language serves as a powerful tool for the manifestation of societal belief systems. In doing
so, it also perpetuates the prevalent biases in our society. Gender bias is one of the most …

Cited by 11 · Related articles · All 4 versions

[PDF] arxiv.org

Leveraging machine-generated rationales to facilitate social meaning detection in conversations

R Dutt, Z Wu, K Shi, D Sheth, P Gupta… - arXiv preprint arXiv …, 2024 - arxiv.org

We present a generalizable classification approach that leverages Large Language Models
(LLMs) to facilitate the detection of implicitly encoded social meaning in conversations. We …

Cited by 3 · Related articles · All 4 versions

[PDF] arxiv.org

Cobias: Contextual reliability in bias assessment

P Govil, H Jain, VK Bonagiri, A Chadha… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) often inherit biases from the web data they are trained on,
which contains stereotypes and prejudices. Current methods for evaluating and mitigating …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

Social intelligence data infrastructure: Structuring the present and navigating the future

M Li, W Shi, C Ziems, D Yang - arXiv preprint arXiv:2403.14659, 2024 - arxiv.org

As Natural Language Processing (NLP) systems become increasingly integrated into human
social life, these technologies will need to increasingly rely on social intelligence. Although …

Cited by 4 · Related articles · All 2 versions

[PDF] arxiv.org

Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting

A Yerukola, X Zhou, E Clark, M Sap - arXiv preprint arXiv:2305.14755, 2023 - arxiv.org

Most existing stylistic text rewriting methods and evaluation metrics operate on a sentence
level, but ignoring the broader context of the text can lead to preferring generic, ambiguous …

Cited by 5 · Related articles · All 3 versions

[PDF] arxiv.org

Biasx: "thinking slow" in toxic content moderation with explanations of implied social biases

Y Zhang, S Nanduri, L Jiang, T Wu, M Sap - arXiv preprint arXiv …, 2023 - arxiv.org

Toxicity annotators and content moderators often default to mental shortcuts when making
decisions. This can lead to subtle toxicity being missed, and seemingly toxic but harmless …

Cited by 6 · Related articles · All 8 versions

[PDF] aclanthology.org

Polarized Opinion Detection Improves the Detection of Toxic Language

J Pavlopoulos, A Likas - Proceedings of the 18th Conference of …, 2024 - aclanthology.org

Distance from unimodality (DFU) has been found to correlate well with human judgment for
the assessment of polarized opinions. However, its un-normalized nature makes it less …

Cited by 1 · Related articles
