Typology of risks of generative text-to-image models

C Bird, E Ungless, A Kasirzadeh - Proceedings of the 2023 AAAI/ACM …, 2023 - dl.acm.org
This paper investigates the direct risks and harms associated with modern text-to-image
generative models, such as DALL-E and Midjourney, through a comprehensive literature …

Goodtriever: Adaptive toxicity mitigation with retrieval-augmented models

L Pozzobon, B Ermis, P Lewis, S Hooker - arXiv preprint arXiv:2310.07589, 2023 - arxiv.org
Considerable effort has been dedicated to mitigating toxicity, but existing methods often
require drastic modifications to model parameters or the use of computationally intensive …

Introducing v0.5 of the AI Safety Benchmark from MLCommons

B Vidgen, A Agrawal, AM Ahmed, V Akinwande… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the
MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to …

Responsible AI Considerations in Text Summarization Research: A Review of Current Practices

YL Liu, M Cao, SL Blodgett, JCK Cheung… - arXiv preprint arXiv …, 2023 - arxiv.org
AI and NLP publication venues have increasingly encouraged researchers to reflect on
possible ethical considerations, adverse impacts, and other responsible AI issues their work …

Undesirable biases in NLP: Averting a crisis of measurement

O Van der Wal, D Bachmann, A Leidinger… - arXiv preprint arXiv …, 2022 - pure.uva.nl
As Large Language Models and Natural Language Processing (NLP) technology
rapidly develops and spreads into daily life, it becomes crucial to anticipate how its use …

The gaps between pre-train and downstream settings in bias evaluation and debiasing

M Kaneko, D Bollegala, T Baldwin - arXiv preprint arXiv:2401.08511, 2024 - arxiv.org
The output tendencies of Pre-trained Language Models (PLM) vary markedly before and
after Fine-Tuning (FT) due to the updates to the model parameters. These divergences in …

FFT: Towards harmlessness evaluation and analysis for LLMs with factuality, fairness, toxicity

S Cui, Z Zhang, Y Chen, W Zhang, T Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
The widespread use of generative artificial intelligence has heightened concerns about the
potential harms posed by AI-generated texts, primarily stemming from factoid, unfair, and …

Healthy immigrant effect or under-detection? Examining undiagnosed and unrecognized late-life depression for racialized immigrants and nonimmigrants in Canada

S Lin - The Journals of Gerontology: Series B, 2024 - academic.oup.com
Objectives: Immigrants to Canada tend to have a lower incidence of diagnosed
depression than nonimmigrants. One theory suggests that this “healthy immigrant effect …

Mind vs. Mouth: On Measuring Re-judge Inconsistency of Social Bias in Large Language Models

Y Zhao, B Wang, D Zhao, K Huang, Y Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent research indicates that Pre-trained Large Language Models (LLMs) possess
cognitive constructs similar to those observed in humans, prompting researchers to …

“One-Size-Fits-All”? Examining Expectations around What Constitute “Fair” or “Good” NLG System Behaviors

L Lucy, SL Blodgett, M Shokouhi… - Proceedings of the …, 2024 - aclanthology.org
Fairness-related assumptions about what constitute appropriate NLG system behaviors
range from invariance, where systems are expected to behave identically for social groups …