SeeGULL: A stereotype benchmark with broad geo-cultural coverage leveraging generative models

MF Adilazuarda, S Mukherjee, P Lavania… - arXiv preprint arXiv …, 2024 - arxiv.org

We present a survey of more than 90 recent papers that aim to study cultural representation
and inclusion in large language models (LLMs). We observe that none of the studies …

Cited by 30 Related articles

[PDF] arxiv.org

CVQA: Culturally-diverse multilingual visual question answering benchmark

D Romero, C Lyu, HA Wibowo, T Lynn, I Hamed… - arXiv preprint arXiv …, 2024 - arxiv.org

Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used
to test the ability of vision-language models to understand and reason on knowledge …

Cited by 17 Related articles All 2 versions

[PDF] neurips.cc

Building socio-culturally inclusive stereotype resources with community engagement

S Dev, J Goyal, D Tewari, S Dave… - Advances in Neural …, 2024 - proceedings.neurips.cc

With rapid development and deployment of generative language models in global settings,
there is an urgent need to also scale our measurements of harm, not just in the number and …

Cited by 21 Related articles All 5 versions

[PDF] aclanthology.org

ViSAGe: A global-scale analysis of visual stereotypes in text-to-image generation

A Jha, V Prabhakaran, R Denton, S Laszlo… - Proceedings of the …, 2024 - aclanthology.org

Recent studies have shown that Text-to-Image (T2I) model generations can reflect social
stereotypes present in the real world. However, existing approaches for evaluating …

Cited by 6 Related articles All 2 versions

[PDF] aaai.org

SocialStigmaQA: A benchmark to uncover stigma amplification in generative language models

M Nagireddy, L Chiazor, M Singh… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Current datasets for unwanted social bias auditing are limited to studying protected
demographic features such as race and gender. In this work, we introduce a comprehensive …

Cited by 14 Related articles All 5 versions

[HTML] maartensap.com

[HTML] NormAd: A benchmark for measuring the cultural adaptability of large language models

A Rao, A Yerukola, V Shah, K Reinecke… - arXiv preprint arXiv …, 2024 - maartensap.com


Cited by 25 Related articles All 3 versions

[PDF] aaai.org

How are LLMs mitigating stereotyping harms? Learning from search engine studies

A Leidinger, R Rogers - Proceedings of the AAAI/ACM Conference on AI …, 2024 - ojs.aaai.org

With the widespread availability of LLMs since the release of ChatGPT and increased public
scrutiny, commercial model development appears to have focused their efforts …

Cited by 3 Related articles All 7 versions

[PDF] tandfonline.com

GPT, large language models (LLMs) and generative artificial intelligence (GAI) models in geospatial science: a systematic review

S Wang, T Hu, H Xiao, Y Li, C Zhang… - … Journal of Digital …, 2024 - Taylor & Francis

The launch of large language models (LLMs) like ChatGPT in late 2022 and the anticipated
arrival of future GPT-x iterations have marked the beginning of the generative artificial …

Cited by 13 Related articles All 5 versions

[PDF] unibocconi.it

[PDF] Metrics for what, metrics for whom: assessing actionability of bias evaluation metrics in NLP

P Delobelle, G Attanasio, D Nozza… - Proceedings of the …, 2024 - iris.unibocconi.it

This paper introduces the concept of actionability in the context of bias measures in natural
language processing (NLP). We define actionability as the degree to which a …

Survey of cultural awareness in language models: Text and beyond

S Pawar, J Park, J Jin, A Arora, J Myung… - arXiv preprint arXiv …, 2024 - arxiv.org

Large-scale deployment of large language models (LLMs) in various applications, such as
chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure …

Cited by 3 Related articles All 4 versions
