TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Building socio-culturally inclusive stereotype resources with community engagement

S Dev, J Goyal, D Tewari, S Dave… - Advances in Neural …, 2024 - proceedings.neurips.cc
With rapid development and deployment of generative language models in global settings,
there is an urgent need to also scale our measurements of harm, not just in the number and …

SeeGULL: A stereotype benchmark with broad geo-cultural coverage leveraging generative models

A Jha, A Davani, CK Reddy, S Dave… - arXiv preprint arXiv …, 2023 - arxiv.org
Stereotype benchmark datasets are crucial to detect and mitigate social stereotypes about
groups of people in NLP models. However, existing datasets are limited in size and …

Weakly supervised detection of hallucinations in LLM activations

M Rateike, C Cintas, J Wamburu, T Akumu… - arXiv preprint arXiv …, 2023 - arxiv.org
We propose an auditing method to identify whether a large language model (LLM) encodes
patterns such as hallucinations in its internal states, which may propagate to downstream …

SeeGULL Multilingual: A dataset of geo-culturally situated stereotypes

M Bhutani, K Robinson, V Prabhakaran, S Dave… - arXiv preprint arXiv …, 2024 - arxiv.org
While generative multilingual models are rapidly being deployed, their safety and fairness
evaluations are largely limited to resources collected in English. This is especially …

Editable fairness: Fine-grained bias mitigation in language models

R Chen, Y Li, J Yang, JT Zhou, Z Liu - arXiv preprint arXiv:2408.11843, 2024 - arxiv.org
Generating fair and accurate predictions plays a pivotal role in deploying large language
models (LLMs) in the real world. However, existing debiasing methods inevitably generate …

Exploring Stereotypes and Biases in Language Technologies in Latin America

H Maina, LA Alemany, G Ivetta, M Rajngewerc… - Communications of the …, 2024 - dl.acm.org
Language technologies are becoming more pervasive in our everyday lives, and they are
also being applied in critical domains involving health, justice, and education. Given the …

A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians

P Mirowski, J Love, K Mathewson… - The 2024 ACM …, 2024 - dl.acm.org
We interviewed twenty professional comedians who perform live shows in front of audiences
and who use artificial intelligence in their artistic process as part of 3-hour workshops on “AI …

A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians

PW Mirowski, J Love, KW Mathewson… - arXiv preprint arXiv …, 2024 - arxiv.org
We interviewed twenty professional comedians who perform live shows in front of audiences
and who use artificial intelligence in their artistic process as part of 3-hour workshops on "AI …