Quantifying the persona effect in LLM simulations

T Hu, N Collier - arXiv preprint arXiv:2402.10811, 2024 - arxiv.org
Large language models (LLMs) have shown remarkable promise in simulating human
language use and behavior. In this study, we delve into the intersection of persona variables …

Mapping social choice theory to RLHF

J Dai, E Fleisig - arXiv preprint arXiv:2404.13038, 2024 - arxiv.org
Recent work on the limitations of using reinforcement learning from human feedback (RLHF)
to incorporate human preferences into model behavior often raises social choice theory as a …

Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models

EL Ungless, N Vitsakis, Z Talat, J Garforth… - arXiv preprint arXiv …, 2024 - arxiv.org
This whitepaper offers an overview of the ethical considerations surrounding research into
or with large language models (LLMs). As LLMs become more integrated into widely used …

Efficacy of language model self-play in non-zero-sum games

A Liao, N Tomlin, D Klein - arXiv preprint arXiv:2406.18872, 2024 - arxiv.org
Game-playing agents like AlphaGo have achieved superhuman performance through self-play, which is theoretically guaranteed to yield optimal policies in competitive games …

Better Synthetic Data by Retrieving and Transforming Existing Datasets

S Gandhi, R Gala, V Viswanathan, T Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite recent advances in large language models, building dependable and deployable
NLP models typically requires abundant, high-quality training data. However, task-specific …

“Get Their Hands Dirty, Not Mine”: On Researcher-Annotator Collaboration and the Agency of Annotators

S Zhu, J Rzeszotarski - Findings of the Association for …, 2024 - aclanthology.org
Annotation quality is often framed as post-hoc cleanup of annotator-caused issues. This
position paper discusses whether, how, and why this narrative limits the scope of improving …

“One-Size-Fits-All”? Examining Expectations around What Constitute “Fair” or “Good” NLG System Behaviors

L Lucy, SL Blodgett, M Shokouhi… - Proceedings of the …, 2024 - aclanthology.org
Fairness-related assumptions about what constitute appropriate NLG system behaviors
range from invariance, where systems are expected to behave identically for social groups …

The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels

E Fleisig, SL Blodgett, D Klein, Z Talat - arXiv preprint arXiv:2405.05860, 2024 - arxiv.org
Longstanding data labeling practices in machine learning involve collecting and
aggregating labels from multiple annotators. But what should we do when annotators …

Re-examining Sexism and Misogyny Classification with Annotator Attitudes

A Jiang, N Vitsakis, T Dinkar, G Abercrombie… - arXiv preprint arXiv …, 2024 - arxiv.org
Gender-Based Violence (GBV) is an increasing problem online, but existing datasets fail to
capture the plurality of possible annotator perspectives or ensure the representation of …

Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions

L Mathur, PP Liang, LP Morency - arXiv preprint arXiv:2404.11023, 2024 - arxiv.org
Building socially-intelligent AI agents (Social-AI) is a multidisciplinary, multimodal research
goal that involves creating agents that can sense, perceive, reason about, learn from, and …