The moral psychology of Artificial Intelligence

JF Bonnefon, I Rahwan, A Shariff - Annual Review of Psychology, 2024 - annualreviews.org
Moral psychology was shaped around three categories of agents and patients: humans,
other animals, and supernatural beings. Rapid progress in artificial intelligence has …

In conversation with artificial intelligence: aligning language models with human values

A Kasirzadeh, I Gabriel - Philosophy & Technology, 2023 - Springer
Large-scale language technologies are increasingly used in various forms of
communication with humans across different contexts. One particular use case for these …

Personality traits in large language models

M Safdari, G Serapio-García, C Crepy, S Fitz… - arXiv preprint arXiv …, 2023 - arxiv.org
The advent of large language models (LLMs) has revolutionized natural language
processing, enabling the generation of coherent and contextually relevant text. As LLMs …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Collaborating with humans without human data

DJ Strouse, K McKee, M Botvinick… - Advances in …, 2021 - proceedings.neurips.cc
Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …

The risks of using ChatGPT to obtain common safety-related information and advice

O Oviedo-Trespalacios, AE Peden, T Cole-Hunter… - Safety Science, 2023 - Elsevier
ChatGPT is a highly advanced AI language model that has gained widespread popularity. It
is trained to understand and generate human language and is used in various applications …

The benefits, risks and bounds of personalizing the alignment of large language models to individuals

HR Kirk, B Vidgen, P Röttger, SA Hale - Nature Machine Intelligence, 2024 - nature.com
Large language models (LLMs) undergo 'alignment' so that they better reflect human values
or preferences, and are safer or more useful. However, alignment is intrinsically difficult …

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

HR Kirk, B Vidgen, P Röttger, SA Hale - arXiv preprint arXiv:2303.05453, 2023 - arxiv.org
Large language models (LLMs) are used to generate content for a wide range of tasks, and
are set to reach a growing audience in coming years due to integration in product interfaces …

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

MoCa: Measuring human-language model alignment on causal and moral judgment tasks

A Nie, Y Zhang, AS Amdekar, C Piech… - Advances in …, 2024 - proceedings.neurips.cc
Human commonsense understanding of the physical and social world is organized around
intuitive theories. These theories support making causal and moral judgments. When …