The moral psychology of Artificial Intelligence

JF Bonnefon, I Rahwan, A Shariff - Annual Review of Psychology, 2024 - annualreviews.org
Moral psychology was shaped around three categories of agents and patients: humans,
other animals, and supernatural beings. Rapid progress in artificial intelligence has …

In conversation with artificial intelligence: aligning language models with human values

A Kasirzadeh, I Gabriel - Philosophy & Technology, 2023 - Springer
Large-scale language technologies are increasingly used in various forms of
communication with humans across different contexts. One particular use case for these …

Personality traits in large language models

M Safdari, G Serapio-García, C Crepy, S Fitz… - arXiv preprint arXiv …, 2023 - arxiv.org
The advent of large language models (LLMs) has revolutionized natural language
processing, enabling the generation of coherent and contextually relevant text. As LLMs …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Collaborating with humans without human data

DJ Strouse, K McKee, M Botvinick… - Advances in …, 2021 - proceedings.neurips.cc
Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …

The risks of using ChatGPT to obtain common safety-related information and advice

O Oviedo-Trespalacios, AE Peden, T Cole-Hunter… - Safety Science, 2023 - Elsevier
ChatGPT is a highly advanced AI language model that has gained widespread popularity. It
is trained to understand and generate human language and is used in various applications …

The benefits, risks and bounds of personalizing the alignment of large language models to individuals

HR Kirk, B Vidgen, P Röttger, SA Hale - Nature Machine Intelligence, 2024 - nature.com
Large language models (LLMs) undergo 'alignment' so that they better reflect human values
or preferences, and are safer or more useful. However, alignment is intrinsically difficult …

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

HR Kirk, B Vidgen, P Röttger, SA Hale - arXiv preprint arXiv:2303.05453, 2023 - arxiv.org
Large language models (LLMs) are used to generate content for a wide range of tasks, and
are set to reach a growing audience in coming years due to integration in product interfaces …

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

MoCa: Measuring human-language model alignment on causal and moral judgment tasks

A Nie, Y Zhang, AS Amdekar, C Piech… - Advances in …, 2024 - proceedings.neurips.cc
Human commonsense understanding of the physical and social world is organized around
intuitive theories. These theories support making causal and moral judgments. When …