Auditing large language models: a three-layered approach

B Meskó, EJ Topol - NPJ digital medicine, 2023 - nature.com

The rapid advancements in artificial intelligence (AI) have led to the development of
sophisticated large language models (LLMs) such as GPT-4 and Bard. The potential …

Cited by 271 · Related articles · All 7 versions

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Cited by 217 · Related articles · All 3 versions

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Cited by 244 · Related articles · All 6 versions

AI transparency in the age of LLMs: A human-centered research roadmap

QV Liao, JW Vaughan - arXiv preprint arXiv:2306.01941, 2023 - assets.pubpub.org

The rise of powerful large language models (LLMs) brings about tremendous opportunities
for innovation but also looming risks for individuals and society at large. We have reached a …

Cited by 79 · Related articles · All 6 versions

Personality traits in large language models

M Safdari, G Serapio-García, C Crepy, S Fitz… - arXiv preprint arXiv …, 2023 - arxiv.org

The advent of large language models (LLMs) has revolutionized natural language
processing, enabling the generation of coherent and contextually relevant text. As LLMs …

Cited by 92 · Related articles · All 5 versions

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Cited by 94 · Related articles · All 3 versions

StarCoder 2 and The Stack v2: The next generation

A Lozhkov, R Li, LB Allal, F Cassano… - arXiv preprint arXiv …, 2024 - arxiv.org

The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …

Cited by 53 · Related articles · All 2 versions

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

HR Kirk, B Vidgen, P Röttger, SA Hale - arXiv preprint arXiv:2303.05453, 2023 - arxiv.org

Large language models (LLMs) are used to generate content for a wide range of tasks, and
are set to reach a growing audience in coming years due to integration in product interfaces …

Cited by 64 · Related articles · All 2 versions

Managing AI risks in an era of rapid progress

Y Bengio, G Hinton, A Yao, D Song, P Abbeel… - arXiv preprint arXiv …, 2023 - arxiv.org

In this short consensus paper, we outline risks from upcoming, advanced AI systems. We
examine large-scale social harms and malicious uses, as well as an irreversible loss of …

Cited by 51 · Related articles · All 14 versions

Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods

T Hagendorff - arXiv preprint arXiv:2303.13988, 2023 - arxiv.org

Large language models (LLMs) are currently at the forefront of intertwining AI systems with
human communication and everyday life. Due to rapid technological advances and their …

Cited by 65 · Related articles · All 3 versions
