Related articles - Academic resource search

Training language models to follow instructions with human feedback

L Ouyang, J Wu, X Jiang, D Almeida… - Advances in neural …, 2022 - proceedings.neurips.cc

Making language models bigger does not inherently make them better at following a user's
intent. For example, large language models can generate outputs that are untruthful, toxic, or …

Cited by 7318 - Related articles - All 18 versions

[PDF] arxiv.org

Chain of hindsight aligns language models with feedback

H Liu, C Sferrazza, P Abbeel - arXiv preprint arXiv:2302.02676, 2023 - arxiv.org

Learning from human preferences is important for language models to match human needs
and to align with human and social values. Prior works have achieved remarkable …

Cited by 88 - Related articles - All 3 versions

[PDF] arxiv.org

Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?

W Shi, X Han, H Gonen, A Holtzman, Y Tsvetkov… - arXiv preprint arXiv …, 2022 - arxiv.org

Large language models can perform new tasks in a zero-shot fashion, given natural
language prompts that specify the desired behavior. Such prompts are typically hand …

Cited by 27 - Related articles - All 5 versions

[PDF] arxiv.org

Black-box prompt optimization: Aligning large language models without model training

J Cheng, X Liu, K Zheng, P Ke, H Wang, Y Dong… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have shown impressive success in various applications.
However, these models are often not well aligned with human intents, which calls for …

Cited by 18 - Related articles - All 2 versions

[PDF] arxiv.org

Activation addition: Steering language models without optimization

A Turner, L Thiergart, D Udell, G Leech, U Mini… - arXiv preprint arXiv …, 2023 - arxiv.org

Reliably controlling the behavior of large language models (LLMs) is a pressing open
problem. Existing methods include supervised finetuning, reinforcement learning from …

Cited by 52 - Related articles - All 3 versions

[PDF] arxiv.org

Eliciting human preferences with language models

BZ Li, A Tamkin, N Goodman, J Andreas - arXiv preprint arXiv:2310.11589, 2023 - arxiv.org

Language models (LMs) can be directed to perform target tasks by using labeled examples
or natural language prompts. But selecting examples or writing prompts for can be …

Cited by 20 - Related articles - All 3 versions

[PDF] acm.org

ConstitutionMaker: Interactively critiquing large language models by converting feedback into principles

S Petridis, BD Wedin, J Wexler, M Pushkarna… - Proceedings of the 29th …, 2024 - dl.acm.org

Large language model (LLM) prompting is a promising new approach for users to create
and customize their own chatbots. However, current methods for steering a chatbot's …

Cited by 12 - Related articles - All 3 versions

[PDF] arxiv.org

STaR-GATE: Teaching language models to ask clarifying questions

C Andukuri, JP Fränken, T Gerstenberg… - arXiv preprint arXiv …, 2024 - arxiv.org

When prompting language models to complete a task, users often leave important aspects
unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023) …

Cited by 10 - Related articles - All 3 versions

[PDF] mlr.press

Pretraining language models with human preferences

T Korbak, K Shi, A Chen, RV Bhalerao… - International …, 2023 - proceedings.mlr.press

Language models (LMs) are pretrained to imitate text from large and diverse
datasets that contain content that would violate human preferences if generated by an LM …

Cited by 126 - Related articles - All 9 versions

[PDF] arxiv.org

Large language models are human-level prompt engineers

Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis… - arXiv preprint arXiv …, 2022 - arxiv.org

By conditioning on natural language instructions, large language models (LLMs) have
displayed impressive capabilities as general-purpose computers. However, task …

Cited by 570 - Related articles - All 7 versions
