Fine-tuning language models to find agreement among humans with diverse preferences

M Bakker, M Chadwick, H Sheahan… - Advances in …, 2022 - proceedings.neurips.cc
Recent work on large language models (LLMs) has used fine-tuning to align outputs with
the preferences of a prototypical user. This work assumes that human preferences are static …

Peering through preferences: Unraveling feedback acquisition for aligning large language models

H Bansal, J Dang, A Grover - arXiv preprint arXiv:2308.15812, 2023 - arxiv.org
Aligning large language models (LLMs) with human values and intents critically involves the
use of human or AI feedback. While dense feedback annotations are expensive to acquire …

Aligning large language models with human preferences through representation engineering

W Liu, X Wang, M Wu, T Li, C Lv, Z Ling, J Zhu… - arXiv preprint arXiv …, 2023 - arxiv.org
Aligning large language models (LLMs) with human preferences is crucial for enhancing
their utility in terms of helpfulness, truthfulness, safety, harmlessness, and interestingness …

Chatbot arena: An open platform for evaluating LLMs by human preference

WL Chiang, L Zheng, Y Sheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have unlocked new capabilities and applications; however,
evaluating the alignment with human preferences still poses significant challenges. To …

Aligning language models to user opinions

EJ Hwang, BP Majumder, N Tandon - arXiv preprint arXiv:2305.14929, 2023 - arxiv.org
An important aspect of developing LLMs that interact with humans is to align models'
behavior to their users. It is possible to prompt an LLM into behaving as a certain persona …

Whose opinions do language models reflect?

S Santurkar, E Durmus, F Ladhak… - International …, 2023 - proceedings.mlr.press
Language models (LMs) are increasingly being used in open-ended contexts,
where the opinions they reflect in response to subjective queries can have a profound …

Eliciting human preferences with language models

BZ Li, A Tamkin, N Goodman, J Andreas - arXiv preprint arXiv:2310.11589, 2023 - arxiv.org
Language models (LMs) can be directed to perform target tasks by using labeled examples
or natural language prompts. But selecting examples or writing prompts can be …

Aligning large language models with human: A survey

Y Wang, W Zhong, L Li, F Mi, X Zeng, W Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …

Aligning language models with human preferences via a Bayesian approach

J Wang, H Wang, S Sun, W Li - Advances in Neural …, 2024 - proceedings.neurips.cc
In the quest to advance human-centric natural language generation (NLG) systems,
ensuring alignment between NLG models and human preferences is crucial. For this …

RAIN: Your language models can align themselves without finetuning

Y Li, F Wei, J Zhao, C Zhang, H Zhang - arXiv preprint arXiv:2309.07124, 2023 - arxiv.org
Large language models (LLMs) often demonstrate inconsistencies with human preferences.
Previous research gathered human preference data and then aligned the pre-trained …