PERSONA: A Reproducible Testbed for Pluralistic Alignment

L Castricato, N Lile, R Rafailov, JP Fränken… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of language models (LMs) necessitates robust alignment with
diverse user values. However, current preference optimization approaches often fail to …

AdaptAgent: Adapting multimodal web agents with few-shot learning from human demonstrations

G Verma, R Kaur, N Srishankar, Z Zeng, T Balch… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art multimodal web agents, powered by Multimodal Large Language Models
(MLLMs), can autonomously execute many web tasks by processing user instructions and …

ChainBuddy: An AI Agent System for Generating LLM Pipelines

J Zhang, I Arawjo - arXiv preprint arXiv:2409.13588, 2024 - arxiv.org
As large language models (LLMs) advance, their potential applications have grown
significantly. However, it remains difficult to evaluate LLM behavior on user-specific tasks …

Open-domain implicit format control for large language model generation

Y Yao, W Ma, X Fang, X Jiang, X Li, X Meng… - arXiv preprint arXiv …, 2024 - arxiv.org
Controlling the format of outputs generated by large language models (LLMs) is a critical
functionality in various applications. Current methods typically employ constrained decoding …

Aligning LLMs with Domain Invariant Reward Models

D Wu, S Choudhury - arXiv preprint arXiv:2501.00911, 2025 - arxiv.org
Aligning large language models (LLMs) to human preferences is challenging in domains
where preference data is unavailable. We address the problem of learning reward models …

SePPO: Semi-Policy Preference Optimization for Diffusion Alignment

D Zhang, G Lan, DJ Han, W Yao, X Pan… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning from human feedback (RLHF) methods are emerging as a way to
fine-tune diffusion models (DMs) for visual generation. However, commonly used on-policy …

No Preference Left Behind: Group Distributional Preference Optimization

B Yao, Z Cai, YS Chuang, S Yang, M Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Preferences within a group of people are not uniform but follow a distribution. While existing
alignment methods like Direct Preference Optimization (DPO) attempt to steer models to …

Improving LLM generation with inverse and forward alignment: Reward modeling, prompting, fine-tuning, and inference-time optimization

H Sun, T Pouplin, N Astorga, T Liu… - The First Workshop on … - openreview.net
Large Language Models (LLMs) are often characterized as samplers or generators in the
literature, yet maximizing their capabilities in these roles is a complex challenge. Previous …

Granting Non-AI Experts Creative Control Over AI Systems

MS Lam - Adjunct Proceedings of the 37th Annual ACM …, 2024 - dl.acm.org
Many harmful behaviors and problematic deployments of AI stem from the fact that AI experts
are not experts in the vast array of settings where AI is applied. Non-AI experts from these …

Superficial Alignment, Subtle Divergence, and Nudge Sensitivity in LLM Decision-Making

M Cherep, N Singh, P Maes - NeurIPS 2024 Workshop on Behavioral … - openreview.net
LLMs are being set loose in complex, real-world environments involving sequential decision-
making and tool use. Often, this involves making choices on behalf of human users. Not …