Reinforcement learning for generative AI: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org
Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
This paper presents a comprehensive survey of the taxonomy and evolution of multimodal
foundation models that demonstrate vision and vision-language capabilities, focusing on the …

Understanding reinforcement learning-based fine-tuning of diffusion models: A tutorial and review

M Uehara, Y Zhao, T Biancalani, S Levine - arXiv preprint arXiv …, 2024 - arxiv.org
This tutorial provides a comprehensive survey of methods for fine-tuning diffusion models to
optimize downstream reward functions. While diffusion models are widely known to provide …

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …
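
This paper's approach (denoising diffusion policy optimization) casts the denoising chain as a multi-step decision process and applies policy-gradient updates to the per-step transition log-probabilities. Below is a minimal toy sketch of that update in PyTorch, with a small linear network standing in for the denoiser and a hypothetical reward_fn in place of a real scorer; it illustrates the technique, not the paper's implementation.

import torch

# Toy "denoiser": predicts the mean of the next, less-noisy sample.
class Denoiser(torch.nn.Module):
    def __init__(self, dim=8):
        super().__init__()
        self.net = torch.nn.Linear(dim + 1, dim)  # conditions on the timestep
    def forward(self, x, t):
        t_col = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_col], dim=-1))

def reward_fn(x0):              # hypothetical reward, e.g. an aesthetic scorer
    return -(x0 ** 2).sum(-1)   # toy choice: prefer samples near the origin

model, sigma, T = Denoiser(), 0.1, 10
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(64, 8)                       # start from pure noise
logps = []
for t in reversed(range(T)):                 # sample the denoising trajectory
    dist = torch.distributions.Normal(model(x, t), sigma)
    x = dist.sample()                        # stochastic transition, no grad
    logps.append(dist.log_prob(x).sum(-1))   # log-prob of the step taken

# REINFORCE-style update: reward weights the trajectory's log-probabilities.
adv = reward_fn(x)
adv = (adv - adv.mean()) / (adv.std() + 1e-8)    # normalize as a crude baseline
loss = -(adv * torch.stack(logps).sum(0)).mean()
opt.zero_grad(); loss.backward(); opt.step()

Note that sampling itself stays gradient-free; the reward only enters as a weight on the accumulated per-step log-probabilities, which is why this style of fine-tuning does not require the reward to be differentiable.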

Diffusion model alignment using direct preference optimization

B Wallace, M Dang, R Rafailov… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …
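
The paper adapts Direct Preference Optimization (DPO) to diffusion models: instead of fitting a reward model, it trains directly on (preferred, dispreferred) pairs against a frozen reference model. A sketch of the underlying DPO objective is given below, with placeholder log-likelihood values; in the diffusion setting the paper replaces these likelihood terms with per-step denoising-loss quantities.

import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # logp_*: policy log-likelihoods of the preferred (w) and dispreferred (l)
    # generations; ref_logp_*: the same quantities under a frozen reference.
    # Diffusion-DPO substitutes per-step denoising losses for these terms.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -F.logsigmoid(margin).mean()

# Dummy usage: the preferred sample gained likelihood relative to the
# reference while the dispreferred one lost it, so the loss is small.
print(dpo_loss(torch.tensor([-5.0]), torch.tensor([-7.0]),
               torch.tensor([-5.5]), torch.tensor([-6.5])))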

Directly fine-tuning diffusion models on differentiable rewards

K Clark, P Vicol, K Swersky, DJ Fleet - arXiv preprint arXiv:2309.17400, 2023 - arxiv.org
We present Direct Reward Fine-Tuning (DRaFT), a simple and effective method for fine-
tuning diffusion models to maximize differentiable reward functions, such as scores from …
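
DRaFT's mechanism is direct backpropagation: because the reward is differentiable, its gradient can flow through the sampling chain into the model weights, and the paper's truncated variant (DRaFT-K) backpropagates through only the last K denoising steps. The following is a toy sketch under that truncation, with a hypothetical linear denoiser and reward_fn, for illustration only.

import torch

denoiser = torch.nn.Linear(9, 8)     # toy denoiser: [x, t] -> next mean
def reward_fn(x0):                   # hypothetical differentiable reward
    return -(x0 ** 2).sum(-1)

opt = torch.optim.Adam(denoiser.parameters(), lr=1e-4)
T, K, sigma = 50, 2, 0.05            # keep gradients for the last K steps only

x = torch.randn(32, 8)
for t in reversed(range(T)):
    t_col = torch.full((x.shape[0], 1), float(t))
    mean = denoiser(torch.cat([x, t_col], dim=-1))
    x = mean + sigma * torch.randn_like(mean)  # reparameterized, differentiable
    if t == K:
        x = x.detach()               # truncate the graph: DRaFT-K

loss = -reward_fn(x).mean()          # maximize reward = minimize its negative
opt.zero_grad(); loss.backward(); opt.step()

Truncation keeps memory bounded, and the paper reports that truncated variants are competitive with, or better than, backpropagating through the full chain.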

Using human feedback to fine-tune diffusion models without any reward model

K Yang, J Tao, J Lyu, C Ge, J Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Using reinforcement learning with human feedback (RLHF) has shown significant promise in
fine-tuning diffusion models. Previous methods start by training a reward model that aligns …

From r to Q*: Your Language Model is Secretly a Q-Function

R Rafailov, J Hejna, R Park, C Finn - arXiv preprint arXiv:2404.12358, 2024 - arxiv.org
Reinforcement Learning From Human Feedback (RLHF) has been critical to the success
of the latest generation of generative AI models. In response to the complex nature of the …

TextCraftor: Your text encoder can be image quality controller

Y Li, X Liu, A Kag, J Hu, Y Idelbayev… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion-based text-to-image generative models, e.g., Stable Diffusion, have revolutionized the
field of content generation, enabling significant advancements in areas like image editing …

InstructVideo: instructing video diffusion models with human feedback

H Yuan, S Zhang, X Wang, Y Wei… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models have emerged as the de facto paradigm for video generation. However,
their reliance on web-scale data of varied quality often yields results that are visually …