Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Z Jia, Y Nan, H Zhao, G Liu. arXiv preprint arXiv:2411.15247, 2024.
Recent research has shown that fine-tuning diffusion models (DMs) with arbitrary rewards,
including non-differentiable ones, is feasible with reinforcement learning (RL) techniques …