Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various …
L Khachatryan, A Movsisyan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets. In this paper, we introduce a new task, zero-shot text-to …
J Xu, X Liu, Y Wu, Y Tong, Q Li… - Advances in …, 2024 - proceedings.neurips.cc
We present a comprehensive solution to learn and improve text-to-image models from human preference feedback. To begin with, we build ImageReward---the first general …
This survey reviews text-to-image diffusion models in the context that diffusion models have emerged to be popular for a wide range of generative tasks. As a self-contained work, this …
Deep generative models have unlocked another profound realm of human creativity. By capturing and generalizing patterns within data, we have entered the epoch of all …
Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented performance in response generation, especially with visual inputs, enabling more creative …
This paper proposes a unified diffusion framework (dubbed UniDiffuser) to fit all distributions relevant to a set of multi-modal data in one model. Our key insight is–learning diffusion …
Z Tang, Z Yang, C Zhu, M Zeng… - Advances in Neural …, 2024 - proceedings.neurips.cc
Abstract We present Composable Diffusion (CoDi), a novel generative model capable of generating any combination of output modalities, such as language, image, video, or audio …
The significant advances in applications of text-to-image generation models have prompted the demand of a post-hoc adaptation algorithms that can efficiently remove unwanted …