Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Diffusion models in vision: A survey

FA Croitoru, V Hondru, RT Ionescu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Denoising diffusion models represent a recent emerging topic in computer vision,
demonstrating remarkable results in the area of generative modeling. A diffusion model is a …

Adding conditional control to text-to-image diffusion models

L Zhang, A Rao, M Agrawala - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We present ControlNet, a neural network architecture to add spatial conditioning controls to
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …

DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation

N Ruiz, Y Li, V Jampani, Y Pritch… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling
high-quality and diverse synthesis of images from a given text prompt. However, these models …

InstructPix2Pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

Imagic: Text-based real image editing with diffusion models

B Kawar, S Zada, O Lang, O Tov… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-conditioned image editing has recently attracted considerable interest. However, most
methods are currently limited to one of the following: specific editing types (e.g., object …

Imagen Video: High definition video generation with diffusion models

J Ho, W Chan, C Saharia, J Whang, R Gao… - arXiv preprint arXiv …, 2022 - arxiv.org
We present Imagen Video, a text-conditional video generation system based on a cascade
of video diffusion models. Given a text prompt, Imagen Video generates high definition …

An image is worth one word: Personalizing text-to-image generation using textual inversion

R Gal, Y Alaluf, Y Atzmon, O Patashnik… - arXiv preprint arXiv …, 2022 - arxiv.org
Text-to-image models offer unprecedented freedom to guide creation through natural
language. Yet, it is unclear how such freedom can be exercised to generate images of …

Scalable diffusion models with transformers

W Peebles, S Xie - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
We explore a new class of diffusion models based on the transformer architecture. We train
latent diffusion models of images, replacing the commonly-used U-Net backbone with a …

Prompt-to-prompt image editing with cross attention control

A Hertz, R Mokady, J Tenenbaum, K Aberman… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent large-scale text-driven synthesis models have attracted much attention thanks to
their remarkable capabilities of generating highly diverse images that follow given text …