Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org

Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Cited by 961 Related articles All 6 versions

[PDF] arxiv.org

Diffusion models in vision: A survey

FA Croitoru, V Hondru, RT Ionescu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Denoising diffusion models represent a recent emerging topic in computer vision,
demonstrating remarkable results in the area of generative modeling. A diffusion model is a …

Cited by 793 Related articles All 7 versions

[PDF] thecvf.com

Adding conditional control to text-to-image diffusion models

L Zhang, A Rao, M Agrawala - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

We present ControlNet, a neural network architecture to add spatial conditioning controls to
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …

Cited by 2073 Related articles All 6 versions

[PDF] neurips.cc

Visual instruction tuning

H Liu, C Li, Q Wu, YJ Lee - Advances in neural information …, 2024 - proceedings.neurips.cc

Instruction tuning large language models (LLMs) using machine-generated instruction-
following data has been shown to improve zero-shot capabilities on new tasks, but the idea …

Cited by 2457 Related articles All 15 versions

[PDF] thecvf.com

Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation

N Ruiz, Y Li, V Jampani, Y Pritch… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-
quality and diverse synthesis of images from a given text prompt. However, these models …

Cited by 1658 Related articles All 6 versions

[PDF] thecvf.com

Instructpix2pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

Cited by 990 Related articles All 7 versions

[PDF] thecvf.com

Magic3d: High-resolution text-to-3d content creation

CH Lin, J Gao, L Tang, T Takikawa… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, DreamFusion demonstrated the utility of a pretrained text-to-image diffusion model
to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis …

Cited by 772 Related articles All 6 versions

[PDF] thecvf.com

Scalable diffusion models with transformers

W Peebles, S Xie - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com

We explore a new class of diffusion models based on the transformer architecture. We train
latent diffusion models of images, replacing the commonly-used U-Net backbone with a …

Cited by 668 Related articles All 6 versions

[PDF] thecvf.com

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Cited by 534 Related articles All 6 versions

[PDF] thecvf.com

Imagic: Text-based real image editing with diffusion models

B Kawar, S Zada, O Lang, O Tov… - Proceedings of the …, 2023 - openaccess.thecvf.com

Text-conditioned image editing has recently attracted considerable interest. However, most
methods are currently limited to one of the following: specific editing types (e.g., object …

Cited by 722 Related articles All 6 versions
