- Academic resource search

Diffusion models in vision: A survey

FA Croitoru, V Hondru, RT Ionescu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Denoising diffusion models represent a recent emerging topic in computer vision,
demonstrating remarkable results in the area of generative modeling. A diffusion model is a …

Cited by 1244 · Related articles · All 7 versions

[PDF] arxiv.org

A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org

As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

Cited by 201 · Related articles · All 4 versions

Attend-and-excite: Attention-based semantic guidance for text-to-image diffusion models

H Chefer, Y Alaluf, Y Vinker, L Wolf… - ACM Transactions on …, 2023 - dl.acm.org

Recent text-to-image generative models have demonstrated an unparalleled ability to
generate diverse and creative imagery guided by a target text prompt. While revolutionary …

Cited by 429 · Related articles · All 4 versions

[PDF] thecvf.com

Plug-and-play diffusion features for text-driven image-to-image translation

N Tumanyan, M Geyer, S Bagon… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale text-to-image generative models have been a revolutionary breakthrough in the
evolution of generative AI, synthesizing diverse images with highly complex visual concepts …

Cited by 614 · Related articles · All 6 versions

[PDF] thecvf.com

MasaCtrl: Tuning-free mutual self-attention control for consistent image synthesis and editing

M Cao, X Wang, Z Qi, Y Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite the success in large-scale text-to-image generation and text-conditioned image
editing, existing methods still struggle to produce consistent generation and editing results …

Cited by 344 · Related articles · All 5 versions

[PDF] thecvf.com

FateZero: Fusing attentions for zero-shot text-based video editing

C Qi, X Cun, Y Zhang, C Lei, X Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The diffusion-based generative models have achieved remarkable success in text-based
image generation. However, since it contains enormous randomness in generation …

Cited by 295 · Related articles · All 6 versions

eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org

Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned
high-resolution image synthesis. Starting from random noise, such text-to-image …

Cited by 709 · Related articles · All 2 versions

[PDF] arxiv.org

NExT-GPT: Any-to-any multimodal LLM

S Wu, H Fei, L Qu, W Ji, TS Chua - arXiv preprint arXiv:2309.05519, 2023 - arxiv.org

While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …

Cited by 446 · Related articles · All 4 versions

[PDF] thecvf.com

Null-text inversion for editing real images using guided diffusion models

R Mokady, A Hertz, K Aberman… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent large-scale text-guided diffusion models provide powerful image generation
capabilities. Currently, a massive effort is given to enable the modification of these images …

Cited by 761 · Related articles · All 5 versions

[PDF] thecvf.com

Latent-NeRF for shape-guided generation of 3D shapes and textures

G Metzer, E Richardson, O Patashnik… - Proceedings of the …, 2023 - openaccess.thecvf.com

Text-guided image generation has progressed rapidly in recent years, inspiring major
breakthroughs in text-guided shape generation. Recently, it has been shown that using …

Cited by 416 · Related articles · All 5 versions
