Related articles - Academic resource search

Eclipse: A resource-efficient text-to-image prior for image generations

M Patel, C Kim, S Cheng, C Baral… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-to-image (T2I) diffusion models, notably the unCLIP models (e.g., DALL-E-2),
achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks at …

Cited by 7 - Related articles - All 3 versions

[PDF] aaai.org

T2i-adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models

C Mou, X Wang, L Xie, Y Wu, J Zhang, Z Qi… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

The incredible generative ability of large-scale text-to-image (T2I) models has demonstrated
strong power of learning complex structures and meaningful semantics. However, relying …

Cited by 547 - Related articles - All 3 versions

[PDF] thecvf.com

Swiftbrush: One-step text-to-image diffusion model with variational score distillation

TH Nguyen, A Tran - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Despite their ability to generate high-resolution and diverse images from text prompts, text-to-image
diffusion models often suffer from slow iterative sampling processes. Model distillation …

Cited by 9 - Related articles - All 2 versions

[PDF] openreview.net

Are diffusion models vision-and-language reasoners?

B Krojer, E Poole-Dayan, V Voleti, C Pal… - … seventh Conference on …, 2023 - openreview.net

Text-conditioned image generation models have recently shown immense qualitative
success using denoising diffusion processes. However, unlike discriminative vision-and …

Cited by 6 - Related articles - All 7 versions

[PDF] thecvf.com

Initno: Boosting text-to-image diffusion models via initial noise optimization

X Guo, J Liu, M Cui, J Li, H Yang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent strides in the development of diffusion models, exemplified by advancements such as
Stable Diffusion, have underscored their remarkable prowess in generating visually …

Cited by 3 - Related articles - All 3 versions

[PDF] thecvf.com

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

Z Wang, L Wei, T Wang, H Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-to-image (T2I) generative models have recently emerged as a powerful tool,
enabling the creation of photo-realistic images and giving rise to a multitude of applications …

Cited by 1 - Related articles - All 3 versions

[PDF] arxiv.org

COMCAT: towards efficient compression and customization of attention-based vision models

J Xiao, M Yin, Y Gong, X Zang, J Ren… - arXiv preprint arXiv …, 2023 - arxiv.org

Attention-based vision models, such as Vision Transformer (ViT) and its variants, have
shown promising performance in various computer vision tasks. However, these emerging …

Cited by 6 - Related articles - All 5 versions

[PDF] thecvf.com

Unleashing text-to-image diffusion models for visual perception

W Zhao, Y Rao, Z Liu, B Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Diffusion models (DMs) have become the new trend of generative models and have
demonstrated a powerful ability of conditional synthesis. Among those, text-to-image …

Cited by 112 - Related articles - All 5 versions

[PDF] arxiv.org

λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space

M Patel, S Jung, C Baral, Y Yang - arXiv preprint arXiv:2402.05195, 2024 - arxiv.org

Despite the recent advances in personalized text-to-image (P-T2I) generative models,
subject-driven T2I remains challenging. The primary bottlenecks include 1) Intensive training …

Cited by 4 - Related articles - All 4 versions

[PDF] arxiv.org

Cross-attention makes inference cumbersome in text-to-image diffusion models

W Zhang, H Liu, J Xie, F Faccio, MZ Shou… - arXiv preprint arXiv …, 2024 - arxiv.org

This study explores the role of cross-attention during inference in text-conditional diffusion
models. We find that cross-attention outputs converge to a fixed point after few inference …

Cited by 7 - Related articles - All 2 versions
