Unsupervised compositional concepts discovery with text-to-image generative models

N Liu, Y Du, S Li, JB Tenenbaum… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-to-image generative models have enabled high-resolution image synthesis across
different domains, but require users to specify the content they wish to generate. In this …

Subject-diffusion: Open domain personalized text-to-image generation without test-time fine-tuning

J Ma, J Liang, C Chen, H Lu - arXiv preprint arXiv:2307.11410, 2023 - arxiv.org
Recent progress in personalized image generation using diffusion models has been
significant. However, development in the area of open-domain and non-fine-tuning …

Improving negative-prompt inversion via proximal guidance

L Han, S Wen, Q Chen, Z Zhang, K Song… - arXiv preprint arXiv …, 2023 - arxiv.org
DDIM inversion has revealed the remarkable potential of real image editing within diffusion-
based methods. However, the accuracy of DDIM reconstruction degrades as larger classifier …

Mastering text-to-image diffusion: Recaptioning, planning, and generating with multimodal llms

L Yang, Z Yu, C Meng, M Xu, S Ermon, B Cui - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have exhibited exceptional performance in text-to-image generation and
editing. However, existing methods often face challenges when handling complex text …

Proxedit: Improving tuning-free real image editing with proximal guidance

L Han, S Wen, Q Chen, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
DDIM inversion has revealed the remarkable potential of real image editing within diffusion-
based methods. However, the accuracy of DDIM reconstruction degrades as larger classifier …

Realcompo: Dynamic equilibrium between realism and compositionality improves text-to-image diffusion models

X Zhang, L Yang, Y Cai, Z Yu, J Xie, Y Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have achieved remarkable advancements in text-to-image generation.
However, existing models still have many difficulties when faced with multiple-object …

Diffblender: Scalable and composable multimodal text-to-image diffusion models

S Kim, J Lee, K Hong, D Kim, N Ahn - arXiv preprint arXiv:2305.15194, 2023 - arxiv.org
The recent progress in diffusion-based text-to-image generation models has significantly
expanded generative capabilities via conditioning the text descriptions. However, since …

R3CD: Scene Graph to Image Generation with Relation-Aware Compositional Contrastive Control Diffusion

J Liu, Q Liu - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Image generation tasks have achieved remarkable performance using large-scale diffusion
models. However, these models are limited to capturing the abstract relations (viz …

Attention Calibration for Disentangled Text-to-Image Personalization

Y Zhang, M Yang, Q Zhou, Z Wang - arXiv preprint arXiv:2403.18551, 2024 - arxiv.org
Recent thrilling progress in large-scale text-to-image (T2I) models has unlocked
unprecedented synthesis quality of AI-generated content (AIGC) including image …

A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions

T Zhang, Z Wang, J Huang, MM Tasnim… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, there has been significant progress in the development of large models. Following
the success of ChatGPT, numerous language models have been introduced, demonstrating …