PhotoMaker: Customizing realistic human photos via stacked ID embedding

Z Li, M Cao, X Wang, Z Qi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However, existing …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

A neural space-time representation for text-to-image personalization

Y Alaluf, E Richardson, G Metzer… - ACM Transactions on …, 2023 - dl.acm.org
A key aspect of text-to-image personalization methods is the manner in which the target
concept is represented within the generative process. This choice greatly affects the visual …

Subject-Diffusion: Open-domain personalized text-to-image generation without test-time fine-tuning

J Ma, J Liang, C Chen, H Lu - arXiv preprint arXiv:2307.11410, 2023 - arxiv.org
Recent progress in personalized image generation using diffusion models has been
significant. However, development in the area of open-domain and non-fine-tuning …

Training-free consistent text-to-image generation

Y Tewel, O Kaduri, R Gal, Y Kasten, L Wolf… - arXiv preprint arXiv …, 2024 - arxiv.org
Text-to-image models offer a new level of creative flexibility by allowing users to guide the
image generation process through natural language. However, using these models to …

SingleInsert: Inserting new concepts from a single image into text-to-image models for flexible editing

Z Wu, C Yu, Z Zhu, F Wang, X Bai - arXiv preprint arXiv:2310.08094, 2023 - arxiv.org
Recent progress in text-to-image (T2I) models enables high-quality image generation with
flexible textual control. To utilize the abundant visual priors in the off-the-shelf T2I models, a …

The chosen one: Consistent characters in text-to-image diffusion models

O Avrahami, A Hertz, Y Vinker, M Arar… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in text-to-image generation models have unlocked vast potential for visual
creativity. However, these models struggle with the generation of consistent characters, a crucial …

Controllable generation with text-to-image diffusion models: A survey

P Cao, F Zhou, Q Song, L Yang - arXiv preprint arXiv:2403.04279, 2024 - arxiv.org
In the rapidly advancing realm of visual generation, diffusion models have revolutionized the
landscape, marking a significant shift in capabilities with their impressive text-guided …

A data perspective on enhanced identity preservation for diffusion personalization

X He, Z Cao, N Kolkin, L Yu, H Rhodin… - arXiv preprint arXiv …, 2023 - arxiv.org
Large text-to-image models have revolutionized the ability to generate imagery using natural
language. However, particularly unique or personal visual concepts, such as your pet, an …

MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation

D Ostashev, Y Fang, S Tulyakov, K Aberman - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce a new architecture for personalization of text-to-image diffusion models,
coined Mixture-of-Attention (MoA). Inspired by the Mixture-of-Experts mechanism utilized in …