Multimodal image synthesis and editing: A survey and taxonomy

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

Creating, using, misusing, and detecting deep fakes

H Farid - Journal of Online Trust and Safety, 2022 - tsjournal.org
Synthetic media—so-called deep fakes—have captured the imagination of some and struck
fear in others. Although they vary in their form and creation, deep fakes refer to text, image …

Attend-and-excite: Attention-based semantic guidance for text-to-image diffusion models

H Chefer, Y Alaluf, Y Vinker, L Wolf… - ACM Transactions on …, 2023 - dl.acm.org
Recent text-to-image generative models have demonstrated an unparalleled ability to
generate diverse and creative imagery guided by a target text prompt. While revolutionary …

Zero-shot image-to-image translation

G Parmar, K Kumar Singh, R Zhang, Y Li, J Lu… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
Large-scale text-to-image generative models have shown their remarkable ability to
synthesize diverse, high-quality images. However, directly applying these models for real …

Plug-and-play diffusion features for text-driven image-to-image translation

N Tumanyan, M Geyer, S Bagon… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image generative models have been a revolutionary breakthrough in the
evolution of generative AI, synthesizing diverse images with highly complex visual concepts …

eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers

Y Balaji, S Nah, X Huang, A Vahdat, J Song… - arXiv preprint arXiv …, 2022 - arxiv.org
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned
high-resolution image synthesis. Starting from random noise, such text-to-image …

Paint by example: Exemplar-based image editing with diffusion models

B Yang, S Gu, B Zhang, T Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Language-guided image editing has achieved great success recently. In this paper,
we investigate exemplar-guided image editing for more precise control. We achieve this …

FastComposer: Tuning-free multi-subject image generation with localized attention

G Xiao, T Yin, WT Freeman, F Durand… - International Journal of …, 2024 - Springer
Diffusion models excel at text-to-image generation, especially in subject-driven generation
for personalized images. However, existing methods are inefficient due to the subject …

Compositional visual generation with composable diffusion models

N Liu, S Li, Y Du, A Torralba, JB Tenenbaum - European Conference on …, 2022 - Springer
Large text-guided diffusion models, such as DALLE-2, are able to generate stunning
photorealistic images given natural language descriptions. While such models are highly …

SpaText: Spatio-textual representation for controllable image generation

O Avrahami, T Hayes, O Gafni… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-image diffusion models are able to generate convincing results of
unprecedented quality. However, it is nearly impossible to control the shapes of different …