A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

BLIP-Diffusion: Pre-trained subject representation for controllable text-to-image generation and editing

D Li, J Li, S Hoi - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Subject-driven text-to-image generation models create novel renditions of an input subject
based on text prompts. Existing models suffer from lengthy fine-tuning and difficulties …

SVDiff: Compact parameter space for diffusion fine-tuning

L Han, Y Li, H Zhang, P Milanfar… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …

Text-to-image diffusion models in generative AI: A survey

C Zhang, C Zhang, M Zhang, IS Kweon - arXiv preprint arXiv:2303.07909, 2023 - arxiv.org
This survey reviews text-to-image diffusion models in the context that diffusion models have
emerged to be popular for a wide range of generative tasks. As a self-contained work, this …

Video-P2P: Video editing with cross-attention control

S Liu, Y Zhang, W Li, Z Lin, J Jia - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Video-P2P is the first framework for real-world video editing with cross-attention control.
While attention control has proven effective for image editing with pre-trained image …

InstructDiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

Composer: Creative and controllable image synthesis with composable conditions

L Huang, D Chen, Y Liu, Y Shen, D Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent large-scale generative models learned on big data are capable of synthesizing
incredible images yet suffer from limited controllability. This work offers a new generation …

HIVE: Harnessing human feedback for instructional visual editing

S Zhang, X Yang, Y Feng, C Qin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Incorporating human feedback has been shown to be crucial to align text generated by large
language models to human preferences. We hypothesize that state-of-the-art instructional …

TF-ICON: Diffusion-based training-free cross-domain image composition

S Lu, Y Liu, AWK Kong - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Text-driven diffusion models have exhibited impressive generative capabilities, enabling
various image editing tasks. In this paper, we propose TF-ICON, a novel Training-Free …