A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cross-image attention for zero-shot appearance transfer

Y Alaluf, D Garibi, O Patashnik… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Recent advancements in text-to-image generative models have demonstrated a remarkable
ability to capture a deep semantic understanding of images. In this work, we leverage this …

DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization

J Nam, H Kim, DJ Lee, S Jin, S Kim… - Proceedings of the …, 2024 - openaccess.thecvf.com
The objective of text-to-image (T2I) personalization is to customize a diffusion model to a
user-provided reference concept, generating diverse images of the concept aligned with the …

MTVG: Multi-Text Video Generation with Text-to-Video Models

G Oh, J Jeong, S Kim, W Byeon, J Kim, S Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, video generation has attracted massive attention and yielded noticeable
outcomes. Concerning the characteristics of video, multi-text conditioning incorporating …

A Survey on LoRA of Large Language Models

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …
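The "pluggable low-rank matrices" mentioned in this abstract can be sketched in a few lines. The snippet below is an illustrative NumPy toy, not the survey's implementation: all names (`W`, `A`, `B`, `alpha`, `lora_forward`) and the dimensions are assumptions chosen for the example. A frozen dense weight `W` is augmented with a trainable product `B @ A` of rank `r`, so only `r * (d_in + d_out)` parameters are updated instead of `d_in * d_out`.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4                  # rank r is much smaller than d_in, d_out

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight (not updated)
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-initialized
alpha = 8.0                                 # scaling hyperparameter

def lora_forward(x):
    # Base path plus scaled low-rank correction. Because B starts at zero,
    # the adapted layer initially reproduces the pretrained layer exactly.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization
```

Since `B @ A` can be folded into `W` after training (`W + (alpha / r) * B @ A`), the adapter adds no inference-time latency, which is what makes such modules "pluggable".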

Generating Non-Stationary Textures using Self-Rectification

Y Zhou, R Xiao, D Lischinski… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper addresses the challenge of example-based non-stationary texture synthesis. We
introduce a novel two-step approach wherein users first modify a reference texture using …

UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control

X Chen, T Xia, S Xu - arXiv preprint arXiv:2403.02332, 2024 - arxiv.org
Video Diffusion Models have been developed for video generation, usually integrating text
and image conditioning to enhance control over the generated content. Despite the …

Text Prompting for Multi-Concept Video Customization by Autoregressive Generation

D Kothandaraman, K Sohn, R Villegas… - arXiv preprint arXiv …, 2024 - arxiv.org
We present a method for multi-concept customization of pretrained text-to-video (T2V)
models. Intuitively, the multi-concept customized video can be derived from the (non-linear) …

PRIME: Protect Your Videos From Malicious Editing

G Li, S Yang, J Zhang, T Zhang - arXiv preprint arXiv:2402.01239, 2024 - arxiv.org
With the development of generative models, the quality of generated content keeps
increasing. Recently, open-source models have made it surprisingly easy to manipulate and …