Splicing ViT features for semantic appearance transfer

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org

The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Cited by 1184 · Related articles · All 11 versions

[PDF] thecvf.com

Plug-and-play diffusion features for text-driven image-to-image translation

N Tumanyan, M Geyer, S Bagon… - Proceedings of the …, 2023 - openaccess.thecvf.com

Large-scale text-to-image generative models have been a revolutionary breakthrough in the
evolution of generative AI, synthesizing diverse images with highly complex visual concepts …

Cited by 419 · Related articles · All 6 versions

[PDF] neurips.cc

Emergent correspondence from image diffusion

L Tang, M Jia, Q Wang, CP Phoo… - Advances in Neural …, 2023 - proceedings.neurips.cc

Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …

Cited by 185 · Related articles · All 12 versions

[PDF] acm.org

Zero-shot image-to-image translation

G Parmar, K Kumar Singh, R Zhang, Y Li, J Lu… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org

Large-scale text-to-image generative models have shown their remarkable ability to
synthesize diverse, high-quality images. However, directly applying these models for real …

Cited by 292 · Related articles · All 3 versions

Text2LIVE: Text-driven layered image and video editing

O Bar-Tal, D Ofri-Amar, R Fridman, Y Kasten… - European conference on …, 2022 - Springer

We present a method for zero-shot, text-driven editing of natural images and videos. Given
an image or a video and a text prompt, our goal is to edit the appearance of existing objects …

Cited by 268 · Related articles · All 4 versions

[PDF] arxiv.org

Encoder-based domain tuning for fast personalization of text-to-image models

R Gal, M Arar, Y Atzmon, AH Bermano… - ACM Transactions on …, 2023 - dl.acm.org

Text-to-image personalization aims to teach a pre-trained diffusion model to reason about
novel, user provided concepts, embedding them into new scenes guided by natural …

Cited by 115 · Related articles · All 4 versions

[PDF] thecvf.com

SINE: Semantic-driven image-based NeRF editing with prior-guided editing field

C Bao, Y Zhang, B Yang, T Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite the great success in 2D editing using user-friendly tools, such as Photoshop,
semantic strokes, or even text prompts, similar capabilities in 3D areas are still limited, either …

Cited by 74 · Related articles · All 5 versions

[PDF] arxiv.org

Neural feature fusion fields: 3D distillation of self-supervised 2D image representations

V Tschernezki, I Laina, D Larlus… - … Conference on 3D …, 2022 - ieeexplore.ieee.org

We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image
feature extractors when the latter are applied to the analysis of multiple images …

Cited by 131 · Related articles · All 10 versions

[PDF] arxiv.org

SinNeRF: Training neural radiance fields on complex scenes from a single image

D Xu, Y Jiang, P Wang, Z Fan, H Shi… - European Conference on …, 2022 - Springer

Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense
covers largely prohibits its wider applications. While several recent works have attempted to …

Cited by 147 · Related articles · All 5 versions

[PDF] thecvf.com

Style aligned image generation via shared attention

A Hertz, A Voynov, S Fruchter… - Proceedings of the …, 2024 - openaccess.thecvf.com

Large-scale Text-to-Image (T2I) models have rapidly gained prominence across
creative fields generating visually compelling outputs from textual prompts. However …

Cited by 30 · Related articles · All 4 versions
