DINOv2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Plug-and-play diffusion features for text-driven image-to-image translation

N Tumanyan, M Geyer, S Bagon… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image generative models have been a revolutionary breakthrough in the
evolution of generative AI, synthesizing diverse images with highly complex visual concepts …

Emergent correspondence from image diffusion

L Tang, M Jia, Q Wang, CP Phoo… - Advances in Neural …, 2023 - proceedings.neurips.cc
Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …

Zero-shot image-to-image translation

G Parmar, K Kumar Singh, R Zhang, Y Li, J Lu… - ACM SIGGRAPH 2023 …, 2023 - dl.acm.org
Large-scale text-to-image generative models have shown their remarkable ability to
synthesize diverse, high-quality images. However, directly applying these models for real …

Text2LIVE: Text-driven layered image and video editing

O Bar-Tal, D Ofri-Amar, R Fridman, Y Kasten… - European conference on …, 2022 - Springer
We present a method for zero-shot, text-driven editing of natural images and videos. Given
an image or a video and a text prompt, our goal is to edit the appearance of existing objects …

Encoder-based domain tuning for fast personalization of text-to-image models

R Gal, M Arar, Y Atzmon, AH Bermano… - ACM Transactions on …, 2023 - dl.acm.org
Text-to-image personalization aims to teach a pre-trained diffusion model to reason about
novel, user provided concepts, embedding them into new scenes guided by natural …

SINE: Semantic-driven image-based NeRF editing with prior-guided editing field

C Bao, Y Zhang, B Yang, T Fan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite the great success in 2D editing using user-friendly tools, such as Photoshop,
semantic strokes, or even text prompts, similar capabilities in 3D areas are still limited, either …

Neural feature fusion fields: 3D distillation of self-supervised 2D image representations

V Tschernezki, I Laina, D Larlus… - … Conference on 3D …, 2022 - ieeexplore.ieee.org
We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image
feature extractors when the latter are applied to the analysis of multiple images …

SinNeRF: Training neural radiance fields on complex scenes from a single image

D Xu, Y Jiang, P Wang, Z Fan, H Shi… - European Conference on …, 2022 - Springer
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense
covers largely prohibits its wider applications. While several recent works have attempted to …

Style aligned image generation via shared attention

A Hertz, A Voynov, S Fruchter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large-scale Text-to-Image (T2I) models have rapidly gained prominence across
creative fields generating visually compelling outputs from textual prompts. However …