AnyDoor: Zero-shot object-level image customization

X Chen, L Huang, Y Liu, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com
This work presents AnyDoor, a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …

Animate Anyone: Consistent and controllable image-to-video synthesis for character animation

L Hu - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Character animation aims to generate character videos from still images through driving
signals. Currently, diffusion models have become the mainstream in visual generation …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

Video understanding with large language models: A survey

Y Tang, J Bi, S Xu, L Song, S Liang, T Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
With the burgeoning growth of online video platforms and the escalating volume of video
content, the demand for proficient video understanding tools has intensified markedly. Given …

Diffusion model-based image editing: A survey

Y Huang, J Huang, Y Liu, M Yan, J Lv, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Denoising diffusion models have emerged as a powerful tool for various image generation
and editing tasks, facilitating the synthesis of visual content in an unconditional or input …

Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry... for now

A Sarkar, H Mai, A Mahapatra… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative models can produce impressively realistic images. This paper demonstrates that
generated images have geometric features different from those of real images. We build a …

DL3DV-10K: A large-scale scene dataset for deep learning-based 3D vision

L Ling, Y Sheng, Z Tu, W Zhao, C Xin… - Proceedings of the …, 2024 - openaccess.thecvf.com
We have witnessed significant progress in deep learning-based 3D vision ranging from
neural radiance field (NeRF) based 3D representation learning to applications in novel view …

IMPRINT: Generative object compositing by learning identity-preserving representation

Y Song, Z Zhang, Z Lin, S Cohen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative object compositing emerges as a promising new avenue for compositional
image editing. However, the requirement of object identity preservation poses a significant …

LLMR: Real-time prompting of interactive worlds using large language models

F De La Torre, CM Fang, H Huang… - Proceedings of the CHI …, 2024 - dl.acm.org
We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time
creation and modification of interactive Mixed Reality experiences using LLMs. LLMR …

Diffusion Handles: Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D

K Pandey, P Guerrero, M Gadelha… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion Handles is a novel approach to enable 3D object edits on diffusion images,
requiring only existing pre-trained diffusion models and depth estimation, without any fine-tuning …