State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Objaverse: A universe of annotated 3D objects

M Deitke, D Schwenk, J Salvador… - Proceedings of the …, 2023 - openaccess.thecvf.com
Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and
LAION have propelled recent dramatic progress in AI. Large neural models trained on such …

SDFusion: Multimodal 3D shape completion, reconstruction, and generation

YC Cheng, HY Lee, S Tulyakov… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this work, we present a novel framework built to simplify 3D asset generation for amateur
users. To enable interactive generation, our method supports a variety of input modalities …

Scalable 3D captioning with pretrained models

T Luo, C Rockwell, H Lee… - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects.
This approach utilizes pretrained models from image captioning, image-text alignment, and …

AutoSDF: Shape priors for 3D completion, reconstruction and generation

P Mittal, YC Cheng, M Singh… - Proceedings of the …, 2022 - openaccess.thecvf.com
Powerful priors allow us to perform inference with insufficient information. In this paper, we
propose an autoregressive prior for 3D shapes to solve multimodal 3D tasks such as shape …

3DShape2VecSet: A 3D shape representation for neural fields and generative diffusion models

B Zhang, J Tang, M Niessner, P Wonka - ACM Transactions on Graphics …, 2023 - dl.acm.org
We introduce 3DShape2VecSet, a novel shape representation for neural fields designed for
generative diffusion models. Our shape representation can encode 3D shapes given as …

Diffusion-SDF: Text-to-shape via voxelized diffusion

M Li, Y Duan, J Zhou, J Lu - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
With the rising industrial attention to 3D virtual modeling technology, generating novel 3D
content based on specified conditions (e.g. text) has become a hot issue. In this paper, we …

Michelangelo: Conditional 3D shape generation based on shape-image-text aligned latent representation

Z Zhao, W Liu, X Chen, X Zeng… - Advances in …, 2024 - proceedings.neurips.cc
We present a novel alignment-before-generation approach to tackle the challenging task of
generating general 3D shapes based on 2D images or texts. Directly learning a conditional …

Multi3DRefer: Grounding text description to multiple 3D objects

Y Zhang, ZM Gong, AX Chang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We introduce the task of localizing a flexible number of objects in real-world 3D scenes
using natural language descriptions. Existing 3D visual grounding tasks focus on localizing …

ReferIt3D: Neural listeners for fine-grained 3D object identification in real-world scenes

P Achlioptas, A Abdelreheem, F Xia… - Computer Vision–ECCV …, 2020 - Springer
In this work we study the problem of using referential language to identify common objects in
real-world 3D scenes. We focus on a challenging setup where the referred object belongs to …