Weakly supervised object localization and detection: A survey

D Zhang, J Han, G Cheng… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
As an emerging and challenging problem in the computer vision community, weakly
supervised object localization and detection plays an important role for developing new …

Artificial intelligence (AI) in augmented reality (AR)-assisted manufacturing applications: a review

CK Sahu, C Young, R Rai - International Journal of Production …, 2021 - Taylor & Francis
Augmented reality (AR) has proven to be an invaluable interactive medium to reduce
cognitive load by bridging the gap between the task-at-hand and relevant information by …

Zero-1-to-3: Zero-shot one image to 3D object

R Liu, R Wu, B Van Hoorick… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an
object given just a single RGB image. To perform novel view synthesis in this …

One-2-3-45: Any single image to 3D mesh in 45 seconds without per-shape optimization

M Liu, C Xu, H Jin, L Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Single image 3D reconstruction is an important but challenging task that requires extensive
knowledge of our natural world. Many existing methods solve this problem by optimizing a …

DreamBooth3D: Subject-driven text-to-3D generation

A Raj, S Kaza, B Poole, M Niemeyer… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present DreamBooth3D, an approach to personalize text-to-3D generative models from
as few as 3-6 casually captured images of a subject. Our approach combines recent …

One-2-3-45++: Fast single image to 3D objects with consistent multi-view generation and 3D diffusion

M Liu, R Shi, L Chen, Z Zhang, C Xu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advancements in open-world 3D object generation have been remarkable with
image-to-3D methods offering superior fine-grained control over their text-to-3D …

NerfDiff: Single-image view synthesis with NeRF-guided distillation from 3D-aware diffusion

J Gu, A Trevithick, KE Lin, JM Susskind… - International …, 2023 - proceedings.mlr.press
Novel view synthesis from a single image requires inferring occluded regions of objects and
scenes whilst simultaneously maintaining semantic and physical consistency with the input …

VoxFormer: Sparse voxel transformer for camera-based 3D semantic scene completion

Y Li, Z Yu, C Choy, C Xiao, JM Alvarez… - Proceedings of the …, 2023 - openaccess.thecvf.com
Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This
appealing ability is vital for recognition and understanding. To enable such capability in AI …

Masked autoencoders for point cloud self-supervised learning

Y Pang, W Wang, FEH Tay, W Liu, Y Tian… - European conference on …, 2022 - Springer
As a promising scheme of self-supervised learning, masked autoencoding has significantly
advanced natural language processing and computer vision. Inspired by this, we propose a …

Point-M2AE: Multi-scale masked autoencoders for hierarchical point cloud pre-training

R Zhang, Z Guo, P Gao, R Fang… - Advances in neural …, 2022 - proceedings.neurips.cc
Masked Autoencoders (MAE) have shown great potentials in self-supervised pre-training for
language and 2D image transformers. However, it still remains an open question on how to …