iDisc: Internal discretization for monocular depth estimation

M El Banani, A Raj, KK Maninis, A Kar… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent advances in large-scale pretraining have yielded visual foundation models with
strong capabilities. Not only can recent models generalize to arbitrary images for their …

Cited by 13 - Related articles - All 3 versions

HUGS: Holistic urban 3D scene understanding via Gaussian splatting

H Zhou, J Shao, L Xu, D Bai, W Qiu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Holistic understanding of urban scenes based on RGB images is a challenging yet important
problem. It encompasses understanding both the geometry and appearance to enable novel …

Cited by 8 - Related articles - All 3 versions

PolyMax: General dense prediction with mask transformer

X Yang, L Yuan, K Wilber, A Sharma… - Proceedings of the …, 2024 - openaccess.thecvf.com

Dense prediction tasks, such as semantic segmentation, depth estimation, and surface
normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or …

Cited by 6 - Related articles - All 7 versions

VP-Net: Voxels as points for 3-D object detection

Z Song, H Wei, C Jia, Y Xia, X Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The 3-D object detection with light detection and ranging (LiDAR) point clouds is a
challenging problem, which requires 3-D scene understanding, yet this task is critical to …

Cited by 25 - Related articles - All 4 versions

PatchFusion: An end-to-end tile-based framework for high-resolution monocular metric depth estimation

Z Li, SF Bhat, P Wonka - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Single image depth estimation is a foundational task in computer vision and generative
modeling. However prevailing depth estimation models grapple with accommodating the …

Cited by 5 - Related articles - All 6 versions

Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey

U Rajapaksha, F Sohel, H Laga, D Diepeveen… - ACM Computing …, 2024 - dl.acm.org

Estimating depth from single RGB images and videos is of widespread interest due to its
applications in many areas, including autonomous driving, 3D reconstruction, digital …

VoxelNextFusion: A simple, unified and effective voxel fusion framework for multi-modal 3D object detection

Z Song, G Zhang, J Xie, L Liu, C Jia, S Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

LiDAR-camera fusion can enhance the performance of 3D object detection by utilizing
complementary information between depth-aware LiDAR points and semantically rich …

Cited by 7 - Related articles - All 2 versions

Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion

F Zhang, S You, Y Li, Y Fu - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com

Monocular depth estimation has experienced significant progress on terrestrial images in
recent years thanks to deep learning advancements. But it remains inadequate for …

Cited by 3 - Related articles - All 3 versions

Joint depth prediction and semantic segmentation with multi-view SAM

M Shvets, D Zhao, M Niethammer… - Proceedings of the …, 2024 - openaccess.thecvf.com

Multi-task approaches to joint depth and segmentation prediction are well-studied for
monocular images. Yet, predictions from a single-view are inherently limited, while multiple …

Cited by 2 - Related articles - All 5 versions

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Y Ge, Y Tang, J Xu, C Gokmen, C Li… - Proceedings of the …, 2024 - openaccess.thecvf.com

The systematic evaluation and understanding of computer vision models under varying
conditions require large amounts of data with comprehensive and customized labels which …
