Empowering things with intelligence: a survey of the progress, challenges, and opportunities in artificial intelligence of things

J Zhang, D Tao - IEEE Internet of Things Journal, 2020 - ieeexplore.ieee.org
In the Internet-of-Things (IoT) era, billions of sensors and devices collect and process data
from the environment, transmit them to cloud centers, and receive feedback via the Internet …

3D object detection for autonomous driving: A comprehensive survey

J Mao, S Shi, X Wang, H Li - International Journal of Computer Vision, 2023 - Springer
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …

BEVDepth: Acquisition of reliable depth for multi-view 3D object detection

Y Li, Z Ge, G Yu, J Yang, Z Wang, Y Shi… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In this research, we propose a new 3D object detector with a trustworthy depth estimation,
dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work …

Neural window fully-connected CRFs for monocular depth estimation

W Yuan, X Gu, Z Dai, S Zhu… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Estimating the accurate depth from a single image is challenging since it is inherently
ambiguous and ill-posed. While recent works design increasingly complicated and powerful …

Cross-view transformers for real-time map-view semantic segmentation

B Zhou, P Krähenbühl - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
We present cross-view transformers, an efficient attention-based model for map-view
semantic segmentation from multiple cameras. Our architecture implicitly learns a mapping …

ZoeDepth: Zero-shot transfer by combining relative and metric depth

SF Bhat, R Birkl, D Wofk, P Wonka, M Müller - arXiv preprint arXiv …, 2023 - arxiv.org
This paper tackles the problem of depth estimation from a single image. Existing work either
focuses on generalization performance disregarding metric scale, i.e. relative depth …

BEVStereo: Enhancing depth estimation in multi-view 3D object detection with temporal stereo

Y Li, H Bao, Z Ge, J Yang, J Sun, Z Li - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Restricted by the ability of depth perception, all multi-view 3D object detection methods fall
into the bottleneck of depth accuracy. By constructing temporal stereo, depth estimation is …

NoPe-NeRF: Optimising neural radiance field with no pose prior

W Bian, Z Wang, K Li, JW Bian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Training a Neural Radiance Field (NeRF) without pre-computed camera poses is
challenging. Recent advances in this direction demonstrate the possibility of jointly …

SpatialVLM: Endowing vision-language models with spatial reasoning capabilities

B Chen, Z Xu, S Kirmani, B Ichter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding and reasoning about spatial relationships is crucial for Visual Question
Answering (VQA) and robotics. Vision Language Models (VLMs) have shown impressive …

Repurposing diffusion-based image generators for monocular depth estimation

B Ke, A Obukhov, S Huang, N Metzger… - Proceedings of the …, 2024 - openaccess.thecvf.com
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth
from a single image is geometrically ill-posed and requires scene understanding so it is not …