UDepth: Fast Monocular Depth Estimation for Visually-Guided Underwater Robots

B. Yu, J. Wu, M. J. Islam - 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023 - ieeexplore.ieee.org
In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes. First, we adapt a new input space from raw RGB image space by exploiting the underwater light attenuation prior, and then devise a least-squares formulation for coarse pixel-wise depth prediction. Subsequently, we extend this into a domain projection loss that guides the end-to-end learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a computationally light MobileNetV2 backbone and a Transformer-based optimizer for ensuring fast inference rates on embedded systems. Through domain-aware design choices and comprehensive experimental analyses, we demonstrate that it is possible to achieve state-of-the-art depth estimation performance while ensuring a small computational footprint. Specifically, with 70%-80% fewer network parameters than existing benchmarks, UDepth achieves comparable and often better depth estimation performance. While the full model offers over 66 FPS (13 FPS) inference rates on a single GPU (CPU core), our domain projection for coarse depth prediction runs at 51.5 FPS on single-board Jetson TX2s. The inference pipelines are available at https://github.com/uf-robopi/UDepth.
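The coarse prediction step pairs an underwater light attenuation prior with a least-squares fit over RGB-D training pairs. Below is a minimal sketch of that idea, assuming the prior takes the commonly used form max(G, B) - R (red light is absorbed fastest underwater, so this difference tends to grow with distance); the function names and the simple global linear model are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def attenuation_prior(rgb):
    """Assumed form of the underwater light attenuation prior:
    red attenuates fastest, so max(G, B) - R grows with scene distance.
    `rgb` is an H x W x 3 float array."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.maximum(g, b) - r

def fit_coarse_depth(rgb_images, depth_maps):
    """Fit a global linear model d ~= a * prior + b by least squares
    over all pixels of the RGB-D training pairs (sketch of the coarse step)."""
    x = np.concatenate([attenuation_prior(img).ravel() for img in rgb_images])
    y = np.concatenate([d.ravel() for d in depth_maps])
    A = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def predict_coarse_depth(rgb, a, b):
    """Coarse pixel-wise depth from the fitted linear prior model."""
    return a * attenuation_prior(rgb) + b
```

The same prior-to-depth mapping can then serve as a supervision signal (the abstract's domain projection loss) during end-to-end training of the full network.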
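For the learned model, the abstract specifies a computationally light MobileNetV2 backbone chosen for fast inference on embedded hardware. The PyTorch sketch below illustrates that design choice with a minimal encoder-decoder; the decoder layout is assumed, and the paper's Transformer-based refinement is omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import mobilenet_v2

class LightDepthNet(nn.Module):
    """Illustrative MobileNetV2-based depth estimator (not the published UDepth model)."""
    def __init__(self):
        super().__init__()
        # MobileNetV2 feature extractor; outputs 1280-channel downsampled features.
        self.encoder = mobilenet_v2(weights=None).features
        # Assumed lightweight decoder producing a single-channel depth map.
        self.decoder = nn.Sequential(
            nn.Conv2d(1280, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
        )

    def forward(self, x):
        h, w = x.shape[-2:]
        depth = self.decoder(self.encoder(x))
        # Upsample the prediction back to the input resolution.
        return F.interpolate(depth, size=(h, w), mode="bilinear", align_corners=False)
```

A forward pass such as `LightDepthNet()(torch.randn(1, 3, 224, 224))` returns a single-channel depth map at the input resolution; the small backbone is what makes the reported embedded-system inference rates plausible.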