Authors
Lijun Wang, Jianming Zhang, Oliver Wang, Zhe Lin, Huchuan Lu
Publication date
2020
Conference
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Pages
541-550
Description
Monocular depth estimation is an ill-posed problem, and as such critically relies on scene priors and semantics. Due to its complexity, we propose a deep neural network model based on a semantic divide-and-conquer approach. Our model decomposes a scene into semantic segments, such as object instances and background stuff classes, and then predicts a scale and shift invariant depth map for each semantic segment in a canonical space. Semantic segments of the same category share the same depth decoder, so the global depth prediction task is decomposed into a series of category-specific ones, which are simpler to learn and easier to generalize to new scene types. Finally, our model stitches each local depth segment by predicting its scale and shift based on the global context of the image. The model is trained end-to-end using a multi-task loss for panoptic segmentation and depth prediction, and is therefore able to leverage large-scale panoptic segmentation datasets to boost its semantic understanding. We validate the effectiveness of our approach and show state-of-the-art performance on three benchmark datasets.
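The stitching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that per-segment canonical depth maps, boolean segment masks, and one scale and one shift per segment are already available (in the paper these are network predictions), and that stitching amounts to an affine map applied within each mask. The function name `stitch_segments` and all of its arguments are hypothetical.

```python
import numpy as np


def stitch_segments(canonical_depths, masks, scales, shifts):
    """Compose a global depth map from per-segment canonical depth maps.

    Assumption (not confirmed by the abstract): each segment's depth is
    placed into the global map as scale * canonical_depth + shift.
    """
    height, width = masks[0].shape
    global_depth = np.zeros((height, width), dtype=np.float64)
    for depth, mask, scale, shift in zip(canonical_depths, masks, scales, shifts):
        # Only pixels belonging to this segment are written.
        global_depth[mask] = scale * depth[mask] + shift
    return global_depth


# Toy 2x2 image split into two segments: top row and bottom row.
masks = [
    np.array([[True, True], [False, False]]),
    np.array([[False, False], [True, True]]),
]
canonical = [
    np.array([[0.0, 1.0], [0.0, 0.0]]),
    np.array([[0.0, 0.0], [0.5, 1.0]]),
]
depth = stitch_segments(canonical, masks, scales=[2.0, 4.0], shifts=[1.0, 10.0])
print(depth)
```

In this toy case the top segment is mapped to depths 1 and 3 and the bottom segment to 12 and 14, showing how two locally normalized segments can end up at different global depth ranges.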
Total citations
[Bar chart: citations per year, 2019 to 2024]
Scholar articles
L Wang, J Zhang, O Wang, Z Lin, H Lu - Proceedings of the IEEE/CVF Conference on Computer …, 2020