Authors
Zining Wang, Wei Zhan, Masayoshi Tomizuka
Publication date
2018/6/26
Conference
2018 IEEE Intelligent Vehicles Symposium (IV)
Pages
1-6
Publisher
IEEE
Description
We propose a new method for fusing LIDAR point clouds and camera images in deep convolutional neural networks (CNN). The proposed method constructs a new layer, called the sparse non-homogeneous pooling layer, to transform features between the bird's eye view and the front view. The sparse point cloud is used to construct the mapping between the two views. The pooling layer allows efficient fusion of the multi-view features at any stage of the network, which is favorable for 3D object detection using camera-LIDAR fusion in autonomous driving. A corresponding one-stage detector, which produces 3D bounding boxes from the bird's eye view map, is designed and tested on the KITTI bird's eye view object detection dataset. The fusion method shows significant improvement in both speed and accuracy of pedestrian detection over other fusion-based object detection networks.
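The core mechanism described above, pooling front-view features into bird's eye view cells through point-cloud correspondences, can be illustrated with a small sketch. The following is a minimal PyTorch/NumPy sketch, not the paper's implementation: the function names, the KITTI-like BEV ranges, and the per-cell average weighting are our illustrative assumptions, and the image coordinates `img_uv` are assumed to come from a known camera calibration.

```python
import numpy as np
import torch

def sparse_pooling_matrix(points_xyz, img_uv, bev_shape, fv_shape,
                          x_range=(0.0, 70.0), y_range=(-40.0, 40.0)):
    """Build a sparse matrix mapping flattened front-view (FV) feature
    cells to bird's-eye-view (BEV) grid cells, using the LIDAR points
    visible in both views as correspondences (hypothetical sketch)."""
    H_bev, W_bev = bev_shape
    H_fv, W_fv = fv_shape

    # Discretize each LIDAR point into a BEV grid cell and an FV cell.
    bi = ((points_xyz[:, 0] - x_range[0]) /
          (x_range[1] - x_range[0]) * H_bev).astype(np.int64)
    bj = ((points_xyz[:, 1] - y_range[0]) /
          (y_range[1] - y_range[0]) * W_bev).astype(np.int64)
    fi = img_uv[:, 1].astype(np.int64)  # image row (v)
    fj = img_uv[:, 0].astype(np.int64)  # image column (u)

    # Keep only points that fall inside both views.
    ok = ((bi >= 0) & (bi < H_bev) & (bj >= 0) & (bj < W_bev) &
          (fi >= 0) & (fi < H_fv) & (fj >= 0) & (fj < W_fv))
    rows = bi[ok] * W_bev + bj[ok]   # flattened BEV cell index
    cols = fi[ok] * W_fv + fj[ok]    # flattened FV cell index

    # Assumed average pooling: weight each correspondence by
    # 1 / (number of points landing in the same BEV cell).
    counts = np.bincount(rows, minlength=H_bev * W_bev)
    vals = 1.0 / counts[rows]

    idx = torch.from_numpy(np.stack([rows, cols]))
    mat = torch.sparse_coo_tensor(idx, torch.from_numpy(vals).float(),
                                  size=(H_bev * W_bev, H_fv * W_fv))
    return mat.coalesce()  # merge duplicate (row, col) entries up front

def pool_fv_to_bev(fv_feat, pool_mat, bev_shape):
    """Transform front-view features (C, H_fv, W_fv) onto the BEV grid
    with a single sparse matmul; gradients flow through the same matrix."""
    C = fv_feat.shape[0]
    flat = fv_feat.reshape(C, -1).t()        # (H_fv*W_fv, C)
    bev = torch.sparse.mm(pool_mat, flat)    # (H_bev*W_bev, C)
    return bev.t().reshape(C, *bev_shape)
```

Because the mapping is a fixed sparse matrix built once per frame, the same transform can be applied to feature maps of matching resolution at any stage of the two CNN branches before fusion, which is what makes the multi-view fusion inexpensive.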
Total citations
Cited by, per year (2018–2024): 7, 11, 19, 30, 18, 18, 8