Multi-modal 3D object detection in autonomous driving: A survey and taxonomy

L Wang, X Zhang, Z Song, J Bi, G Zhang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Autonomous vehicles require constant environmental perception to obtain the distribution of
obstacles to achieve safe driving. Specifically, 3D object detection is a vital functional …

Object detection in 20 years: A survey

Z Zou, K Chen, Z Shi, Y Guo, J Ye - Proceedings of the IEEE, 2023 - ieeexplore.ieee.org
Object detection, as one of the most fundamental and challenging problems in computer
vision, has received great attention in recent years. Over the past two decades, we have …

Multimodal learning with transformers: A survey

P Xu, X Zhu, DA Clifton - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …

RangeViT: Towards vision transformers for 3D semantic segmentation in autonomous driving

A Ando, S Gidaris, A Bursuc, G Puy… - Proceedings of the …, 2023 - openaccess.thecvf.com
Casting semantic segmentation of outdoor LiDAR point clouds as a 2D problem, e.g., via
range projection, is an effective and popular approach. These projection-based methods …

A simple vision transformer for weakly semi-supervised 3D object detection

D Zhang, D Liang, Z Zou, J Li, X Ye… - Proceedings of the …, 2023 - openaccess.thecvf.com
Advanced 3D object detection methods usually rely on large-scale, elaborately labeled
datasets to achieve good performance. However, labeling the bounding boxes for the 3D …

Transformers in 3D point clouds: A survey

D Lu, Q Xie, M Wei, K Gao, L Xu, J Li - arXiv preprint arXiv:2205.07417, 2022 - arxiv.org
Transformers have been at the heart of the Natural Language Processing (NLP) and
Computer Vision (CV) revolutions. The significant success in NLP and CV inspired exploring …

2D-3D interlaced transformer for point cloud segmentation with scene-level supervision

CK Yang, MH Chen, YY Chuang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a Multimodal Interlaced Transformer (MIT) that jointly considers 2D and
3D data for weakly supervised point cloud segmentation. Research studies have shown that …

A comprehensive survey of transformers for computer vision

S Jamil, M Jalil Piran, OJ Kwon - Drones, 2023 - mdpi.com
As a special type of transformer, vision transformers (ViTs) can be used for various computer
vision (CV) applications. Convolutional neural networks (CNNs) have several potential …

3D vision with transformers: A survey

J Lahoud, J Cao, FS Khan, H Cholakkal… - arXiv preprint arXiv …, 2022 - arxiv.org
The success of the transformer architecture in natural language processing has recently
triggered attention in the computer vision field. The transformer has been used as a …

Attention discriminant sampling for point clouds

CY Hong, YY Chou, TL Liu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
This paper describes an attention-driven approach to 3D point cloud sampling. We
establish our method based on a structure-aware attention discriminant analysis that …