Multi-modal fusion transformer for end-to-end autonomous driving

L Wang, Y Gong, Q Wang, K Zhou… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

In this work, we propose a real-time monocular 3D video reconstruction approach named
Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an …

Cited by 12 · Related articles · All 2 versions

[PDF] mlr.press

Addressing optimism bias in sequence modeling for reinforcement learning

AR Villaflor, Z Huang, S Pande… - international …, 2022 - proceedings.mlr.press

Impressive results in natural language processing (NLP) based on the Transformer neural
network architecture have inspired researchers to explore viewing offline reinforcement …

Cited by 28 · Related articles · All 6 versions

[PDF] researchgate.net

Attention-based interrelation modeling for explainable automated driving

Z Zhang, R Tian, R Sherony… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Automated driving desires better performance on tasks like motion planning and interacting
with pedestrians in mixed-traffic environments. Deep learning algorithms can achieve high …

Cited by 34 · Related articles · All 4 versions

[PDF] arxiv.org

Transformers in 3D point clouds: A survey

D Lu, Q Xie, M Wei, K Gao, L Xu, J Li - arXiv preprint arXiv:2205.07417, 2022 - arxiv.org

Transformers have been at the heart of the Natural Language Processing (NLP) and
Computer Vision (CV) revolutions. The significant success in NLP and CV inspired exploring …

Cited by 46 · Related articles · All 2 versions

Multi-modal policy fusion for end-to-end autonomous driving

Z Huang, S Sun, J Zhao, L Mao - Information Fusion, 2023 - Elsevier

Multi-modal learning has made impressive progress in autonomous driving by leveraging
information from multiple sensors. Existing feature fusion methods make decisions by …

Cited by 11 · Related articles · All 2 versions

[PDF] thecvf.com

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning

J Guan, L Shen, A Zhou, L Li, H Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Multi-constraint offline reinforcement learning (RL) promises to learn policies that satisfy
both cumulative and state-wise costs from offline datasets. This arrangement provides an …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

DriveLM: Driving with graph visual question answering

C Sima, K Renz, K Chitta, L Chen, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

We study how vision-language models (VLMs) trained on web-scale data can be integrated
into end-to-end driving systems to boost generalization and enable interactivity with human …

Cited by 41 · Related articles · All 2 versions

[PDF] arxiv.org

Glass segmentation with RGB-thermal image pairs

D Huo, J Wang, Y Qian, YH Yang - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org

This paper proposes a new glass segmentation method utilizing paired RGB and thermal
images. Due to the large difference between the transmission property of visible light and …

Cited by 32 · Related articles · All 6 versions

[PDF] thecvf.com

LIFT: Learning 4D LiDAR image fusion transformer for 3D object detection

Y Zeng, D Zhang, C Wang, Z Miao… - Proceedings of the …, 2022 - openaccess.thecvf.com

LiDAR and camera are two common sensors to collect data in time for 3D object detection
under the autonomous driving context. Though the complementary information across …

Cited by 19 · Related articles · All 7 versions

[PDF] researchgate.net

A comparative review on multi-modal sensors fusion based on deep learning

Q Tang, J Liang, F Zhu - Signal Processing, 2023 - Elsevier

The wide deployment of multi-modal sensors in various areas generates vast amounts of
data with characteristics of high volume, wide variety, and high integrity. However, traditional …

Cited by 25 · Related articles · All 3 versions
