Towards knowledge-driven autonomous driving

X Li, Y Bai, P Cai, L Wen, D Fu, B Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper explores the emerging knowledge-driven autonomous driving technologies. Our
investigation highlights the limitations of current autonomous driving systems, in particular …

Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives

S Luo, W Chen, W Tian, R Liu, L Hou… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Foundation models have indeed made a profound impact on various fields, emerging as
pivotal components that significantly shape the capabilities of intelligent systems. In the …

Merging Vision Transformers from Different Tasks and Domains

P Ye, C Huang, M Shen, T Chen, Y Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
This work targets to merge various Vision Transformers (ViTs) trained on different tasks (i.e.,
datasets with different object categories) or domains (i.e., datasets with the same categories …

Push-and-Pull: A General Training Framework with Differential Augmentor for Domain Generalized Point Cloud Classification

J Xu, X Ma, L Zhang, B Zhang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
As a fundamental task of 3D perception, point cloud recognition has shown significant
progress in recent years. However, existing methods still face challenges when dealing with …

Efficient Architecture Search for Real-Time Instance Segmentation

R Xia, D Zhang, Y Dong, J Zhao, W Liao… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Traditional CNN-based training for instance segmentation is time-consuming owing to large
datasets and complex network modules, making direct searching of architecture …