We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings. We solve the problem by explicitly grounding the …
H Zhang, L Wang, S Li, K Xu, B Yin - Neurocomputing, 2024 - Elsevier
Referring image segmentation aims to segment the instance corresponding to the given language description, which requires aligning information from two modalities. Existing …
N Hosomi, S Hatanaka, Y Iioka, W Yang… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
In this study, we develop a model that enables mobilities to have more friendly interactions with users. Specifically, we focus on the referring navigable regions task in which a model …
Existing Vision-Language models (VLMs) estimate either long-term trajectory waypoints or a set of control actions as a reactive solution for closed-loop planning based on their rich …
Autonomous driving requires efficient reasoning about the location and appearance of the different agents in the scene, which aids in downstream tasks such as object detection …
Recent advances in vehicle automation technology are expected to improve the interaction between humans and mobility modes. As a promising means, language …