Semantically-aware spatio-temporal reasoning agent for vision-and-language navigation in continuous environments

MZ Irshad, NC Mithun, Z Seymour… - 2022 26th …, 2022 - ieeexplore.ieee.org
This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in
continuous 3D environments, which requires an autonomous agent to follow natural
language instructions in unseen environments. Existing end-to-end learning-based VLN
methods struggle at this task as they focus mostly on utilizing raw visual observations and
lack the semantic spatio-temporal reasoning capabilities that are crucial for generalizing to
new environments. In this regard, we present a hybrid transformer-recurrence model which …

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments

M Zubair Irshad, N Chowdhury Mithun… - arXiv e …, 2021 - ui.adsabs.harvard.edu