Bird's-Eye-View Scene Graph for Vision-Language Navigation

R Liu, X Wang, W Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision-language navigation (VLN), which requires an agent to navigate 3D
environments following human instructions, has shown great advances. However, current …

GridMM: Grid memory map for vision-and-language navigation

Z Wang, X Li, J Yang, Y Liu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision-and-language navigation (VLN) enables the agent to navigate to a remote location
following the natural language instruction in 3D environments. To represent the previously …

Weakly-supervised multi-granularity map learning for vision-and-language navigation

P Chen, D Ji, K Lin, R Zeng, T Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
We address a practical yet challenging problem of training robot agents to navigate in an
environment following a path described by some language instructions. The instructions …

ShAPO: Implicit representations for multi-object shape, appearance, and pose optimization

MZ Irshad, S Zakharov, R Ambrus, T Kollar… - … on Computer Vision, 2022 - Springer
Our method studies the complex task of object-centric 3D understanding from a single RGB-
D observation. As it is an ill-posed problem, existing methods suffer from low performance …

Bridging the gap between learning in discrete and continuous environments for vision-and-language navigation

Y Hong, Z Wang, Q Wu, S Gould - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Most existing works in vision-and-language navigation (VLN) focus on either discrete or
continuous environments, training agents that cannot generalize across the two. Although …

Learning navigational visual representations with semantic map supervision

Y Hong, Y Zhou, R Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Being able to perceive the semantics and the spatial structure of the environment is
essential for visual navigation of a household robot. However, most existing works only …

BEVBert: Multimodal map pre-training for language-guided navigation

D An, Y Qi, Y Li, Y Huang, L Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale pre-training has shown promising results on the vision-and-language
navigation (VLN) task. However, most existing pre-training methods employ discrete …

ETPNav: Evolving topological planning for vision-language navigation in continuous environments

D An, H Wang, W Wang, Z Wang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Vision-language navigation is a task that requires an agent to follow instructions to navigate
in environments. It becomes increasingly crucial in the field of embodied AI, with potential …

Vision-and-language navigation today and tomorrow: A survey in the era of foundation models

Y Zhang, Z Ma, J Li, Y Qiao, Z Wang, J Chai… - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-and-Language Navigation (VLN) has gained increasing attention over recent years,
and many approaches have emerged to advance its development. The remarkable …

Meta-explore: Exploratory hierarchical vision-and-language navigation using scene object spectrum grounding

M Hwang, J Jeong, M Kim, Y Oh… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The main challenge in vision-and-language navigation (VLN) is how to understand natural-
language instructions in an unseen environment. The main limitation of conventional VLN …