Vision-and-language navigation: A survey of tasks, methods, and future directions

J Gu, E Stefani, Q Wu, J Thomason… - arXiv preprint arXiv …, 2022 - arxiv.org
A long-term goal of AI research is to build intelligent agents that can communicate with
humans in natural language, perceive the environment, and perform real-world tasks. Vision …

VELMA: Verbalization embodiment of LLM agents for vision and language navigation in street view

R Schumann, W Zhu, W Feng, TJ Fu… - Proceedings of the …, 2024 - ojs.aaai.org
Incremental decision making in real-world environments is one of the most challenging tasks
in embodied artificial intelligence. One particularly demanding scenario is Vision and …

EnvEdit: Environment editing for vision-and-language navigation

J Li, H Tan, M Bansal - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
In Vision-and-Language Navigation (VLN), an agent needs to navigate through the
environment based on natural language instructions. Due to limited available data for agent …

ChatGPT vs human-authored text: Insights into controllable text summarization and sentence style transfer

D Pu, V Demberg - arXiv preprint arXiv:2306.07799, 2023 - arxiv.org
Large-scale language models, like ChatGPT, have garnered significant media attention and
stunned the public with their remarkable capacity for generating coherent text from short …

Pathdreamer: A world model for indoor navigation

JY Koh, H Lee, Y Yang, J Baldridge… - Proceedings of the …, 2021 - openaccess.thecvf.com
People navigating in unfamiliar buildings take advantage of myriad visual, spatial and
semantic cues to efficiently achieve their navigation goals. Towards equipping …

Vision-language navigation: a survey and taxonomy

W Wu, T Chang, X Li, Q Yin, Y Hu - Neural Computing and Applications, 2024 - Springer
Vision-language navigation (VLN) tasks require an agent to follow language instructions
from a human guide to navigate in previously unseen environments using visual …

Diagnosing vision-and-language navigation: What really matters

W Zhu, Y Qi, P Narayana, K Sone, S Basu… - arXiv preprint arXiv …, 2021 - arxiv.org
Vision-and-language navigation (VLN) is a multimodal task where an agent follows natural
language instructions and navigates in visual environments. Multiple setups have been …

Ground then navigate: Language-guided navigation in dynamic scenes

K Jain, V Chhangani, A Tiwari… - … on Robotics and …, 2023 - ieeexplore.ieee.org
We investigate the Vision-and-Language Navigation (VLN) problem in the context of
autonomous driving in outdoor settings. We solve the problem by explicitly grounding the …

Grounding and distinguishing conceptual vocabulary through similarity learning in embodied simulations

S Ghaffari, N Krishnaswamy - arXiv preprint arXiv:2305.13668, 2023 - arxiv.org
We present a novel method for using agent experiences gathered through an embodied
simulation to ground contextualized word vectors to object representations. We use similarity …

Loc4Plan: Locating before planning for outdoor vision and language navigation

H Tian, J Meng, WS Zheng, YM Li, J Yan… - Proceedings of the 32nd …, 2024 - dl.acm.org
Vision and Language Navigation (VLN) is a challenging task that requires agents to
understand instructions and navigate to the destination in a visual environment. One of the …