Where is misty? interpreting spatial descriptors by modeling regions in space

H Chen, A Suhr, D Misra… - Proceedings of the …, 2019 - openaccess.thecvf.com

We study the problem of jointly reasoning about language and vision through a navigation
and spatial reasoning task. We introduce the Touchdown task and dataset, where an agent …

Cited by 408 Related articles All 11 versions

[PDF] arxiv.org

Mapping instructions to actions in 3d environments with visual goal prediction

D Misra, A Bennett, V Blukis, E Niklasson… - arXiv preprint arXiv …, 2018 - arxiv.org

We propose to decompose instruction execution to goal prediction and action generation.
We design a model that maps raw visual observations to goals using LINGUNET, a …

Cited by 201 Related articles All 8 versions

[PDF] arxiv.org

Learning to execute actions or ask clarification questions

Z Shi, Y Feng, A Lipani - arXiv preprint arXiv:2204.08373, 2022 - arxiv.org

Collaborative tasks are ubiquitous activities where a form of communication is required in
order to reach a joint goal. Collaborative building is one of such tasks. We wish to develop …

Cited by 44 Related articles All 7 versions

[PDF] arxiv.org

Interactive grounded language acquisition and generalization in a 2d world

H Yu, H Zhang, W Xu - arXiv preprint arXiv:1802.01433, 2018 - arxiv.org

We build a virtual agent for learning language in a 2D maze-like world. The agent sees
images of the surrounding environment, listens to a virtual teacher, and takes actions to …

Cited by 84 Related articles All 6 versions

[PDF] frontiersin.org

Crossmodal Language Comprehension—Psycholinguistic Insights and Computational Approaches

Ö Alaçam, X Li, W Menzel, T Staron - Frontiers in neurorobotics, 2020 - frontiersin.org

Crossmodal interaction in situated language comprehension is important for effective and
efficient communication. The relationship between linguistic and visual stimuli provides …

Cited by 7 Related articles All 9 versions

[PDF] arxiv.org

Why Build an Assistant in Minecraft?

A Szlam, J Gray, K Srinet, Y Jernite, A Joulin… - arXiv preprint arXiv …, 2019 - arxiv.org

arXiv:1907.09273v2 [cs.AI] 25 Jul 2019 Page 1 Why Build an Assistant in Minecraft? Arthur
Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe …

Cited by 25 Related articles All 3 versions

[PDF] aclanthology.org

CraftAssist instruction parsing: Semantic parsing for a voxel-world assistant

K Srinet, Y Jernite, J Gray, A Szlam - … of the 58th Annual Meeting of …, 2020 - aclanthology.org

We propose a semantic parsing dataset focused on instruction-driven communication with
an agent in the game Minecraft. The dataset consists of 7K human utterances and their …

Cited by 11 Related articles

[PDF] aclanthology.org

Points, paths, and playscapes: Large-scale spatial language understanding tasks set in the real world

J Baldridge, T Bedrax-Weiss, D Luong… - Proceedings of the …, 2018 - aclanthology.org

Spatial language understanding is important for practical applications and as a building
block for better abstract language understanding. Much progress has been made through …

Cited by 11 Related articles All 4 versions

[PDF] arxiv.org

Craftassist instruction parsing: Semantic parsing for a minecraft assistant

Y Jernite, K Srinet, J Gray, A Szlam - arXiv preprint arXiv:1905.01978, 2019 - arxiv.org

We propose a large scale semantic parsing dataset focused on instruction-driven
communication with an agent in Minecraft. We describe the data collection process which …

Cited by 7 Related articles All 3 versions

[PDF] cornell.edu

[BOOK][B] Scalable and Interpretable Approaches for Learning to Follow Natural Language Instructions

DK Misra - 2019 - search.proquest.com

Agents that can execute natural language instructions have many applications. For example,
an assistive house robot that can follow instructions will reduce the time spent on doing …
