If robots are to work effectively alongside people, they must be able to interpret natural language references to objects in their 3D environment. Understanding 3D referring …
Humans describe the physical world using natural language to refer to specific 3D locations based on a vast range of properties: visual appearance, semantics, abstract associations, or …
Two popular datasets, ScanRefer [20] and ReferIt3D [5], connect natural language to real-world 3D scenes. In this paper, we curate a complementary dataset extending both the …
Recent research has demonstrated the significant potential of Large Language Models (LLMs) in handling challenging tasks within 3D scenes. However, current models are constrained to …
For robots to understand human instructions and perform meaningful tasks in the near future, it is important to develop learned models that comprehend referential language to …
Household robots operate in the same space for years. Such robots incrementally build dynamic maps that can be used for tasks requiring remote object localization. However, …
Embodied perception is essential for intelligent vehicles and robots, enabling more natural interaction and task execution. However, these advancements currently embrace vision …
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (MLLM) designed for embodied interaction, exploring universal 3D object understanding with 3D …
F Matsuzawa, Y Qiu, K Iwata… - … on Robotics and …, 2023 - ieeexplore.ieee.org
We introduce a new task of question generation to eliminate the uncertainty of referring expressions in 3D indoor environments (3D-REQ). Referring to an object using natural …