Transcribe3D: Grounding LLMs using transcribed information for 3D referential reasoning with self-corrected finetuning

J Fang, X Tan, S Lin, H Mei, M Walter - 2nd Workshop on Language …, 2023 - openreview.net
If robots are to work effectively alongside people, they must be able to interpret natural
language references to objects in their 3D environment. Understanding 3D referring …

Transcrib3D: 3D Referring Expression Resolution through Large Language Models

J Fang, X Tan, S Lin, I Vasiljevic, V Guizilini… - arXiv preprint arXiv …, 2024 - arxiv.org
If robots are to work effectively alongside people, they must be able to interpret natural
language references to objects in their 3D environment. Understanding 3D referring …

LERF: Language embedded radiance fields

J Kerr, CM Kim, K Goldberg… - Proceedings of the …, 2023 - openaccess.thecvf.com
Humans describe the physical world using natural language to refer to specific 3D locations
based on a vast range of properties: visual appearance, semantics, abstract associations, or …

ScanEnts3D: Exploiting phrase-to-3D-object correspondences for improved visio-linguistic models in 3D scenes

A Abdelreheem, K Olszewski, HY Lee… - Proceedings of the …, 2024 - openaccess.thecvf.com
The two popular datasets ScanRefer [20] and ReferIt3D [5] connect natural language to real-
world 3D scenes. In this paper, we curate a complementary dataset extending both the …

Chat-3D v2: Bridging 3D scene and large language models with object identifiers

H Huang, Z Wang, R Huang, L Liu, X Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent research has evidenced the significant potentials of Large Language Models (LLMs)
in handling challenging tasks within 3D scenes. However, current models are constrained to …

LanguageRefer: Spatial-language model for 3D visual grounding

J Roh, K Desingh, A Farhadi… - Conference on Robot …, 2022 - proceedings.mlr.press
For robots to understand human instructions and perform meaningful tasks in the near
future, it is important to develop learned models that comprehend referential language to …

RREx-BoT: Remote referring expressions with a bag of tricks

GA Sigurdsson, J Thomason… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
Household robots operate in the same space for years. Such robots incrementally build
dynamic maps that can be used for tasks requiring remote object localization. However …

Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension

R Guan, R Zhang, N Ouyang, J Liu, KL Man… - arXiv preprint arXiv …, 2024 - arxiv.org
Embodied perception is essential for intelligent vehicles and robots, enabling more natural
interaction and task execution. However, these advancements currently embrace vision …

ShapeLLM: Universal 3D object understanding for embodied interaction

Z Qi, R Dong, S Zhang, H Geng, C Han, Z Ge… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents ShapeLLM, the first 3D Multimodal Large Language Model (LLM)
designed for embodied interaction, exploring a universal 3D object understanding with 3D …

Question Generation for Uncertainty Elimination in Referring Expressions in 3D Environments

F Matsuzawa, Y Qiu, K Iwata… - … on Robotics and …, 2023 - ieeexplore.ieee.org
We introduce a new task of question generation to eliminate the uncertainty of referring
expressions in 3D indoor environments (3D-REQ). Referring to an object using natural …