W Deng, R Ding, J Yang, J Liu, Y Li, X Qi… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Rapid advancements in 3D vision-language (3D-VL) tasks have opened up new avenues
for human interaction with embodied agents or robots using natural language. Despite this …