D Liu, Y Liu, W Huang, W Hu - arXiv preprint arXiv:2406.05785, 2024 - arxiv.org
Text-guided 3D visual grounding (T-3DVG), which aims to locate a specific object that
semantically corresponds to a language query from a complicated 3D scene, has drawn …