Related articles - Academic resource search

Toward explainable and fine-grained 3d grounding through referring textual phrases

Z Yuan, X Yan, Z Li, X Li, Y Guo, S Cui, Z Li - arXiv preprint arXiv …, 2022 - arxiv.org

Recent progress in 3D scene understanding has explored visual grounding (3DVG) to
localize a target object through a language description. However, existing methods only …

Cited by 13 - Related articles - All 2 versions

[PDF] thecvf.com

Multi3drefer: Grounding text description to multiple 3d objects

Y Zhang, ZM Gong, AX Chang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We introduce the task of localizing a flexible number of objects in real-world 3D scenes
using natural language descriptions. Existing 3D visual grounding tasks focus on localizing …

Cited by 21 - Related articles - All 7 versions

[PDF] arxiv.org

Viewrefer: Grasp the multi-view knowledge for 3d visual grounding with gpt and prototype guidance

Z Guo, Y Tang, R Zhang, D Wang, Z Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Understanding 3D scenes from multi-view inputs has been proven to alleviate the view
discrepancy issue in 3D visual grounding. However, existing methods normally neglect the …

Cited by 11 - Related articles - All 2 versions

[PDF] arxiv.org

Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment

X Xu, Y Yuan, Q Zhang, W Wu, Z Jie, L Ma… - arXiv preprint arXiv …, 2023 - arxiv.org

Learning to ground natural language queries to target objects or regions in 3D point clouds
is quite essential for 3D scene understanding. Nevertheless, existing 3D visual grounding …

Cited by 1 - Related articles - All 2 versions

[PDF] neurips.cc

Exploiting contextual objects and relations for 3d visual grounding

L Yang, Z Zhang, Z Qi, Y Xu, W Liu… - Advances in …, 2024 - proceedings.neurips.cc

3D visual grounding, the task of identifying visual objects in 3D scenes based on
natural language inputs, plays a critical role in enabling machines to understand and …

Cited by 2 - Related articles - All 3 versions

[PDF] arxiv.org

DOrA: 3D Visual Grounding with Order-Aware Referring

TY Wu, SY Huang, YCF Wang - arXiv preprint arXiv:2403.16539, 2024 - arxiv.org

3D visual grounding aims to identify the target object within a 3D point cloud scene referred
to by a natural language description. While previous works attempt to exploit the verbo …

Cited by 1 - Related articles - All 2 versions

[PDF] thecvf.com

Distilling coarse-to-fine semantic matching knowledge for weakly supervised 3d visual grounding

Z Wang, H Huang, Y Zhao, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

3D visual grounding involves finding a target object in a 3D scene that corresponds
to a given sentence query. Although many approaches have been proposed and achieved …

Cited by 7 - Related articles - All 5 versions

[PDF] arxiv.org

Dense object grounding in 3d scenes

W Huang, D Liu, W Hu - Proceedings of the 31st ACM International …, 2023 - dl.acm.org

Localizing objects in 3D scenes according to the semantics of a given natural language is a
fundamental yet important task in the field of multimedia understanding, which benefits …

Cited by 5 - Related articles - All 3 versions

[PDF] thecvf.com

Viewrefer: Grasp the multi-view knowledge for 3d visual grounding

Z Guo, Y Tang, R Zhang, D Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Understanding 3D scenes from multi-view inputs has been proven to alleviate the view
discrepancy issue in 3D visual grounding. However, existing methods normally neglect the …

Cited by 15 - Related articles - All 3 versions

[PDF] thecvf.com

Eda: Explicit text-decoupling and dense alignment for 3d visual grounding

Y Wu, X Cheng, R Zhang, Z Cheng… - Proceedings of the …, 2023 - openaccess.thecvf.com

3D visual grounding aims to find the object within point clouds mentioned by
free-form natural language descriptions with rich semantic cues. However, existing methods …

Cited by 44 - Related articles - All 5 versions
