Related articles - Academic resource search

Spatiality-guided Transformer for 3D Dense Captioning on Point Clouds

H Wang, C Zhang, J Yu, W Cai - arXiv preprint arXiv:2204.10688, 2022 - arxiv.org

Dense captioning in 3D point clouds is an emerging vision-and-language task involving
object-level 3D scene understanding. Apart from coarse semantic class prediction and …

Cited by 23 - Related articles - All 5 versions


End-to-End 3D Dense Captioning with Vote2Cap-DETR

S Chen, H Zhu, X Chen, Y Lei… - Proceedings of the …, 2023 - openaccess.thecvf.com

3D dense captioning aims to generate multiple captions localized with their
associated object regions. Existing methods follow a sophisticated "detect-then-describe" …

Cited by 22 - Related articles - All 5 versions


Contextual Modeling for 3D Dense Captioning on Point Clouds

Y Zhong, L Xu, J Luo, L Ma - arXiv preprint arXiv:2210.03925, 2022 - arxiv.org

3D dense captioning, as an emerging vision-language task, aims to identify and locate each
object from a set of point clouds and generate a distinctive natural language sentence for …

Cited by 8 - Related articles - All 2 versions


X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

Z Yuan, X Yan, Y Liao, Y Guo, G Li… - Proceedings of the …, 2022 - openaccess.thecvf.com

3D dense captioning aims to describe individual objects by natural language in 3D
scenes, where 3D scenes are usually represented as RGB-D scans or point clouds …

Cited by 59 - Related articles - All 7 versions


TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

3D dense captioning stands as a cornerstone in achieving a comprehensive understanding
of 3D scenes through natural language. It has recently witnessed remarkable achievements …

Cited by 2 - Related articles - All 2 versions


Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning

S Chen, H Zhu, M Li, X Chen, P Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

3D dense captioning requires a model to translate its understanding of an input 3D scene
into several captions associated with different object regions. Existing methods adopt a …

Cited by 2 - Related articles - All 2 versions

Complete 3D Relationships Extraction Modality Alignment Network for 3D Dense Captioning

A Mao, Z Yang, W Chen, R Yi… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

3D dense captioning aims to semantically describe each object detected in a 3D scene,
which plays a significant role in 3D scene understanding. Previous works lack a complete …

Cited by 3 - Related articles - All 4 versions


MORE: Multi-Order Relation Mining for Dense Captioning in 3D Scenes

Y Jiao, S Chen, Z Jie, J Chen, L Ma… - European Conference on …, 2022 - Springer

3D dense captioning is a recently-proposed novel task, where point clouds contain
more geometric information than the 2D counterpart. However, it is also more challenging …

Cited by 26 - Related articles - All 5 versions


PointLLM: Empowering Large Language Models to Understand Point Clouds

R Xu, X Wang, T Wang, Y Chen, J Pang… - arXiv preprint arXiv …, 2023 - arxiv.org

The unprecedented advancements in Large Language Models (LLMs) have created a
profound impact on natural language processing but are yet to fully embrace the realm of 3D …

Cited by 47 - Related articles - All 3 versions


3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

D Cai, L Zhao, J Zhang, L Sheng… - Proceedings of the …, 2022 - openaccess.thecvf.com

Observing that the 3D captioning task and the 3D grounding task contain both shared and
complementary information in nature, in this work, we propose a unified framework to jointly …

Cited by 63 - Related articles - All 5 versions
