Vote2cap-detr++: Decoupling localization and describing for end-to-end 3d dense captioning

3 results

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning

S Chen, X Chen, C Zhang, M Li, G Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent progress in Large Multimodal Models (LMM) has opened up great
possibilities for various applications in the field of human-machine interactions. However …

Cited by 13 · All 3 versions

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

B Jin, Y Zheng, P Li, W Li, Y Zheng, S Hu, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

3D dense captioning stands as a cornerstone in achieving a comprehensive understanding
of 3D scenes through natural language. It has recently witnessed remarkable achievements …

Cited by 2 · All 2 versions

Lightweight Model Pre-Training Via Language Guided Knowledge Distillation

M Li, L Zhang, M Zhu, Z Huang, G Yu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

This paper studies the problem of pre-training for small models, which is essential for many
mobile devices. Current state-of-the-art methods on this problem transfer the …
