DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries

4 results (0.03 seconds)

Citing articles

[PDF] ieee.org

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 70 - Related articles - All 3 versions

[PDF] arxiv.org

OMG-LLaVA: Bridging image-level, object-level, pixel-level reasoning and understanding

T Zhang, X Li, H Fei, H Yuan, S Wu, S Ji… - arXiv preprint arXiv …, 2024 - arxiv.org

Current universal segmentation methods demonstrate strong capabilities in pixel-level
image and video understanding. However, they lack reasoning abilities and cannot be …

Cited by 7 - Related articles - All 3 versions

[PDF] arxiv.org

VISA: Reasoning video object segmentation via large language models

C Yan, H Wang, S Yan, X Jiang, Y Hu, G Kang… - arXiv preprint arXiv …, 2024 - arxiv.org

Existing Video Object Segmentation (VOS) relies on explicit user instructions, such as
categories, masks, or short phrases, restricting their ability to perform complex video …

Cited by 3 - Related articles - All 3 versions

[PDF] arxiv.org

ViLLa: Video Reasoning Segmentation with Large Language Model

R Zheng, L Qi, X Chen, Y Wang, K Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Although video perception models have made remarkable advancements in recent years,
they still heavily rely on explicit text descriptions or pre-defined categories to identify target …
