Learning equivariant segmentation with instance-unique querying

[HTML] Coarse-to-fine video instance segmentation with factorized conditional appearance flows

Z Qin, X Lu, X Nie, D Liu, Y Yin, W Wang - IEEE/CAA Journal of …, 2023 - ieee-jas.net

We introduce a novel method using a new generative model that automatically learns
effective representations of the target and background appearance to detect, segment and …

Cited by 52 - Related articles - All 4 versions

[PDF] thecvf.com

Transflow: Transformer as flow learner

Y Lu, Q Wang, S Ma, T Geng… - Proceedings of the …, 2023 - openaccess.thecvf.com

Optical flow is an indispensable building block for various important computer vision tasks,
including motion estimation, object tracking, and disparity measurement. In this work, we …

Cited by 46 - Related articles - All 6 versions

[PDF] arxiv.org

Clustseg: Clustering for universal segmentation

J Liang, T Zhou, D Liu, W Wang - arXiv preprint arXiv:2305.02187, 2023 - arxiv.org

We present CLUSTSEG, a general, transformer-based framework that tackles different
image segmentation tasks (i.e., superpixel, semantic, instance, and panoptic) through a …

Cited by 63 - Related articles - All 5 versions

[PDF] neurips.cc

ClusterFormer: clustering as a universal visual learner

J Liang, Y Cui, Q Wang, T Geng… - Advances in neural …, 2024 - proceedings.neurips.cc

This paper presents ClusterFormer, a universal vision model that is based on the Clustering
paradigm with TransFormer. It comprises two novel designs: 1) recurrent cross-attention …

Cited by 19 - Related articles - All 5 versions

[PDF] arxiv.org

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - arXiv preprint arXiv …, 2023 - arxiv.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 54 - Related articles - All 3 versions

Differential feature awareness network within antagonistic learning for infrared-visible object detection

R Zhang, L Li, Q Zhang, J Zhang, L Xu… - … on Circuits and …, 2023 - ieeexplore.ieee.org

The combination of infrared and visible videos aims to gather more comprehensive feature
information from multiple sources and reach superior results on various practical tasks, such …

Cited by 41 - Related articles

[PDF] neurips.cc

Unified 3d segmenter as prototypical classifiers

Z Qin, C Han, Q Wang, X Nie, Y Yin… - Advances in Neural …, 2023 - proceedings.neurips.cc

The task of point cloud segmentation, comprising semantic, instance, and panoptic
segmentation, has been mainly tackled by designing task-specific network architectures …

Cited by 11 - Related articles - All 3 versions

[PDF] arxiv.org

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

C Han, Q Wang, Y Cui, Z Cao, W Wang, S Qi… - arXiv preprint arXiv …, 2023 - arxiv.org

As the size of transformer-based models continues to grow, fine-tuning these large-scale
pretrained vision models for new tasks has become increasingly parameter-intensive …

Cited by 23 - Related articles - All 5 versions

[PDF] thecvf.com

Large-scale person detection and localization using overhead fisheye cameras

L Yang, L Li, X Xin, Y Sun, Q Song… - Proceedings of the …, 2023 - openaccess.thecvf.com

Location determination finds wide applications in daily life. Instead of existing efforts devoted
to localizing tourist photos captured by perspective cameras, in this article, we focus on …

Cited by 8 - Related articles - All 6 versions

[PDF] ijcai.org

[PDF] Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration For Video Captioning.

L Yan, C Han, Z Xu, D Liu, Q Wang - IJCAI, 2023 - ijcai.org

Fine-tuning large vision-language models is a challenging task. Prompt tuning approaches
have been introduced to learn fixed textual or visual prompts while freezing the pre-trained …

Cited by 17 - Related articles - All 3 versions
