- Academic resource search

Spectrum-guided multi-granularity referring video object segmentation

B Miao, M Bennamoun, Y Gao… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Current referring video object segmentation (R-VOS) techniques extract conditional kernels
from encoded (low-resolution) vision-language features to segment the decoded high …

Cited by 28 - Related articles - All 7 versions

[PDF] arxiv.org

Segment any anomaly without training via hybrid prompt regularization

Y Cao, X Xu, C Sun, Y Cheng, Z Du, L Gao… - arXiv preprint arXiv …, 2023 - arxiv.org

We present a novel framework, i.e., Segment Any Anomaly+ (SAA+), for zero-shot anomaly
segmentation with hybrid prompt regularization to improve the adaptability of modern …

Cited by 53 - Related articles - All 2 versions

[PDF] thecvf.com

UniVS: Unified and universal video segmentation with prompts as queries

M Li, S Li, X Zhang, L Zhang - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com

Despite the recent advances in unified image segmentation (IS), developing a unified video
segmentation (VS) model remains a challenge. This is mainly because generic category …

Cited by 9 - Related articles - All 4 versions

[PDF] neurips.cc

Described object detection: Liberating object detection with flexible expressions

C Xie, Z Zhang, Y Wu, F Zhu… - Advances in Neural …, 2024 - proceedings.neurips.cc

Detecting objects based on language information is a popular task that includes Open-
Vocabulary object Detection (OVD) and Referring Expression Comprehension (REC). In this …

Cited by 16 - Related articles - All 4 versions

[PDF] neurips.cc

PaintSeg: Painting pixels for training-free segmentation

X Li, CC Lin, Y Chen, Z Liu, J Wang… - Advances in Neural …, 2024 - proceedings.neurips.cc

The paper introduces PaintSeg, a new unsupervised method for segmenting objects without
any training. We propose an adversarial masked contrastive painting (AMCP) process …

Cited by 5 - Related articles - All 2 versions

[PDF] arxiv.org

Towards robust referring image segmentation

J Wu, X Li, X Li, H Ding, Y Tong… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Referring Image Segmentation (RIS) is a fundamental vision-language task that outputs
object masks based on text descriptions. Many works have achieved considerable progress …

Cited by 33 - Related articles - All 9 versions

[PDF] thecvf.com

Decoupling static and hierarchical motion perception for referring video segmentation

S He, H Ding - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Referring video segmentation relies on natural language expressions to identify and
segment objects, often emphasizing motion clues. Previous works treat a sentence as a …

Cited by 11 - Related articles - All 3 versions

[PDF] arxiv.org

OMG-LLaVA: Bridging image-level, object-level, pixel-level reasoning and understanding

T Zhang, X Li, H Fei, H Yuan, S Wu, S Ji… - arXiv preprint arXiv …, 2024 - arxiv.org

Current universal segmentation methods demonstrate strong capabilities in pixel-level
image and video understanding. However, they lack reasoning abilities and cannot be …

Cited by 7 - Related articles - All 3 versions

[PDF] aclanthology.org

Towards noise-tolerant speech-referring video object segmentation: Bridging speech and text

X Li, J Wang, X Xu, M Yang, F Yang… - Proceedings of the …, 2023 - aclanthology.org

Linguistic communication is prevalent in Human-Computer Interaction (HCI). Speech
(spoken language) serves as a convenient yet potentially ambiguous form due to noise and …

Cited by 11 - Related articles - All 3 versions

[PDF] thecvf.com

Learning cross-modal affinity for referring video object segmentation targeting limited samples

G Li, M Gao, H Liu, X Zhen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Referring video object segmentation (RVOS), as a supervised learning task, relies on
sufficient annotated data for a given scene. However, in more realistic scenarios, only …

Cited by 2 - Related articles - All 5 versions
