Coarse-to-fine video instance segmentation with factorized conditional appearance flows

Z Qin, C Han, Q Wang, X Nie, Y Yin… - Advances in Neural …, 2023 - proceedings.neurips.cc

The task of point cloud segmentation, comprising semantic, instance, and panoptic
segmentation, has been mainly tackled by designing task-specific network architectures …

Cited by 11 · Related articles · All 3 versions

[PDF] arxiv.org

Local-global context aware transformer for language-guided video segmentation

C Liang, W Wang, T Zhou, J Miao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

We explore the task of language-guided video segmentation (LVS). Previous algorithms
mostly adopt 3D CNNs to learn video representation, struggling to capture long-term context …

Cited by 63 · Related articles · All 9 versions

[PDF] ijcai.org

[PDF] Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration For Video Captioning.

L Yan, C Han, Z Xu, D Liu, Q Wang - IJCAI, 2023 - ijcai.org

Fine-tuning large vision-language models is a challenging task. Prompt tuning approaches
have been introduced to learn fixed textual or visual prompts while freezing the pre-trained …

Cited by 17 · Related articles · All 3 versions

[PDF] thecvf.com

Unified mask embedding and correspondence learning for self-supervised video segmentation

L Li, W Wang, T Zhou, J Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

The objective of this paper is self-supervised learning of video object segmentation. We
develop a unified framework which simultaneously models cross-frame dense …

Cited by 13 · Related articles · All 7 versions

[PDF] thecvf.com

Promotion: Prototypes as motion learners

Y Lu, D Liu, Q Wang, C Han, Y Cui… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this work, we introduce ProMotion, a unified prototypical transformer-based framework
engineered to model fundamental motion tasks. ProMotion offers a range of compelling …

Cited by 1 · Related articles · All 3 versions

[PDF] archive.org

Clip fusion with bi-level optimization for human mesh reconstruction from monocular videos

P Wu, X Lu, J Shen, Y Yin - Proceedings of the 31st ACM international …, 2023 - dl.acm.org

Human mesh reconstruction (HMR) from monocular video is the key step to many mixed
reality and robotic applications. Although existing methods show promising results by …

Cited by 6 · Related articles · All 2 versions

[PDF] cardiff.ac.uk

Dual-constraint coarse-to-fine network for camouflaged object detection

G Yue, H Xiao, H Xie, T Zhou, W Zhou… - … on Circuits and …, 2023 - ieeexplore.ieee.org

Camouflaged object detection (COD) is an important yet challenging task, with great
application values in industrial defect detection, medical care, etc. The challenges mainly …

Cited by 7 · Related articles · All 2 versions

Video corpus moment retrieval via deformable multigranularity feature fusion and adversarial training

X Zhang, P Zhao, J Ji, X Lu, Y Yin - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

As a new emerging task, video corpus moment retrieval (VCMR) aims to find the video
segments relevant to a given natural language query from a large number of untrimmed …

Cited by 7 · Related articles

[PDF] arxiv.org

Image translation as diffusion visual programmers

C Han, JC Liang, Q Wang, M Rabbani, S Dianat… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce the novel Diffusion Visual Programmer (DVP), a neuro-symbolic image
translation framework. Our proposed DVP seamlessly embeds a condition-flexible diffusion …

Cited by 5 · Related articles · All 3 versions

Hybridvps: Hybrid-supervised video polyp segmentation under low-cost labels

W Li, X Xiong, S Li, F Fan - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org

Deep polyp segmentation methods have shown remarkable potential in boosting diagnostic
efficiency. Nevertheless, these methods rely on sufficient pixel-wise annotated data, which is …

Cited by 5 · Related articles · All 2 versions
