Unified 3D segmenter as prototypical classifiers

Z Qin, C Han, Q Wang, X Nie, Y Yin… - Advances in Neural …, 2023 - proceedings.neurips.cc
The task of point cloud segmentation, comprising semantic, instance, and panoptic
segmentation, has been mainly tackled by designing task-specific network architectures …

Local-global context aware transformer for language-guided video segmentation

C Liang, W Wang, T Zhou, J Miao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
We explore the task of language-guided video segmentation (LVS). Previous algorithms
mostly adopt 3D CNNs to learn video representation, struggling to capture long-term context …

Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration For Video Captioning.

L Yan, C Han, Z Xu, D Liu, Q Wang - IJCAI, 2023 - ijcai.org
Fine-tuning large vision-language models is a challenging task. Prompt tuning approaches
have been introduced to learn fixed textual or visual prompts while freezing the pre-trained …

Unified mask embedding and correspondence learning for self-supervised video segmentation

L Li, W Wang, T Zhou, J Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The objective of this paper is self-supervised learning of video object segmentation. We
develop a unified framework which simultaneously models cross-frame dense …

ProMotion: Prototypes as motion learners

Y Lu, D Liu, Q Wang, C Han, Y Cui… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this work, we introduce ProMotion, a unified prototypical transformer-based framework
engineered to model fundamental motion tasks. ProMotion offers a range of compelling …

Clip fusion with bi-level optimization for human mesh reconstruction from monocular videos

P Wu, X Lu, J Shen, Y Yin - Proceedings of the 31st ACM international …, 2023 - dl.acm.org
Human mesh reconstruction (HMR) from monocular video is the key step to many mixed
reality and robotic applications. Although existing methods show promising results by …

Dual-constraint coarse-to-fine network for camouflaged object detection

G Yue, H Xiao, H Xie, T Zhou, W Zhou… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Camouflaged object detection (COD) is an important yet challenging task, with great
application values in industrial defect detection, medical care, etc. The challenges mainly …

Video corpus moment retrieval via deformable multigranularity feature fusion and adversarial training

X Zhang, P Zhao, J Ji, X Lu, Y Yin - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
As a new emerging task, video corpus moment retrieval (VCMR) aims to find the video
segments relevant to a given natural language query from a large number of untrimmed …

Image translation as diffusion visual programmers

C Han, JC Liang, Q Wang, M Rabbani, S Dianat… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce the novel Diffusion Visual Programmer (DVP), a neuro-symbolic image
translation framework. Our proposed DVP seamlessly embeds a condition-flexible diffusion …

HybridVPS: Hybrid-supervised video polyp segmentation under low-cost labels

W Li, X Xiong, S Li, F Fan - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org
Deep polyp segmentation methods have shown remarkable potential in boosting diagnostic
efficiency. Nevertheless, these methods rely on sufficient pixel-wise annotated data, which is …