Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in deep learning era. Due to the expensive manual …

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

C Fan, M Zhu, H Chen, Y Liu, W Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Instance segmentation is data-hungry and as model capacity increases data scale becomes
crucial for improving the accuracy. Most instance segmentation datasets today require costly …

AI-generated images as data source: The dawn of synthetic era

Z Yang, F Zhan, K Liu, M Xu, S Lu - arXiv preprint arXiv:2310.01830, 2023 - arxiv.org
The advancement of visual intelligence is intrinsically tethered to the availability of data. In
parallel, generative Artificial Intelligence (AI) has unlocked the potential to create synthetic …

Open-vocabulary SAM: Segment and recognize twenty-thousand classes interactively

H Yuan, X Li, C Zhou, Y Li, K Chen, CC Loy - arXiv preprint arXiv …, 2024 - arxiv.org
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation models
(VFMs). SAM excels in segmentation tasks across diverse domains, while CLIP is renowned …

OV-VG: A benchmark for open-vocabulary visual grounding

C Wang, W Feng, X Li, G Cheng, S Lyu, B Liu, L Chen… - Neurocomputing, 2024 - Elsevier
Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light
of the widespread adoption of vision-based foundational models. Its primary objective is to …

Explore in-context segmentation via latent diffusion models

C Wang, X Li, H Ding, L Qi, J Zhang, Y Tong… - arXiv preprint arXiv …, 2024 - arxiv.org
In-context segmentation has drawn more attention with the introduction of vision foundation
models. Most existing approaches adopt metric learning or masked image modeling to build …

Automated region of interest-based data augmentation for fallen person detection in off-road autonomous agricultural vehicles

H Baek, S Yu, S Son, J Seo, Y Chung - Sensors, 2024 - mdpi.com
Due to the global population increase and the recovery of agricultural demand after the
COVID-19 pandemic, the importance of agricultural automation and autonomous agricultural …

Generative Data Augmentation Improves Scribble-supervised Semantic Segmentation

J Schnell, J Wang, L Qi, VT Hu, M Tang - arXiv preprint arXiv:2311.17121, 2023 - arxiv.org
Recent advances in generative models, such as diffusion models, have made generating
high-quality synthetic images widely accessible. Prior works have shown that training on …