Neural networks are prone to overfitting and memorizing data patterns. To avoid over-fitting and enhance their generalization and performance, various methods have been suggested …
In this work, we present SEEM, a promotable and interactive model for segmenting everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …
Although perception systems have made remarkable advancements in recent years they still rely on explicit human instruction or pre-defined categories to identify the target objects …
Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing objects from an open set of categories in diverse environments. One way to address this …
Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks (DNNs) training, and they usually train a DNN for each single visual recognition task …
Understanding and reasoning about spatial relationships is crucial for Visual Question Answering (VQA) and robotics. Vision Language Models (VLMs) have shown impressive …
W Yu, P Zhou, S Yan, X Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Inspired by the long-range modeling ability of ViTs large-kernel convolutions are widely studied and adopted recently to enlarge the receptive field and improve model performance …
Z Li, X Wang, X Liu, J Jiang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Monocular depth estimation (MDE) is a fundamental task in computer vision and has drawn increasing attention. Recently, some methods reformulate it as a classification-regression …
Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require …