T Yu, Z Lu, X Jin, Z Chen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Large-scale vision-language models (VLMs) pre-trained on billion-level data have learned general visual representations and broad visual concepts. In principle, the well-learned …
T Chen, W Wang, T Pu, J Qin, Z Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Modern visual recognition models often display overconfidence due to their reliance on complex deep neural networks and one-hot target supervision, resulting in unreliable …
Semantic segmentation models comprise an encoder to extract features and a classifier for prediction. However, the learning of the classifier suffers from the ambiguity which is caused …
Few-shot object detection (FSOD) identifies objects from extremely few annotated samples. Recently, most existing FSOD methods apply the two-stage learning paradigm, which …
Z Lu, J Bai, X Li, Z Xiao, X Wang - arXiv preprint arXiv:2311.17091, 2023 - arxiv.org
Fine-tuning pre-trained vision-language models (VLMs), e.g., CLIP, for open-world generalization has gained increasing popularity due to its practical value. However …
Retrieving natural images with the query sketches under the zero-shot scenario is known as zero-shot sketch-based image retrieval (ZS-SBIR). Most of the best-performing methods …
We address weakly-supervised low-shot instance segmentation, an annotation-efficient training method to deal with novel classes effectively. Since it is an under-explored problem …
J Wang, B Zhang, J Pang, H Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Few-shot segmentation remains challenging due to the limitations of its labeling information for unseen classes. Most previous approaches rely on extracting high-level feature maps …
Land-cover mapping is one of the vital applications in Earth observation aiming at classifying each pixel's land-cover type of remote-sensing images. As natural and human …