CDUL: CLIP-driven unsupervised learning for multi-label image classification

R Abdelfattah, Q Guo, X Li, X Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
This paper presents a CLIP-based unsupervised learning method for annotation-free multi-label
image classification, including three stages: initialization, training, and inference. At the …

Texts as images in prompt tuning for multi-label image recognition

Z Guo, B Dong, Z Ji, J Bai, Y Guo… - Proceedings of the …, 2023 - openaccess.thecvf.com
Prompt tuning has been employed as an efficient way to adapt large vision-language pre-trained
models (e.g., CLIP) to various downstream tasks in data-limited or label-limited …

Exploring structured semantic prior for multi label recognition with incomplete labels

Z Ding, A Wang, H Chen, Q Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Multi-label recognition (MLR) with incomplete labels is very challenging. Recent works strive
to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to …

Bridging the gap between model explanations in partially annotated multi-label classification

Y Kim, JM Kim, J Jeong, C Schmid… - Proceedings of the …, 2023 - openaccess.thecvf.com
Due to the expensive costs of collecting labels in multi-label classification datasets, partially
annotated multi-label classification has become an emerging field in computer vision. One …

DualCoOp++: Fast and effective adaptation to multi-label recognition with limited annotations

P Hu, X Sun, S Sclaroff… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Multi-label image recognition in the low-label regime is a task of great challenge and
practical significance. Previous works have focused on learning the alignment between …

Spatial-temporal knowledge-embedded transformer for video scene graph generation

T Pu, T Chen, H Wu, Y Lu, L Lin - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer
their relationships for a given video. It requires not only a comprehensive understanding of …

Ingredient prediction via context learning network with class-adaptive asymmetric loss

M Luo, W Min, Z Wang, J Song… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Ingredient prediction has received more and more attention with the help of image
processing for its diverse real-world applications, such as nutrition intake management and …

Saliency Regularization for Self-Training with Partial Annotations

S Wang, Q Wan, X Xiang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Partially annotated images are easy to obtain in multi-label classification. However,
unknown labels in partially annotated images exacerbate the positive-negative imbalance …

Generating diverse augmented attributes for generalized zero shot learning

X Zhao, Y Shen, S Wang, H Zhang - Pattern Recognition Letters, 2023 - Elsevier
Generalized Zero-Shot Learning (GZSL) has become an important research due to
its powerful ability of recognizing unseen objects. Generative methods, converting …

Positive label is all you need for multi-label classification

Z Yuan, K Zhang, T Huang - arXiv preprint arXiv:2306.16016, 2023 - arxiv.org
Multi-label classification (MLC) suffers from the inevitable label noise in training data due to
the difficulty in annotating various semantic labels in each image. To mitigate the influence …