Unified contrastive learning in image-text-label space

J Yang, C Li, P Zhang, B Xiao, C Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Visual recognition has recently been learned via either supervised learning on human-annotated
image-label data or language-image contrastive learning with webly-crawled image-text …

Data efficient language-supervised zero-shot recognition with optimal transport distillation

B Wu, R Cheng, P Zhang, T Gao, P Vajda… - arXiv preprint arXiv …, 2021 - arxiv.org
Traditional computer vision models are trained to predict a fixed set of predefined
categories. Recently, natural language has been shown to be a broader and richer source of …

RA-CLIP: Retrieval augmented contrastive language-image pre-training

CW Xie, S Sun, X Xiong, Y Zheng… - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive Language-Image Pre-training (CLIP) is attracting increasing attention
for its impressive zero-shot recognition performance on different downstream tasks …

I2MVFormer: Large language model generated multi-view document supervision for zero-shot image classification

MF Naeem, MGZA Khan, Y Xian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent works have shown that unstructured text (documents) from online sources can serve
as useful auxiliary information for zero-shot image classification. However, these methods …

I2DFormer: Learning image to document attention for zero-shot image classification

MF Naeem, Y Xian, L Van Gool… - Advances in Neural …, 2022 - proceedings.neurips.cc
Despite the tremendous progress in zero-shot learning (ZSL), the majority of existing
methods still rely on human-annotated attributes, which are difficult to annotate and scale …

SuS-X: Training-free name-only transfer of vision-language models

V Udandarao, A Gupta… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Contrastive Language-Image Pre-training (CLIP) has emerged as a simple yet
effective way to train large-scale vision-language models. CLIP demonstrates impressive …

Domain-aware visual bias eliminating for generalized zero-shot learning

S Min, H Yao, H Xie, C Wang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Generalized zero-shot learning aims to recognize images from seen and unseen domains.
Recent methods focus on learning a unified semantic-aligned visual representation to …

Non-contrastive learning meets language-image pre-training

J Zhou, L Dong, Z Gan, L Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Contrastive language-image pre-training (CLIP) serves as a de facto standard to align
images and texts. Nonetheless, the loose correlation between images and texts of web …

Data-efficient language-supervised zero-shot learning with self-distillation

R Cheng, B Wu, P Zhang, P Vajda… - Proceedings of the …, 2021 - openaccess.thecvf.com
Traditional computer vision models are trained to predict a fixed set of predefined
categories. Recently, natural language has been shown to be a broader and richer source of …

Progressive ensemble networks for zero-shot recognition

M Ye, Y Guo - Proceedings of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
Despite the advancement of supervised image recognition algorithms, their dependence on
the availability of labeled data and the rapid expansion of image categories raise the …