CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung… - arXiv preprint arXiv …, 2022 - arxiv.org
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …

CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung, M Seyedhosseini, Y Wu - arXiv preprint arXiv:2205.01917, 2022 - r.jordan.im
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …

CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung… - 2022 - openreview.net
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …

CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung… - research.google
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …

CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung… - arXiv e …, 2022 - ui.adsabs.harvard.edu
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …

CoCa: Contrastive Captioners are Image-Text Foundation Models

J Yu, Z Wang, V Vasudevan, L Yeung… - … on Machine Learning … - openreview.net
Exploring large-scale pretrained foundation models is of significant interest in computer
vision because these models can be quickly transferred to many downstream tasks. This …