COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation

K Wen, J Xia, Y Huang, L Li, J Xu, J Shao - Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021 - openaccess.thecvf.com
There has been a recent surge of interest in cross-modal pre-training. However, existing
approaches pre-train a one-stream model to learn joint vision-language representation …
