On Scaling Up a Multilingual Vision and Language Model

Citing articles

[PDF] thecvf.com

Distilling vision-language models on millions of videos

Y Zhao, L Zhao, X Zhou, J Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com

The recent advance in vision-language models is largely attributed to the abundance of
image-text data. We aim to replicate this success for video-language models but there …

Cited by 7 · All 4 versions
