Z Gan, L Li, C Li, L Wang, Z Liu, J Gao - arXiv preprint arXiv:2210.09263, 2022 - arxiv.org
This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches into three …
Z Gan, L Li, C Li, L Wang, Z Liu, J Gao - arXiv e-prints, 2022 - ui.adsabs.harvard.edu
This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches into three …
Z Gan, L Li, C Li, L Wang, Z Liu, J Gao - 2022 - ieeexplore.ieee.org
Humans perceive the world through many channels, such as images viewed by the eyes, or voices heard by the ears. Though any individual channel might be incomplete or noisy …
Z Gan, L Li, C Li, L Wang, Z Liu, J Gao - Foundations and Trends® in …, 2022 - dl.acm.org
This monograph surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches …