J Wang, H Wang, J Deng, W Wu, D Zhang - arXiv preprint arXiv …, 2021 - arxiv.org
While large-scale pre-training has made great progress in bridging the gap
between vision and language, it still faces several challenges. First, the cost of pre-training …