… Yvan Leclerc and Pascal Fua, colleagues from my brief interlude at SRI International, gave me new perspectives on alternative approaches to computer vision. During my six years of …
C Jia, Y Yang, Y Xia, YT Chen… - International …, 2021 - proceedings.mlr.press
… In this work, we leverage a dataset of over one billion noisy image alt-text pairs to scale visual and vision-language representation learning. We follow the procedures described in the …
… We collect 4 billion image and alt-text pairs following the same process as ALIGN [30], with the same image-based filtering but simpler text-based filtering. Appendix L shows that …
X Hu, Z Gan, J Wang, Z Yang, Z Liu… - … on computer vision …, 2022 - openaccess.thecvf.com
… We remove the alt-text if any of its unigrams cannot be found in the vocabulary. Afterwards, … 200 million images, each corresponding to one alt-text. The word cloud of 200 most frequent …
S Long, X He, C Yao - International Journal of Computer Vision, 2021 - Springer
… With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text …
N Sarafianos, X Xu… - … on computer vision, 2019 - openaccess.thecvf.com
… For many computer vision applications such as image captioning, … and text level is an essential yet challenging problem. Its challenges originate from the large word variance in the text …
We present a method for zero-shot, text-driven editing of natural images and videos. Given an image or a video and a text prompt, our goal is to edit the appearance of existing objects (…
… As an alternative, we propose learning visual grounding from freely-available internet data, … alt-text captured in the Conceptual Captions dataset [24], containing around 3.3M image-text …
… shared representation, we introduce a new computer vision foundation model, Florence, to … image-text data, our Florence model can be easily adapted for various computer vision tasks…