Computer vision-enhanced selection of geo-tagged photos on social network sites for land cover classification

MM ElQadi, M Lesiv, AG Dyer, A Dorin - Environmental Modelling & …, 2020 - Elsevier
… Our framework uses computer vision to analyse the content of geo-tagged photos on social
network sites to generate descriptive tags. These are used to train artificial neural networks to …

Position-guided text prompt for vision-language pre-training

J Wang, P Zhou, MZ Shou… - … on Computer Vision and …, 2023 - openaccess.thecvf.com
Vision-Language Pre-Training (VLP) has shown promising capabilities to align image and
text pairs, facilitating a broad variety of cross-modal learning tasks. However, we observe that …

ViSpa (Vision Spaces): A computer-vision-based representation system for individual images and concept prototypes, with large-scale evaluation.

F Günther, M Marelli, S Tureski, MA Petilli - Psychological Review, 2023 - psycnet.apa.org
… In order to model vision-based mental … , vision-based representations obtained from
computer-vision DCNNs have received considerably less attention as compared to computer science…

Learning deep structure-preserving image-text embeddings

L Wang, Y Li, S Lazebnik - … conference on computer vision …, 2016 - openaccess.thecvf.com
This paper proposes a method for learning joint embeddings of images and text using a two-branch
neural network with multiple layers of linear projections followed by nonlinearities. …

Classroom slide narration system

KV Jobin, A Mondal, CV Jawahar - … Conference on Computer Vision and …, 2021 - Springer
… CSNS and the existing assistive systems—Automatic Alt-Text (AAT) [30] and Tesseract OCR [26].
The proposed system generates a markup text, where each logical content is tagged with its …

[BOOK][B] Digital image processing and analysis: human and computer vision applications with CVIPtools

SE Umbaugh - 2010 - books.google.com
… Divided into five major sections, this book provides the concepts and models required to
analyze digital images and develop computer vision and human consumption applications as …

Medical Prescription Label Reading Using Computer Vision and Deep Learning

A Henry, R Sujee - Soft Computing for Problem Solving: Proceedings of …, 2023 - Springer
… reduced dimensions and then fed into two alternative architectures, CRNN alone and EAST
… to conventional text using these models. After obtaining the texts, the text is calculated using …

Learning transferable visual models from natural language supervision

A Radford, JW Kim, C Hallacy… - International …, 2021 - proceedings.mlr.press
… SOTA computer vision systems are trained to predict a fixed … from raw text about images
is a promising alternative which … We study performance on over 30 different computer vision

Learning audio-video modalities from image captions

A Nagrani, PH Seo, B Seybold, A Hauth… - … on Computer Vision, 2022 - Springer
… -text datasets, as images with alt-text captions can be easily obtained online. Obtaining
large-scale, high quality data for video in the form of text-video and text-… are suitable for text-audio …

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

A Maharana, D Hannan, M Bansal - … Conference on Computer Vision, 2022 - Springer
Recent advances in text-to-image synthesis have led to large pretrained transformers with
excellent capabilities to generate visualizations from a given text. However, these models are …