I2T: Image parsing to text description

BZ Yao, X Yang, L Lin, MW Lee… - Proceedings of the …, 2010 - ieeexplore.ieee.org
In this paper, we present an image parsing to text description (I2T) framework that generates
text descriptions of image and video content based on image understanding. The proposed …

Image understanding using vision and reasoning through scene description graph

S Aditya, Y Yang, C Baral, Y Aloimonos… - Computer Vision and …, 2018 - Elsevier
Two of the fundamental tasks in image understanding using text are caption generation and
visual question answering (Antol et al., 2015; Xiong et al., 2016). This work presents an …

Composing simple image descriptions using web-scale n-grams

S Li, G Kulkarni, T Berg, A Berg… - Proceedings of the …, 2011 - aclanthology.org
Studying natural language, and especially how people describe the world around them can
help us better understand the visual world. In turn, it can also help us in the quest to …

VIsual TRAnslator: Linking perceptions and natural language descriptions

G Herzog, P Wazinski - Artificial Intelligence Review, 1994 - Springer
Despite the fact that image understanding and natural language processing constitute two
major areas of AI, there have only been a few attempts toward the integration of computer …

Image captioning and visual question answering based on attributes and external knowledge

Q Wu, C Shen, P Wang, A Dick… - IEEE transactions on …, 2017 - ieeexplore.ieee.org
Much of the recent progress in Vision-to-Language problems has been achieved through a
combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks …

Collective generation of natural image descriptions

P Kuznetsova, V Ordonez, A Berg… - Proceedings of the …, 2012 - aclanthology.org
We present a holistic data-driven approach to image description generation, exploiting the
vast amount of (noisy) parallel image data and associated natural language descriptions …

Learning Bayesian classifiers for scene classification with a visual grammar

S Aksoy, K Koperski, C Tusk… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
A challenging problem in image content extraction and classification is building a system
that automatically learns high-level semantic interpretations of images. We describe a …

Bringing semantics into focus using visual abstraction

CL Zitnick, D Parikh - … of the IEEE Conference on Computer …, 2013 - openaccess.thecvf.com
Relating visual information to its linguistic semantic meaning remains an open and
challenging area of research. The semantic meaning of images depends on the presence of …

A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching

P Das, C Xu, RF Doell, JJ Corso - Proceedings of the IEEE …, 2013 - openaccess.thecvf.com
The problem of describing images through natural language has gained importance in the
computer vision community. Solutions to image description have either focused on a top …

Babytalk: Understanding and generating simple image descriptions

G Kulkarni, V Premraj, V Ordonez… - IEEE transactions on …, 2013 - ieeexplore.ieee.org
We present a system to automatically generate natural language descriptions from images.
This system consists of two parts. The first part, content planning, smooths the output of …