Related articles - Academic Resource Search

I2t: Image parsing to text description

BZ Yao, X Yang, L Lin, MW Lee… - Proceedings of the …, 2010 - ieeexplore.ieee.org

In this paper, we present an image parsing to text description (I2T) framework that generates
text descriptions of image and video content based on image understanding. The proposed …

Cited by 399 - Related articles - All 9 versions

[PDF] asu.edu

Image understanding using vision and reasoning through scene description graph

S Aditya, Y Yang, C Baral, Y Aloimonos… - Computer Vision and …, 2018 - Elsevier

Two of the fundamental tasks in image understanding using text are caption generation and
visual question answering (Antol et al., 2015; Xiong et al., 2016). This work presents an …

Cited by 92 - Related articles - All 3 versions

[PDF] aclanthology.org

[PDF] Composing simple image descriptions using web-scale n-grams

S Li, G Kulkarni, T Berg, A Berg… - Proceedings of the …, 2011 - aclanthology.org

Studying natural language, and especially how people describe the world around them can
help us better understand the visual world. In turn, it can also help us in the quest to …

Cited by 513 - Related articles - All 21 versions

[PDF] psu.edu

VIsual TRAnslator: Linking perceptions and natural language descriptions

G Herzog, P Wazinski - Artificial Intelligence Review, 1994 - Springer

Despite the fact that image understanding and natural language processing constitute two
major areas of AI, there have only been a few attempts toward the integration of computer …

Cited by 138 - Related articles - All 16 versions

[PDF] arxiv.org

Image captioning and visual question answering based on attributes and external knowledge

Q Wu, C Shen, P Wang, A Dick… - IEEE transactions on …, 2017 - ieeexplore.ieee.org

Much of the recent progress in Vision-to-Language problems has been achieved through a
combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks …

Cited by 459 - Related articles - All 8 versions

[PDF] aclanthology.org

[PDF] Collective generation of natural image descriptions

P Kuznetsova, V Ordonez, A Berg… - Proceedings of the …, 2012 - aclanthology.org

We present a holistic data-driven approach to image description generation, exploiting the
vast amount of (noisy) parallel image data and associated natural language descriptions …

Cited by 448 - Related articles - All 21 versions

[PDF] bilkent.edu.tr

Learning Bayesian classifiers for scene classification with a visual grammar

S Aksoy, K Koperski, C Tusk… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org

A challenging problem in image content extraction and classification is building a system
that automatically learns high-level semantic interpretations of images. We describe a …

Cited by 219 - Related articles - All 15 versions

[PDF] thecvf.com

Bringing semantics into focus using visual abstraction

CL Zitnick, D Parikh - … of the IEEE Conference on Computer …, 2013 - openaccess.thecvf.com

Relating visual information to its linguistic semantic meaning remains an open and
challenging area of research. The semantic meaning of images depends on the presence of …

Cited by 240 - Related articles - All 23 versions

[PDF] thecvf.com

A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching

P Das, C Xu, RF Doell, JJ Corso - Proceedings of the IEEE …, 2013 - openaccess.thecvf.com

The problem of describing images through natural language has gained importance in the
computer vision community. Solutions to image description have either focused on a top …

Cited by 395 - Related articles - All 15 versions

[PDF] iitk.ac.in

Babytalk: Understanding and generating simple image descriptions

G Kulkarni, V Premraj, V Ordonez… - IEEE transactions on …, 2013 - ieeexplore.ieee.org

We present a system to automatically generate natural language descriptions from images.
This system consists of two parts. The first part, content planning, smooths the output of …

Cited by 1626 - Related articles - All 22 versions
