Related articles - Academic resource search

CIDEr: Consensus-based image description evaluation

R Vedantam, C Lawrence Zitnick… - Proceedings of the …, 2015 - openaccess.thecvf.com

Automatically describing an image with a sentence is a long-standing challenge in computer
vision and natural language processing. Due to recent progress in object detection, attribute …

Cited by 4663 - Related articles - All 20 versions

[PDF] thecvf.com

Skeleton key: Image captioning by skeleton-attribute decomposition

Y Wang, Z Lin, X Shen, S Cohen… - Proceedings of the …, 2017 - openaccess.thecvf.com

Recently, there has been a lot of interest in automatically generating descriptions for an
image. Most existing language-model based approaches for this task learn to generate an …

Cited by 137 - Related articles - All 14 versions

[PDF] thecvf.com

Paying attention to descriptions generated by image captioning models

HR Tavakoli, R Shetty, A Borji… - Proceedings of the …, 2017 - openaccess.thecvf.com

To bridge the gap between humans and machines in image understanding and describing,
we need further insight into how people describe a perceived scene. In this paper, we study …

Cited by 92 - Related articles - All 16 versions

[PDF] thecvf.com

From captions to visual concepts and back

H Fang, S Gupta, F Iandola… - Proceedings of the …, 2015 - openaccess.thecvf.com

This paper presents a novel approach for automatically generating image descriptions:
visual detectors, language models, and multimodal similarity models learnt directly from a …

Cited by 1625 - Related articles - All 25 versions

[PDF] cv-foundation.org

Rich image captioning in the wild

K Tran, X He, L Zhang, J Sun, C Carapcea… - Proceedings of the …, 2016 - cv-foundation.org

We present an image caption system that addresses new challenges of automatically
describing images in the wild. The challenges include generating high quality caption with …

Cited by 164 - Related articles - All 11 versions

[PDF] thecvf.com

A thousand frames in just a few words: Lingual description of videos through latent topics and sparse object stitching

P Das, C Xu, RF Doell, JJ Corso - Proceedings of the IEEE …, 2013 - openaccess.thecvf.com

The problem of describing images through natural language has gained importance in the
computer vision community. Solutions to image description have either focused on a top …

Cited by 394 - Related articles - All 18 versions

[PDF] thecvf.com

Generation and comprehension of unambiguous object descriptions

J Mao, J Huang, A Toshev… - Proceedings of the …, 2016 - openaccess.thecvf.com

We propose a method that can generate an unambiguous description (known as a referring
expression) of a specific object or region in an image, and which can also comprehend or …

Cited by 1214 - Related articles - All 17 versions

GLA: Global–local attention for image description

L Li, S Tang, Y Zhang, L Deng… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org

In recent years, the task of automatically generating image description has attracted a lot of
attention in the field of artificial intelligence. Benefitting from the development of …

Cited by 119 - Related articles - All 3 versions

[PDF] arxiv.org

The long-short story of movie description

A Rohrbach, M Rohrbach, B Schiele - … October 7-10, 2015, Proceedings 37, 2015 - Springer

Generating descriptions for videos has many applications including assisting blind people
and human-robot interaction. The recent advances in image captioning as well as the …

Cited by 157 - Related articles - All 6 versions

[PDF] jair.org

Framing image description as a ranking task: Data, models and evaluation metrics

M Hodosh, P Young, J Hockenmaier - Journal of Artificial Intelligence …, 2013 - jair.org

The ability to associate images with natural language sentences that describe what is
depicted in them is a hallmark of image understanding, and a prerequisite for applications …

Cited by 1514 - Related articles - All 17 versions
