Related articles - Academic Resource Search

Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models

BA Plummer, L Wang, CM Cervantes… - Proceedings of the …, 2015 - openaccess.thecvf.com

The Flickr30k dataset has become a standard benchmark for sentence-based image
description. This paper presents Flickr30k Entities, which augments the 158k captions from …

Cited by 1933 · Related articles · All 29 versions

[PDF] uea.ac.uk

A hierarchical and regional deep learning architecture for image description generation

P Kinghorn, L Zhang, L Shao - Pattern Recognition Letters, 2019 - Elsevier

This research proposes a distinctive deep learning network architecture for image
captioning and description generation. Specifically, we propose a hierarchically trained …

Cited by 72 · Related articles · All 13 versions

[PDF] thecvf.com

Intention oriented image captions with guiding objects

Y Zheng, Y Li, S Wang - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com

Although existing image caption models can produce promising results using recurrent
neural networks (RNNs), it is difficult to guarantee that an object we care about is contained …

Cited by 57 · Related articles · All 5 versions

CaptionNet: A tailor-made recurrent neural network for generating image descriptions

L Yang, H Wang, P Tang, Q Li - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Image captioning is a challenging task of visual understanding and has drawn more
attention of researchers. In general, two inputs are required at each time step by the Long …

Cited by 43 · Related articles · All 2 versions

[PDF] thecvf.com

Describing like humans: on diversity in image captioning

Q Wang, AB Chan - … of the IEEE/CVF Conference on …, 2019 - openaccess.thecvf.com

Recently, the state-of-the-art models for image captioning have overtaken human
performance based on the most popular metrics, such as BLEU, METEOR, ROUGE and …

Cited by 111 · Related articles · All 7 versions

[PDF] thecvf.com

Image captioning with semantic attention

Q You, H Jin, Z Wang, C Fang… - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com

Automatically generating a natural language description of an image has attracted interests
recently both because of its importance in practical applications and because it connects two …

Cited by 2088 · Related articles · All 10 versions

[PDF] cv-foundation.org

Deep visual-semantic alignments for generating image descriptions

A Karpathy, L Fei-Fei - Proceedings of the IEEE conference on …, 2015 - cv-foundation.org

We present a model that generates natural language descriptions of images and their
regions. Our approach leverages datasets of images and their sentence descriptions to …

Cited by 6878 · Related articles · All 39 versions

[PDF] arxiv.org

Rethinking the reference-based distinctive image captioning

Y Mao, L Chen, Z Jiang, D Zhang, Z Zhang… - Proceedings of the 30th …, 2022 - dl.acm.org

Distinctive Image Captioning (DIC)---generating distinctive captions that describe the unique
details of a target image---has received considerable attention over the last few years. A …

Cited by 21 · Related articles · All 3 versions

[PDF] uea.ac.uk

A region-based image caption generator with refined descriptions

P Kinghorn, L Zhang, L Shao - Neurocomputing, 2018 - Elsevier

Describing the content of an image is a challenging task. To enable detailed description, it
requires the detection and recognition of objects, people, relationships and associated …

Cited by 109 · Related articles · All 12 versions

[PDF] thecvf.com

Discriminability objective for training descriptive captions

R Luo, B Price, S Cohen… - Proceedings of the …, 2018 - openaccess.thecvf.com

One property that remains lacking in image captions generated by contemporary methods is
discriminability: being able to tell two images apart given the caption for one of them. We …

Cited by 223 · Related articles · All 7 versions


