Generalizing image captions for image-text parallel corpus

Y Li, W Ouyang, B Zhou, K Wang… - Proceedings of the …, 2017 - openaccess.thecvf.com

Object detection, scene graph generation and region captioning, which are three scene
understanding tasks at different semantic levels, are tied together: scene graphs are …

Cited by 581 · Related articles · All 10 versions

Densecap: Fully convolutional localization networks for dense captioning

J Johnson, A Karpathy… - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com

We introduce the dense captioning task, which requires a computer vision system to both
localize and describe salient regions in images in natural language. The dense captioning …

Cited by 1442 · Related articles · All 12 versions

Chinese image captioning via fuzzy attention-based DenseNet-BiLSTM

H Lu, R Yang, Z Deng, Y Zhang, G Gao… - ACM Transactions on …, 2021 - dl.acm.org

Chinese image description generation tasks usually have some challenges, such as single-
feature extraction, lack of global information, and lack of detailed description of the image …

Cited by 123 · Related articles

Generating sentences by editing prototypes

K Guu, TB Hashimoto, Y Oren, P Liang - Transactions of the …, 2018 - direct.mit.edu

We propose a new generative language model for sentences that first samples a prototype
sentence from the training corpus and then edits it into a new sentence. Compared to …

Cited by 362 · Related articles · All 13 versions

Guiding the long-short term memory model for image caption generation

X Jia, E Gavves, B Fernando… - Proceedings of the …, 2015 - openaccess.thecvf.com

In this work we focus on the problem of image caption generation. We propose an extension
of the long short term memory (LSTM) model, which we coin gLSTM for short. In particular …

Cited by 550 · Related articles · All 16 versions

Vip-cnn: Visual phrase guided convolutional neural network

Y Li, W Ouyang, X Wang… - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com

As the intermediate level task connecting image captioning and object detection, visual
relationship detection started to catch researchers' attention because of its descriptive power …

Cited by 296 · Related articles · All 7 versions

Know more say less: Image captioning based on scene graphs

X Li, S Jiang - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org

Automatically describing the content of an image has been attracting considerable research
attention in the multimedia field. To represent the content of an image, many approaches …

Cited by 182 · Related articles · All 5 versions

Recent advances in neural text generation: A task-agnostic survey

C Tang, F Guerin, C Lin - arXiv preprint arXiv:2203.03047, 2022 - arxiv.org

In recent years, considerable research has been dedicated to the application of neural
models in the field of natural language generation (NLG). The primary objective is to …

Cited by 17 · Related articles · All 3 versions

A retrieve-and-edit framework for predicting structured outputs

TB Hashimoto, K Guu, Y Oren… - Advances in Neural …, 2018 - proceedings.neurips.cc

For the task of generating complex outputs such as source code, editing existing outputs can
be easier than generating complex outputs from scratch. With this motivation, we propose an …

Cited by 172 · Related articles · All 7 versions

Diverse and accurate image description using a variational auto-encoder with an additive gaussian encoding space

L Wang, A Schwing, S Lazebnik - Advances in Neural …, 2017 - proceedings.neurips.cc

This paper explores image caption generation using conditional variational auto-encoders
(CVAEs). Standard CVAEs with a fixed Gaussian prior yield descriptions with too little …

Cited by 199 · Related articles · All 6 versions
